Zachary Luckabaugh

2/7-2/13 (Second Semester logbook)

Our project is to work on the future circular electron positron calorimeter. The purpose of the project is to use simulations to develop a new calorimeter specialized for measuring the energy of hadrons. The fundamental goal is to decrease the resolution (sigma) as small as possible. The resolution is determined by the formula

σ/E=s/√E⊕n/E⊕c

Luckily, the largest term, c is controllable by designing a better calorimeter.

Detection has an "active" scintillator part, and a "passive" material part. The passive part creates the particle shower and the active part measures the number of particles created.

Putting more scintillators makes the measurement more accurate, but introduces the possibility of measuring the same particle twice. For the passive material, atoms with high atomic masses create more interactions, and increase the reliability of showers being produced in an area it can easily be detected.

Ideally, a calorimeter also has a low temporal resolution, meaning it can distinguish between showers created by one particle or two.

Simulation is done on the Tier 3 computing cluster with the GEANT library (Though it seems to work on CMS VM too?).

2/14-2/20

The next step is to run a sample simulation to get used to using GEANT. The procedure to run the test is here:

mkdir junk

cd junk

cp /home/eno/dualReadout/cepc_calotiming/g4env.sh .

source g4env.sh

cp -r /cvmfs/geant4.cern.ch/geant4/10.5/share/examples/basic/B4/B4a .

mkdir build

cd build

cmake -DGeant4_DIR=/cvmfs/geant4.cern.ch/geant4/10.5/x86_64-slc6-gcc63-opt/lib64/GEANT4-10.5.0 ../B4a

make

./exampleB4a

(be a little patient… the x11 connection is slow)

Next to the “session” prompt in the window it pops up

/run/beamOn 1

It generates a (mostly) useless root file, and produces the following picture, representing a particle shower being made and detected:

When multiplying the initial energy by 10 from 50MeV to 500MeV, the shower gets more intense. The rays exiting the calorimeter demonstrate the constant error term C, which grows at these higher energies. Out of the total 500MeV, 461MeV was detected (92%)

The general flow of using GEANT 4:

2/21-2/27

Continuing with the tutorial, we are given a sample calorimeter to get an idea of what kind of output to expect. We perform a run with 1000 1GeV electrons to see the average deposited energy and resolution. The code by default records energy deposited and free path (Mean free path) in Eabs and Labs histograms.

When running at 1Gev, the mean energy deposited is 905MeV with a resolution of 19.69MeV. This means it should use an ajustment factor of 1.104 (for electrons) and the resolution is boosted to 21.74MeV. This factor means not only should detection be linear, but as much energy as possible should be absorbed.

2/28-3/6

This week, we moved to trying to combine the simulations together, and make a resolution/energy by energy plot. An expected plot would start high at low energies, and drop to the constant term for higher energies. The program used to generate the plot was used previously to generate this graph:

A similar file should be generated as the output of running two new programs: CC_Analyzer, and res. However, the Analyzer is not specified to work with the output of the root files generated in the tutorial, so it has to be customized. So far, The main tree has been changed so it is reading the histograms correctly, but it has an incorrect list of histograms to read as input.

The correct path is B4>row_wise_branch>{Eabs, Egap, Labs, Lgap}

Also, the Analyzer file uses a strange system where local variables are loaded by the callee through pointers. There is never a reason to do this. It almost looks as bad as the code I write.

In the following picture, the file component is inaccurately labeled as "Tree"

Ignoring this analysis file for the time being, I altered res.C to accept input directly from the B4 histograms, but the output doesn't quite match. The problem is not in the simulation because opening the histograms in root shows a standard deviation value in line with what is expected from the prediction curve, but the line generated graphically does not match.

3/7-3/27

Not much was accomplished over this period due to getting prepared for and used to https://en.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_pandemic

Other than that, I found that the reason for the wacky data was the fact that the analysis phase was incomplete. The most important information is held in the Egap histogram, which measures the energy deposited in the scintillator, whereas Eabs is deposited in the passive material. The RMS of this detector is much higher than previously stated because I accidentally switched the two histograms. The real way to calculate RMS is first taking the average value of Etrue / Egap and multiplying each Egap entry by that number. The standard deviation of the resultant distribution is the resolution.

After realizing this, I went back to work on the analyzer code, commented out most of the references to unused histograms, and focused on Egap and Eabs, as those are the only available datapoints. Egap approximates Etrue well enough for large detectors, but it's not something you can measure in a real experiment. Unfortunately, the root file I was working with had a weird structure which made it hard to parse the original event tuples without getting either no data or a null pointer error. I am still working on this problem.

3/28-4/3

The bug causing loss of access to the histograms was fixed. Apparently accessing subfolders required use of a method I did not know about. The same simulations as before were run at 20~80MEV and were successfully analyzed. Unfortunately, when it came to fitting a gaussian distribution, the mean was calculated to be negative several times. This is not because the simulation generated negative energy, but rather because the graph followed more of a power law with several events clustering at 0 energy. This made it look like the tail end of a distribution that continued into the negatives. Simply increasing the energy to 1~4GeV was enough to alleviate this problem. The following is a sample histogram from 3GeV. "ajusted" means it represents the real energy detected multiplied by a correction factor.

A gaussian curve can fit this with a reasonable mean, so it can be plotted into a RMS / Energy plot like the one before. The axis has been changed a bit to represent higher energies.

so at 3GeV, the resolution is 25% or about 750MeV

4/3 - present

There is another plot that can be used to analyze the effectiveness of estimating energy using a calorimeter. That is the estimate / real energy plot. Up until now, each histogram had its entries corrected by a factor meant to center it around the true energy. However, practically speaking, this factor is only calculated once for a single, usually high energy, then applied to all energy levels. This means that the calorimeter could be inherently better at detecting high energy particles (which they usually are), and those factors do not apply well to lower energies. I ran a few simulations and the results matched expectations: Electrons, being easy to detect and measure, did not have much variation, even when the correction factor is limited. However, pions did have some variation, even when the calorimeter was made 10 times the original thickness.

(please ignore the trend lines. They are completely useless and I don't know how to get rid of them)

It seemed that most of the variation came below 40GeV, so I decided to focus new simulations specifically on that range. Unfortunately, I once again run into the problem that low energies cause gaussian curves to fit poorly, resulting in a spike on the left. Other than that, the pions specifically show more variation than the electrons. I am confused about how the statistically anomaly seemed to apply more to the electrons than the pions. There also seemed to be a degree of overfitting as there should be no bump between 20 and 50Gev. This was simply the range that was not sampled.

It turned out that the bug in the previous code was caused by the code I used to assign means. At 2Gev, 3Gev was assigned to the mean, etc. This was fixed, but a new problem cropped up with electrons from 30-49GeV (once again, a programming error) however, that part of the graph is not extremely important as the electron energy is very predictable, and the pions are accurate.

Pions do seem to have an issue when below 4Gev, but they do mostly satisfy the predicted upward trend.

[I put a huge thing about how all these experiments were redone with dualtoy, but google decided to delete that for me and I'm not retyping it all so sorry if this is tiny with only 2 graphs.]

we redid all the resolution things for dualtoy (can be found on Dr. Eno's github) and found that it had a resolution worse than that of B4a because its interaction length ratio was like 1:200 ( bad) instead of 1:5 (better). The results from cherenkov radiation in the quartz did not have any better resolution. We also graphed a plot of scintillator energy vs cherenkov photons and found they were both correlated, likely because of initial shower size.

___________________________________________________________________________________________________________________________

I am used to using linux, so I know most of these commands (but ls -l was a bit new to me).

++ adds 1 to the variable and -- subtracts 1. ++n adds 1 to n then uses its value while n++ uses the value of n first. C++ has its name because it is more than just C (C+1).

Code running from HW4:

#include <iostream>

using namespace std;

int main() {

cout <<"Hello World!" << endl; //Print hello world to screen followed by end line (endl)

//variables

int i=2;//put the number 2 on the stack for later use

cout << "i = "<<i<<endl; //print i = 2 with a newline

double a=3.3;//put the double precision value 3.3 onto the stack

cout << "a = " <<a<<endl; //print a = 3.3

int j = a*i; //makes 6.6 then cast to the int 6

cout << "a*i = "<<j<<endl;// print a*i = 6

int n=10;//set n to 10

cout << "n is "<<n<<endl;//prints n is 1

n--;//decrease n to 9

cout<<"n is now "<<n<<endl;//prints n is now 9

n++;//increase n to 10

cout<<"n is now "<<n<<endl;//print n is now 10

//non-numeric variables

bool prop;//new boolean value

prop = (5>1);//set prop to true (because 5 is more than 1)

cout<<"prop is "<<prop<<endl;//print prop is true

prop = (1>5);//set prop to false because 5 is not less than 1

cout<<"prop is "<<prop<<endl;//print prop is false

prop = (1 != 5);//prop is true because 1 is not 5

cout << "prop is " <<prop<<endl;//print prop is true

n=10;//set n (loop counter) to 10

while(n>0) {//this loop prints "n is 10" "n is 9" "n is 8" ... until "n is 1" then stops.

cout<<"n is "<<n<<endl;

n--;

}

{

//rewriting of the while loop code

for(int n=0;n<10;++n){

//slow loop

cout<< "n is "<<n<<": ";

for(int m=0;m<=n;m++){

//inner loop

cout<<m;

//this bit will write n is 0: 0 then n is 1: 01 then n is 2: 012, counting up to 9: 0123456789

}

cout<<endl;

}

//debugging

int idebug=1;

int super_secret_value = 0x4f;//who knows what this is?

if(idebug)

{

cout<<super_secret_value<<endl;//oh, it's 79

}

return 0;//end program

}

Result when running:

Hello World!

i = 2

a = 3.3

a*i = 6

n is 10

n is now 9

n is now 10

prop is 1

prop is 0

prop is 1

n is 10

n is 9

n is 8

n is 7

n is 6

n is 5

n is 4

n is 3

n is 2

n is 1

n is 0: 0

n is 1: 01

n is 2: 012

n is 3: 0123

n is 4: 01234

n is 5: 012345

n is 6: 0123456

n is 7: 01234567

n is 8: 012345678

n is 9: 0123456789

HW5:

#include <iostream>

using namespace std;

int main() {

int n = 10;

while (n>=10) { //only runs while n is 10 or above

if(n>5) {

cout<<"n is "<<n<<endl;//prints "n is 10" once

}

else {

cout<<"n = "<<n<<endl;//never runs

n--;

}

return 0;//immediately quit after first loop

}

The first bit of code simply prints "n is 10" because the decrement is inside the else statement, the check is >=, and the return is inside the while loop. If these statements are moved out, and the check is changed to >=0 it prints:

n is 10

n is 9

n is 8

n is 7

n is 6

n = 5

n = 4

n = 3

n = 2

n = 1

n = 0

Pointers:

#include <iostream>

using namespace std;

int main() {

int i = 10;//add variable i onto the stack

cout << "The memory address of i is " << &i << "\n";// &i gets the address of i (near the top of the stack)

cout << "The data stored at memory address " << &i << " is " << i << "\n"; //state the data stored at (some hexcode) is 10

int* p = &i;// declare pointer p which points at i, and puts p on the stack

cout << "The value of p is " << p << "\n";// prints p is (some hexcode)

cout << "We say that p 'points at' the memory location referenced by address " << p << "\n";

cout << "The data stored at memory address " << p << " is " << *p << "\n"; //*p evaluates to 10

return 0;

}

/*PROGRAM 1*/

#include <iostream>

using namespace std;

int main(){

int i = 10;

int j = i;//copy the value of i (10) into j

cout << "i= " << i << " and j= " << j << "\n";// i= 10 and j= 10. two different variables with the same value

i=5;

cout << "i= " << i << " and j= " << j << "\n";//i= 5 and j= 10. the value of i changing does not affect j.

j=1;

cout << "i= " << i << " and j= " << j << "\n";//i= 5 and j= 1. the value of j changing does not affect i.

return 0;

}

/*PROGRAM 2*/

#include <iostream>

using namespace std;

int main(){

int i = 10;

int* p = &i;

cout << "i= " << i << " and *p= " << *p << "\n";// i= 10 and *p= 10. they seem to act the same

i=5;

cout << "i= " << i << " and *p= " << *p << "\n";// i= 5 and *p= 5. they still act the same

*p=1;

cout << "i= " << i << " and *p= " << *p << "\n";// i= 1 and *p= 1 they still act the same

return 0;

}

Why is the behaviour of Program1 different than that of Program2?

Program 1 creates two independent variables while Program 2 creates 2 aliases for the same variable.

New:

New puts the variable into the heap instead of on the stack.

#include <iostream>

using namespace std;

int main(){

int* p = new int(5);// put int 5 into the heap, assign its pointer to p.

cout << "p points at address " << p << "\n";//p points at (some hexcode)

cout << "The data stored in address " << p << " is " << *p << "\n";// the data stored in p is 5

*p = 10;// reassign the heap entry to 10

cout << "Now the data stored in address " << p << " is " << *p << "\n";// now the data stored is 10

return 0;

}

Understanding test:

#include <iostream>

using namespace std;

int main(){

//To test your understanding

//try writing a similar code.

//In your code, use the 'new'

//construct to create a new pointer,

//'p1', and point it at a new

//double with value '3.14'. Then,

double data = 3.14;

double* p1 = &data;

//declare another pointer, 'p2',

//and point it at the same double

//as p1. Print the address pointed

double* p2 = p1;

//to by each pointer, and the

//value when the * operator is used

//on each pointer. Finally,

cout<<"p1 is "<<p1<<" and *p1 is "<<*p1<<endl;

cout<<"p2 is "<<p2<<" and *p2 is "<<*p2<<endl;

//multiply the value of the original

//double by 2 (you'll have to use

//the * operator for this) and

//print the values of *p1 and *p2 again.

data=data*2;

cout<<"p1 is "<<p1<<" and *p1 is "<<*p1<<endl;

cout<<"p2 is "<<p2<<" and *p2 is "<<*p2<<endl;

return 0;}

p1 is 0x7ffcf18530c8 and *p1 is 3.14

p2 is 0x7ffcf18530c8 and *p2 is 3.14

p1 is 0x7ffcf18530c8 and *p1 is 6.28

p2 is 0x7ffcf18530c8 and *p2 is 6.28

HW6:

#include <iostream>

using namespace std;

int main() {

// a loop to demonstrate 1D arrays

int ii[3] = {1,2,3};

int j=0;

while (j<3) {

cout <<" ii of "<<j<<" is "<<ii[j]<<endl;//prints each element of the array in turn

j++;//increase j by 1 (could be done as a for loop)

}

// a loop ot demonstrate 2D arrays

int LL[2][3] = {1,2,3,4,5,6};//2 rows 3 columns

j=0;

int k;

while (j<2) {

k=0; // do not forget to initialize k here

while (k<3) {

cout<<" LL of "<<j<<" "<<k<<" is "<<LL[j][k]<<endl;

k++;//print the elements like [ 1 2 3 ]

// [ 4 5 6 ]

}

j++;

}

return 0;

}

#include <iostream>

#include <fstream> // include the filestream operations

using namespace std;// remove std:: prefix

int main() {

ofstream myfile;// create output file stream

myfile.open("example.txt");// create output file named example.txt in current directory

myfile<<"write some junk.";// write "write some junk." to the file stream

myfile.close();// save changes and close the file.

return 0;//exit program

}

HOMEWORK 6:

The demonstration code is exactly the same as the code from homework 5.

How are these lemenst distributed in rows and colums?

The distribution starts with filling the first row. once it's full, it moves on to the second row.

Note: you can, and in some cases should, use braces to specify rows and columns. Fins a simple example on google.

If you have a jagged array where the lengths of the rows are not always equal, you have to specify them using {{},{},{}} syntax.

#include <iostream>

#include <fstream>

using namespace std;

int main() {

ofstream myfile;//create output file stream

myfile.open("example.txt");//name the file example.txt

myfile<<"write some junk.";//output some junk to the file

myfile.close();//flush buffer and save file

return 0;//return without error

}

/* DOTPROD.CPP */

#include <iostream>

using namespace std;

double dot_prod(double v1[3],double v2[3]) {//take 2 3-dimensional vectors

double dotdot;

dotdot = v1[0]*v2[0]+v1[1]*v2[1]+v1[2]*v2[2];//multiply each element and add results

cout<<" The dot product is "<<dotdot<<endl;

return 0;

}

Scalarmult File:

#include <iostream>

/* SCALARMULT.CPP */

void scalar_mult(double vec[3], double scalar) {

std::cout<<"The scalar multiple is ("<<

vec[0]*scalar<<","<<//multiply scalar by each element

vec[1]*scalar<<","<<//print the new vector

vec[2]*scalar<<")"<<std::endl;

}

Main file:

/* MAIN.CPP */

#include <iostream>

#include <fstream>

// include the program dotprod.cpp so that we can find the dot_prod function

#include "dotprod.cpp"

#include "scalarmult.cpp"

using namespace std;

int main () {

// declare the vectors

double vector1[3];

double vector2[3];

// open the input file

ifstream infile;

infile.open("vectors.txt");

// store the input in the vectors and print the vectors for the user

infile>>vector1[0]>>vector1[1]>>vector1[2];

cout<<" Vector 1 is ("<<vector1[0]<<","<<vector1[1]<<","<<vector1[2]<<")"<<endl;

infile>>vector2[0]>>vector2[1]>>vector2[2];

cout<<" Vector 2 is ("<<vector2[0]<<","<<vector2[1]<<","<<vector2[2]<<")"<<endl;

double scale;//declare scalar

infile>>scale;//read scalar from file

cout<<"The scalar is "<<scale<<endl;//print scalar

// close the input file

infile.close();

// call the dot_prod function from dotprod.cpp

dot_prod(vector1,vector2);

scalar_mult(vector1,scale);//multiply the vectors by their scalar

scalar_mult(vector2, scale);//both vectors

return 0;

}

Simulation:

#include "Riostream.h"

#include <iostream>

#include <fstream>

some code to help us understand why and how resolutions affect our ability to measure the

mass of a particle

// main routine

void resolutions(){

// set debug mode

bool idebug=true;

// set up the display

gROOT->Reset();

gROOT->SetStyle("Plain");

gStyle->SetOptStat(0);

gStyle->SetOptFit(11);

gStyle->SetErrorX(0);

c1 = new TCanvas("c1","Our data",200,10,700,500);

c1->SetFillColor(0);

c1->SetBorderMode(0);

c1->SetBorderSize(1);

c1->SetFrameBorderMode(0);

// book some histograms

TH1F *histo1 = new TH1F("histo1","mass",1000,0.,200.);

// setup random number generator

gRandom->SetSeed();

// get the secret parameters

ifstream myfile;

myfile.open("secretparameters.txt");

double mass, resE, resA;

myfile>>mass>>resE>>resA;

myfile.close();

// make N bosons at rest

// each boson will decay to 2 back-to-back daughter particles

// they will be back-to-back by conservation of momentum, since the momentum of the mother was zero

// assume that the mass of the daughter particles is small compared to the mother mass so we can

// assume that their energy will be large compared to their mass and we can treat them as massless.

// and thus their energy is equal to the magnitude of their momenta

// by conservations of energy, their energy

// work in 2D. Can always choose my coordinate system so that the 2 daughters are in the same plane

int N=8000;

double etrue1,etrue2,phitrue1,phitrue2; // true energy of the 2 daughters

double e1,px1,py1,phi1; // smeared 4-momenta of daughter 1

double e2,px2,py2,phi2; // smeared 4-momenta of daughter 2

double masssmeared;

for(int i=0;i<N;i++) {

// set true energy

etrue1=mass/2.;

etrue2=mass/2.;

if(idebug) cout<<"etrue "<<etrue1<<" "<<etrue2<<endl;

// choose phi for daughter 1 and daughter 2

phitrue1=2*TMath::Pi()*gRandom->Rndm();

phitrue2 = phitrue1+TMath::Pi();

if(idebug) cout<<"phitrue "<<phitrue1<<" "<<phitrue2<<endl;

// smear true energy with resolution of detector

e1=etrue1+resE*gRandom->Gaus(0.,1.);

e2=etrue2+resE*gRandom->Gaus(0.,1.);

if(idebug) cout<<"e "<<e1<<" "<<e2<<endl;

//smear angles with resolution of the detector

phi1=phitrue1+resA*gRandom->Gaus(0.,1.);

phi2=phitrue2+resA*gRandom->Gaus(0.,1.);

if(idebug) cout<<"phi "<<phi1<<" "<<phi2<<endl;

//calculate 4 momenta after smearing

px1=e1*cos(phi1);

py1=e1*sin(phi1);

px2=e2*cos(phi2);

py2=e2*sin(phi2);

if(idebug) cout<<"pxs "<<px1<<" "<<py1<<" "<<px2<<" "<<py2<<endl;

// calculate smeared mass

masssmeared=sqrt((e1+e2)*(e1+e2) - (px1+px2)*(px1+px2) - (py1+py2)*(py1+py2));

if(idebug) cout<<"masssmeared "<<masssmeared<<endl;

histo1->Fill(masssmeared);

histo1->Draw("");

c1->Update();

}//add the close brace here.

c1->SaveAs("c1.gif");

}

Transverse momentum histogram:

Homework 7:

Top quark histogram:

Homework 8:

#define HiggsAnalysis_cxx

#include "HiggsAnalysis.h"

#include <TH2.h>

#include <TStyle.h>

#include <TCanvas.h>

void HiggsAnalysis::Loop()

{

// To read only selected branches, Insert statements like:

// METHOD1:

// fChain->SetBranchStatus("*",0); // disable all branches

// fChain->SetBranchStatus("branchname",1); // activate branchname

// METHOD2: replace line

// fChain->GetEntry(jentry); //read all branches

//by b_branchname->GetEntry(ientry); //read only this branch

if (fChain == 0) return;

Long64_t nentries = fChain->GetEntriesFast();

Long64_t nbytes = 0, nb = 0;

TFile* output = TFile::Open("Dielectron_MC.root", "RECREATE"); // "RECREATE" would produce a new root file with name Dielectron_MC.root every time you run the code

TH1F* Z_ee = new TH1F("Z_ee", "Di-electron candidate invariant mass", 200, 0, 200);//histogram of first Z boson

TH1F* H_zz = new TH1F("H_zz", "ZZ candidate invariant mass", 200, 0, 300);// histogram of original higgs

TH1F* Z2_ee = new TH1F("Z2_ee", "Di-electron candidate2 invariant mass",200,0,200);//histogram of second Z boson

double el1mt = 0.0;

double el1pt = 0.0;

double el1eta = 0.0;

for (Long64_t jentry=0; jentry<nentries;jentry++) {

Long64_t ientry = LoadTree(jentry);

if (ientry < 0) break;

nb = fChain->GetEntry(jentry); nbytes += nb;

// if (Cut(ientry) < 0) continue;

TLorentzVector el1, el2, el3, el4;// capture 4 electrons from H->ZZ->(e+e-)(e+e-)

el1.SetPtEtaPhiM(f_lept1_pt, f_lept1_eta, f_lept1_phi, 0.0);

el2.SetPtEtaPhiM(f_lept2_pt, f_lept2_eta, f_lept2_phi, 0.0);

el3.SetPtEtaPhiM(f_lept3_pt, f_lept3_eta, f_lept3_phi, 0.0);

el4.SetPtEtaPhiM(f_lept4_pt, f_lept4_eta, f_lept4_phi,0.0);

TLorentzVector zCandidate = el1 + el2;//add together the mass of the first electron pair

TLorentzVector zCandidate2 = el3+el4;//add together the mass of the second electron pair

TLorentzVector hCandidate = zCandidate+zCandidate2;//add together mass of Z bosons to get higgs

Z_ee->Fill(zCandidate.M());//record first Z boson

Z2_ee->Fill(zCandidate2.M());//record second Z boson

H_zz->Fill(hCandidate.M());//record mass of Higgs

el1mt = el1.Mt();

cout << hCandidate.M() << endl;//print higgs mass

}

Z_ee->Write();//save all 3 histograms to the file.

Z2_ee->Write();

H_zz->Write();

output->Close();

}

(Machine learning thingy)

Homework 9.5 code:

import tensorflow as tf #import all the libraries needed to run

from tensorflow import keras

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (10,10) # Make the figures a bit bigger (try changi

from keras.datasets import cifar10

from keras.datasets import mnist #nist database of handwritten digits. A dataset of 6

from keras.models import Sequential #model consists of sequential layers

from keras.layers import Dense, Dropout, Activation # Dropout consists in randomly se

from keras.utils import np_utils

nb_classes = 10 #10 classes (from 0 to 9) # the data, shuffled and split between train and test sets

(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("X_train original shape", X_train.shape)

print("y_train original shape", y_train.shape)

for i in range(9):

plt.subplot(3,3,i+1)#display 9 sample images including their classification

plt.imshow(X_train[i], cmap='gray', interpolation='none')#show image in greyscale

plt.title("Class {}".format(y_train[i]))#show classification

X_train = X_train.reshape(60000, 784)

X_test = X_test.reshape(10000, 784)

X_train = X_train.astype('float32')#train with floating point data (0%-100%)

X_test = X_test.astype('float32')

X_train /= 255 # try commenting these two lines and see if you can train the netwrk

X_test /= 255

print("Training matrix shape", X_train.shape)

print("Testing matrix shape", X_test.shape)

Y_train = np_utils.to_categorical(y_train, nb_classes)

Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()# create neural network

model.add(Dense(512, input_shape=(784,)))#add input layer to hidden layer

model.add(Activation('relu')) # An "activation" is just a non-linear function applied to # of the layer above. Here, with a "rectified linear unit" # we clamp all values below 0 to 0.

model.add(Dropout(0.2)) # Dropout helps protect the model from memorizing or "overfitt

model.add(Dense(512)) #create a second hidden layer

model.add(Activation('relu'))

model.add(Dropout(0.2))

model.add(Dense(10))# create an output layer which picks from each digit 0-9

model.add(Activation('softmax')) #turn output layer into probability map

model.compile(loss="categorical_crossentropy", optimizer="adam")# create AI model

model.fit(X_train, Y_train,batch_size=128, epochs=4,verbose=1,validation_data=(X_test, Y_test))#provide the data set to the AI

score = model.evaluate(X_test, Y_test, verbose=1)#get loss

predicted_classes = model.predict_classes(X_test)#get predicted digits

correct_indices = np.nonzero(predicted_classes == y_test)[0]#count number of correct digit guesses

incorrect_indices = np.nonzero(predicted_classes != y_test)[0]# count number of incorrect digit guesses

plt.figure()

for i, correct in enumerate(correct_indices[:9]):#get first 9 correct digits

plt.subplot(3,3,i+1)

plt.imshow(X_test[correct].reshape(28,28), cmap='gray', interpolation='none')#display the images associated with the correct guesse

plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))

plt.figure()

for i, incorrect in enumerate(incorrect_indices[:9]):#get first 9 incorrect digits

plt.subplot(3,3,i+1)

plt.imshow(X_test[incorrect].reshape(28,28), cmap='gray', interpolation='none')#display the images that tricked the AI

plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))#display the predicted digit and the real digit side-by-side

(x_train, y_train), (x_test, y_test) = cifar10.load_data()#this is a bunch of pictures. If this is loaded instead of mnist, it will categorize the pictures

titanic = pd.read_csv("https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/carData/TitanicSurvival.csv")#open data table about titanic survivors

titanic.head(5)#read first five entries and print them to the screen.

Homework 9 code:

pandas=10 #declare pandas (type int) to be 10.

zebras=20 #declare zebras (type int) to be 20.

total_animals_in_zoo=pandas+zebras# add the pandas and zebras, get 30

print(total_animals_in_zoo)#print 30

pi=3.14159 #Approx.

diameter=10

######YOUR CODE HERE####

radius= diameter/2 #find radius

area = radius*radius*pi#circle area formula

print(area)#output area to screen

########################

#####Method 1########

a = [1, 2, 3] #set a={1,2,3}

b = ['cat', 'dog', 'fish']# set b to list of animals

a=b#now, both a and b are lists of animals, and the original numbers list is gone

b=a#this does nothing

print('Method 1')#start method 1

print(a)#prints a list of animals

print(b)#also prints a list of animals

#####Method 2########

a = [1, 2, 3]#set a to numbers

b = ['cat', 'dog', 'fish']#and b to animals

temp = a#create a temporary copy of the numbers list

a = b#set a to animals (doesn't completely delete the numbers list)

b = temp#set b to the temporary copy of the numbers list

print('Method 2')#start method 2

print(a)#prints animals

print(b)#prints numbers (as expected)

dog_legs=0

human_legs=4

goldfish_legs=2

######YOUR CODE HERE###########

temp=dog_legs#hold on to the value 0

dog_legs=human_legs#give dogs 4 legs

human_legs=goldfish_legs#give humans 2 legs

goldfish_legs=temp#give goldfish stored 0 legs

print("dog legs: ",dog_legs)#prints 4

print("human legs: ",human_legs)#prints 2

print("goldfish legs: ",goldfish_legs)#and prints 0, as expected

def round_to_two_places(num):#declare function that takes a number

return round(num, 2)#return itself, rounded to two places

print(round_to_two_places(4.325456463))#print 4.33, rounding up

def round_to_n_places(num, places):#declare general rounding function with optional number of places

#####Your Code Here############

return round(num,places)#return num rounded to (places) digits

print(round_to_n_places(pi,1))#test rounding pi to various digit places

print(round_to_n_places(pi,2))#3.14

print(round_to_n_places(pi,3))#3.142

print(round_to_n_places(pi,4))#3.1416

a=[1,5435,324,3,645365,23,4,5665,435,31, -543] #a is a list with a bunch of numbers

b=-10#b is negative, abs(b) should be positive

c='cat'#cat is 3 long

print(c)#prints "cat"

print(abs(b)) #absolute value

print(min(a)) #min value

print(max(a)) #max value

print(sum(a)) #sum

print(len(c)) #length of a string

print(sorted(a)) #sorts a list

a=[3243,645645,54346]#a=list1

b=[7,3,2]#b=list2

def product_of_maxes(list1, list2):#the function takes 2 list arguments

########YOUR CODE HERE#############

max1 = max(list1)#find first max

max2 = max(list2)#find second max

return max1*max2#return the producs

print(product_of_maxes(a,b))#prints 7*645645 or 4519515

print(3==10)# check equality between 3 and 10 (False)

print(len('abc')==3)#check length of 'abc' and compare to 3 (True)

print(max([10,40,53]) < 25) #find max (53) and compare to 25 (False)

print (5 != 3)#test if 5 is not equal to 3 (True)

#####YOUR CODE HERE###########

#write code that evaluates? ok then

x=eval("1==3 or 4>2")# reduces to (False or True), which evaluates to True

print(x)#print True

a=10#a is now 10

b=20#b is now 20

if(a>b):#fails because 20>10

print(a, 'is greater than', b)#prints a>b if a is indeed bigger than b

elif(a<b):#proceeds because 20>10

print(b, 'is greater than', a)#prints b>a if b is bigger than a

else:#handle other scenarios

print("they're equal")#print they're equal when neither is larger

a=40#test for the opposite

if(a>b):#succeeds because 40>20

print(a, 'is greater than', b)#prints a>b if a is indeed bigger than b

elif(a<b):#fails the first if activated

print(b, 'is greater than', a)#prints b>a if b is bigger than a

else:#handle other scenarios

print("they're equal")#print they're equal when neither is larger

b=40#b test for equality

if(a>b):#fails because 40==40

print(a, 'is greater than', b)#prints a>b if a is indeed bigger than b

elif(a<b):#fails because 40==40

print(b, 'is greater than', a)#prints b>a if b is bigger than a

else:#handle other scenarios

print("they're equal")#print they're equal when neither is larger

def longer_string(string1, string2):#function of 2 strings

if(len(string1) > len(string2)):#run if string1 is longer

return string1#return the longer string

elif(len(string1) < len(string2)):#run if string2 is longer

return string2#return the longer string

else:#when they're the same size

return#don't return either

print(longer_string('abc', 'jfkdla;s'))#print jfkdla;s

print(longer_string('abcfdsafdasg', 'jfkdla;s'))#print abcfdsafdasg

print(longer_string('abc', 'def'))#print None (they're the same length)

def can_make_cake(eggs, flour, milk, almond_milk):#define cake-making function taking ingredients as arguments

#I can make cake if I have some kind of milk AND flour AND eggs

return (milk or almond_milk) and flour and eggs > 0#

print('Can I make cake?')#print the practice introduction

#What is the following line implying?

print(can_make_cake(10, True, False, True))#print true

#redo the above call to the function with 0 eggs

### your code here ###

print(can_make_cake(0,True,False,True))#print false because we need more than 0 eggs

a = [-35423,-5432654,-352435,53252]

def abs_of_min_is_odd(thelist):#take a single list argument

###Your logic here###

isodd = [False,True][abs(min(a))%2]#when abs(min(a)) is even, gets the first element, which is False. When abs(min(a)) is odd, gets True

print(["even","odd"][isodd]) #prints even because abs(min) is 5432654, which is even

abs_of_min_is_odd(a)#call the function with a

def select_third_element(my_list):#

if(len(my_list) < 3):#if the list is too small to have a third element

return None#return None (python equivalent of null)

else:#otherwise

return my_list[2] #remember index 0, so third element is index 2

foo = ['a', 'b', 'c', 'd']# practically the same as foo="abcd"

print(select_third_element(foo))#prints c

list1=[1,2,3]#create rows of a matrix

list2=[4,5,6]

list_of_lists=[list1,list2]#create the double-indexed matrix

print(list_of_lists[1][2])#pick row 2, element 3, which is a 6

print(list_of_lists[0])#pick row 1, or list1 and print [1,2,3]

print(list_of_lists[1])#pick row 2, or list2 and print [4,5,6]

print(list_of_lists[0][1])#pick cell from row 1, column 2 (2)

print(list_of_lists[0][2])#pick cell from row 1, column 3 (3)

print(list_of_lists[1][0])#pick cell from row 2, column 1 (4)

real_madrid=['zidane', 'ramos', 'marcelo']#obviously the best team

barcelona=['valverde', 'messi', 'pique']#obviously the worst team

teams = [real_madrid, barcelona]#put the teams in a list

def losing_team_captain(team_list):#create function to identify the sacrifice... I mean losing team's captain

#######Your code here###########

losing_team=team_list[len(team_list)-1]#pick from the end of the list

print(losing_team[1])#print captain of losing team (messi)

losing_team_captain(teams)#call function with the teams

standings=['mario', 'bowser', 'luigi','peach']#good job mario, you're winning. And peach sucks

def purple_shell(racers):#uh, oh, I think peach is gonna get her revenge

#######YOUR CODE HERE##########

back=racers[len(racers)-1]#hold on to a copy of the last place

racers[len(racers)-1]=racers[0]#bring the frontrunner to last place

racers[0]=back #the former last-place person is now in front

purple_shell(standings)#make peach use a purple shell

print(standings)#print the updated standings with peach in front and mario at the back

def list_contains_seven(my_list):#function identifying sevens

for element in my_list:#go through the list

if element ==7:#if there is a seven

return True#return true

return False#return false when there are no sevens :(

print(list_contains_seven([6,14,5,7]))#print True

print(list_contains_seven([6,14,5,9]))#print False

def count_to_10(num):#counts to 10, starting from num+1

while(num<10):#so long as num is less than to

num=num+1#icrease num

print(num)#then print it (this can print 10)

count_to_10(0)#print 1,2...10

count_to_10(5)#prints 6,7...10

count_to_10(11)#prints nothing

list=[1,21,7,-14,49,74,700]#this has a lot of numbers divisible by seven (there are 5)

def seven_counter(my_list):#count the number of multiples of seven

return sum([b%7==0 for b in my_list])#sum up the occurences of multiples of seven

print(seven_counter(list))#prints 5

############################DO NOT TOUCH THIS CODE#########################

deck={} #represents a deck of cards where the keys are the suits and values are lists of card values

hearts=[] # represents all the hearts in a deck

clubs=[] # represents the clubs in a deck

diamonds=[] # represents the diamonds in a deck

spades=[]# represents the spades in a deck

values=['ace', 'two', 'three', 'four', 'five', 'six','seven', 'eight', 'nine', 'jack', 'king', 'queen'] #creates a list of possible card values

def create_cards(suit):#function that copies the contens of values into the list given as an argument

for value in values:#for each element

suit.append(value)#add it to the given list

create_cards(hearts)#create all the hearts

create_cards(clubs)#create all the clubs

create_cards(diamonds)#create all the diamonds

create_cards(spades)#create all the spades

##################################Add your code here according to the comments ##########################################

#Use the strings 'h', 's', 'c', 'd' as keys, and add the lists of values to the dictionary.

deck['h']=hearts#assign hearts

deck['s']=spades#assign spades

deck['c']=clubs#assign clubs

deck['d']=diamonds#assign diamonds

#Print a view of dictionary (key, value) pairs

print(deck)

#Print a view of all of the keys

print(deck.keys())

#Print a view of all of the values

print(deck.values())

#Remove all of the spades from the deck

deck.pop('s')

print(deck)

#Add a new entry to the dictionary with key 'j' and values 'joker1' and 'joker2'

deck['j']=['joker1','joker2']

print(deck)

#Clear the dictionary

deck.clear()

print(deck)

import math as m #I use the as m part so when I refer to the library I can just type m instead of math. Its just an abbreviation.

print(m.pi, m.log(32, 2))#print 3.1415... then print 5

print(m.gcd(75348597,979531683))#such factor, much aglorithm, wow

print(m.cos(10))#cos 10 radians is about -0.84. it's pretty close to -sin(1)

#Here's a more advanced example.

import matplotlib#importing math & graphics libraries

import matplotlib.pyplot as plt

import numpy as np

# Data for plotting

t = np.arange(0.0, 2.0, 0.01)#give t as an interval from 0 to 2 with a step of .01 between

s = 1 + np.sin(2 * np.pi * t) # is s a list variable? awesome! I guess np.sin responds with a custom math object which overrides the + operator. This is a bit much to expect us to understand, but it's cool nonetheless.

fig, ax = plt.subplots()# retrieving easy-to-use plot functionality

ax.plot(t, s)#plot the graph of s given t as a range

ax.set(xlabel='time (s)', ylabel='voltage (mV)',#label t with time and s with voltage (a sine wave)

title='About as simple as it gets, folks')#yeah, right

ax.grid()#enable grid display so we know the coordinate system

plt.show()#display the plot

#in case you were wondering, I wrote every single comment in this homework before running the code.

data = {'a': np.arange(50),#set data to be a dictionary with 3 numpy variables. a=[0..49]

'c': np.random.randint(0, 50, 50),#c= random from 0 to 50

'd': np.random.randn(50)}#d= some random double centered at 0

data['b'] = data['a'] + 10 * np.random.randn(50)#add a new variable, b=(index)+ a random number

data['d'] = np.abs(data['d']) * 100#reassign d to its absolute value

plt.scatter('a', 'b', c='c', s='d', data=data)#draw a scatter plot, color=c, size=d

plt.xlabel('entry a')#give the horizontal axis label a (the value which ranges from 0-49)

plt.ylabel('entry b')#label the vertical axis b (linear plus a random factor)

plt.show()#display the plot

#now I have to make my own numpy thing? if you say so...

t=np.arange(0,7,.01)#parametric plot, ranging t from 0-7

xx=[]#array of x points

yy=[]#and y points

x0=0#placement of x

y0=0#and y

for tt in t:#integrals are done best in a loop

x0+=m.cos(2*m.sin(tt))/100#increase x

y0+=m.sin(2*m.sin(tt))/100#increase y

xx.append(x0)#add x to xlist

yy.append(y0)#add y to ylist

x0=0#reset x for negative side

y0=0#reset y for negative side

xx=xx[::-1]#reverse list to fix discontinuity

yy=yy[::-1]#reverse list to fix discontinuity

for tt in t:#run through the negative plot

x0-=m.cos(2*m.sin(tt))/100#decrease x

y0+=m.sin(2*m.sin(tt))/100#increase y

xx.append(x0)#add x to xlist

yy.append(y0)#add y to ylist

fig, ax = plt.subplots()# retrieving easy-to-use plot functionality

ax.plot(np.array(xx), np.array(yy))#plot the graph of s given t as a range

ax.set(title='awesome dragon curves')#give it a righteous title

ax.grid()#enable grid display so we know the coordinate system

plt.show()#display the plot

(images)

(the better sine waves)

Page updated

Report abuse