MSQE (2020-2021)
Introduction to R software:
R was begun in the 1990s by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland. Nearly 20 senior statisticians make up the core development group of the R language, including the primary developer of the original S language, John Chambers, of Bell Labs. R contains a large, coherent, integrated collection of intermediate tools for data analysis. The software is free and can be downloaded from http://www.r-project.org/. It can be used in virtually every domain, and its coding structure is relatively easy to learn. Moreover, R is not only for coding: with R Markdown you can also prepare manuscripts and presentations.
A major strength of the software is its packages. Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. At the time of writing, the CRAN package repository features 10964 available packages (https://cran.r-project.org/web/packages/). To install packages, type install.packages(pkgs, dependencies=TRUE) in the console, where pkgs is a character vector of package names.
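As a rough illustration of the install-then-load workflow, the sketch below installs a package only if it is missing and then loads it. The helper name ensure_package is hypothetical (it is not part of R), and "stats" is used only because it ships with every R installation, so nothing is downloaded.

```r
# ensure_package() is a hypothetical helper written for this example.
# It installs a package from CRAN only when it is not already available,
# then attaches it with library().
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE)  # needs an internet connection
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

ensure_package("stats")  # ships with base R, so nothing is downloaded
.libPaths()              # the library folder(s) where installed packages are stored
```

Replacing "stats" with the name of any CRAN package should follow the same path, with the download happening only on the first call.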
I have used this software in most of my research work. If you run into any difficulty, do not hesitate to contact me; my contact details are provided on my home page. This page will be updated as per your needs.
References:
An Introduction to R, Longhow Lam. (Link for the material: https://cran.r-project.org/doc/contrib/Lam-IntroductionToR_LHL.pdf)
Applied Statistical Inference: Likelihood and Bayes, Leonhard Held and Daniel Sabanés Bové, Springer-Verlag Berlin, 2014.
The R Student Companion, Brian Dennis, CRC Press, 2013.
An Introduction to Statistical Learning with Applications in R, James, Witten, Hastie and Tibshirani, Springer Texts in Statistics, 2013.
Using R for Numerical Analysis in Science and Engineering, Victor A. Bloomfield, CRC Press, Taylor and Francis Group, 2014.
A Primer of Ecology with R, M. Henry H. Stevens, Springer, 2009.
Statistical Modeling: The Two Cultures, Leo Breiman, Statistical Science, 2001, Vol. 16, No. 3, 199-231.
The Art of R Programming, Norman Matloff.
An R Companion for the Handbook of Biological Statistics, Salvatore S. Mangiafico. (Link for the material: https://rcompanion.org/documents/RCompanionBioStatistics.pdf)
Introduction to R Programming:
17th February 2021 (11.30 a.m. - 12.20 p.m.)
Before diving into the statistical analysis of data, we should become a little familiar with the R software. The following lecture will help you build a basic understanding of R code.
# Calculation
5+9
5-9
5*9
5/9
# Data and Data Vectors
grade = 90
grade
### Vectors are created with the c() function
grade = c(95, 85, 100, 90, 100) ## Vector Creation
grade
length(grade)
name = c("Liili", "Priya", "John", "Bob", "Julia")
name
## CTRL + ENTER is the keyboard shortcut to run the current line or the selected code.
## Data frame
mydata = data.frame(grade, name)
mydata
## Functions
dim(mydata)
nrow(mydata) ## nrow = number of row
ncol(mydata) ## ncol = number of column
str(mydata) ## str stands for the structure of my data set
summary(mydata) ## summary will provide you the basic statistical
## information
temp = mydata$grade
mean(temp)
sd(mydata$grade) ### sd stands for the standard deviation
var(temp)
# Directory and Files
getwd() ## Command for getting the working directory
setwd("C:/Users/user/Documents") ### Set the working directory; change the path to a folder on your machine
write.csv(mydata, "myfile.csv", row.names = FALSE) ## write.csv exports the data set as a
## comma-separated values (CSV) file; row.names = FALSE leaves out the row numbers
data = read.csv("myfile.csv", header = TRUE)
View(data) ## V is in capital letter
str(data)
## Data with R
data()
data(iris)
str(iris)
summary(iris)
head(iris, 2) ### Checking the first two rows of the data set
tail(iris) ## Checking the last six rows of the data set
iris[1:3, 1:2] ### iris[Row Number,Column Number]
iris[1:3, ]
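The bracket notation iris[Row, Column] also accepts column names and logical conditions. A short sketch using the built-in iris data follows; the object names first_rows, setosa and species_means are chosen only for this example.

```r
data(iris)

# Select columns by name instead of by position
first_rows <- iris[1:3, c("Sepal.Length", "Species")]
first_rows

# Keep only the rows that satisfy a condition (here: setosa flowers)
setosa <- iris[iris$Species == "setosa", ]
nrow(setosa)

# Mean sepal length within each species
species_means <- tapply(iris$Sepal.Length, iris$Species, mean)
species_means
```

Leaving the column position empty, as in the second selection, keeps all columns for the chosen rows.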
Graphical Analysis in R software
26th February 2021 (11.30 a.m. - 12.20 p.m.)
Constructing graphs is important in every domain of research. With this in mind, today we will give a brief demonstration of how to construct graphs in the R software.
### Graphical approach in R software ####
### First, I will show some basic methods
### Create a vector
x = c(1, 5, 7, 9, 11, 13, 20, 25); x
plot(x)
### In the following three codes note the differences
plot(x, type = "l") ### Draws a line graph
plot(x, type = "p") ### Draws points only, i.e. a scatter plot
plot(x, type = "b") ### Draws both points and lines
### Some basic properties of the graph
plot(x, type = "b", xlab = "week", ylab = "price") ### Adding the axis labels to the graph
plot(x, type = "b", xlab = "week", ylab = "price", main = "Price increment in 8 weeks") ### Adding a title above the graph
plot(x, type = "b", xlab = "week", ylab = "price", sub = "Price increment in 8 weeks") ### Adding a subtitle below the graph
plot(x, type = "b", xlab = expression(bold(Week)),
     ylab = expression(bold(Price)), main = "Price increment in 8 weeks") ## Making the axis labels bold
plot(x, type = "b", xlab = expression(bold(Week)),
     ylab = expression(bold(Price)), main = "Price increment in 8 weeks", cex.lab = 1.3) ## Increasing the label size
plot(x, type = "b", xlab = expression(bold(Week)),
     ylab = expression(bold(Price)), main = "Price increment in 8 weeks",
     cex.lab = 1.3, col = "blue", lwd = 2) ## Setting the colour and the line width
plot(x, type = "b", xlab = expression(bold(Week)),
     ylab = expression(bold(Price)), main = "Price increment in 8 weeks",
     cex.lab = 1.3, col = "blue", lwd = 2, xlim = c(1, 7), ylim = c(0, 15)) ## Restricting the axis ranges; points outside them are not drawn
### Adding legend to the graph
legend("topleft", legend = ("Stock market price"), col = "blue", lwd = 2, bty = "n")
#### Exporting the graph #####
### 1. In RStudio, look at the Plots pane on the right-hand side where the plot has appeared.
### Just above the graph there is an "Export" button.
### 2. Click on it and select the type, i.e. either PDF or image.
### 3. For an image, a new window will appear after you choose that option;
### pick a file name and folder there and save your image.
#### Adding another series to an existing graph
y = seq(1, 10); y ## seq() generates a sequence of numbers
lines(y, col = "red", lwd = 2, type = "b") ## lines() superimposes a new series on the existing plot
plot(x, type = "b", xlab = expression(bold(Week)),
     ylab = expression(bold(Price)), main = "Price increment in 8 weeks",
     cex.lab = 1.3, col = "blue", lwd = 2, xlim = c(1, 7), ylim = c(0, 15)) ## Redrawing the original graph
points(y, col = "red", lwd = 2) ## points() superimposes the new series as points only
##### Graphical presentation of x vs y ######
#plot(x, y, type = "b", col = "green", lwd = 1.5) #### Uncomment and run this line, then look at the output carefully and try to understand it (x and y have different lengths)
x = seq(1, 20); y = seq(21, 40)
plot(x, y, lwd = 3, col = "orange", type = "b") #### Try to explain the functional form of this graph
### Try to understand the difference between a bar plot and a histogram
### barplot() draws one bar for each value it is given, so it suits frequencies of categories
### hist() groups a numeric variable into bins and can show either frequencies or probability densities
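#### Detailed analysis of the histogram
x = rnorm(n = 1000, mean = 0.3, sd = 1.2); x
h1 = hist(x) ### frequency histogram (counts per bin)
par(mfrow = c(2,2)) ### split the plotting window into a 2 x 2 grid
h2 = hist(x, probability = TRUE, col = "orange") ### density histogram (bar areas sum to 1)
h3 = hist(x, probability = TRUE, breaks = 25, col = "grey") ### about 25 bins; breaks is only a suggestion
print(h3) ### breaks, counts, densities and midpoints of the bins
barplot(x) ### one bar per value of x, so this is not a distribution plot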
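To make the contrast between the two plot types concrete, here is a small sketch on the built-in iris data. barplot() is given the counts of a categorical variable, while hist() bins a continuous one; with probability = TRUE the bar areas add up to 1. The object names species_counts, h and total_area are chosen only for this example.

```r
data(iris)

# Categorical variable: count each category first, then draw one bar per category
species_counts <- table(iris$Species)
barplot(species_counts, xlab = "Species", ylab = "Frequency")

# Continuous variable: hist() groups the values into bins
h <- hist(iris$Sepal.Length, probability = TRUE,
          xlab = "Sepal length", main = "Histogram of sepal length")

# On the density scale, bar height times bin width sums to 1
total_area <- sum(h$density * diff(h$breaks))
total_area
```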
#### Sampling Experiment
y = rnorm(500, 0, 1)
hist(y)
z = rnorm(1000, 0, 1)
hist(z)
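One way to read the sampling experiment is to compare how close the sample mean and standard deviation come to the true values 0 and 1 as the sample size changes. A minimal sketch, assuming a fixed seed: set.seed(123), the sample sizes and the object names are arbitrary choices for this example and are not part of the lecture code.

```r
set.seed(123)  # arbitrary seed so the random draws can be reproduced

sizes <- c(50, 500, 5000)

# For each sample size, draw from N(0, 1) and record the sample mean and sd
results <- t(sapply(sizes, function(n) {
  s <- rnorm(n, mean = 0, sd = 1)
  c(n = n, mean = mean(s), sd = sd(s))
}))
results

# Histogram of a large sample with the true N(0, 1) density drawn on top
big <- rnorm(5000, 0, 1)
hist(big, probability = TRUE, breaks = 30, main = "5000 draws from N(0, 1)")
curve(dnorm(x, mean = 0, sd = 1), add = TRUE, col = "red", lwd = 2)
```

Each row of results corresponds to one sample size, so the rows can be compared directly.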