PGDARSMA (2022)

Introduction to R software

R was started in the 1990s by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland. Nearly 20 senior statisticians form the core development group of the R language, including John Chambers of Bell Labs, the primary developer of the original S language. R contains a large, coherent, integrated collection of intermediate tools for data analysis. The software is free and can be downloaded from http://www.r-project.org/. Its versatility lies in the fact that it can be used in almost any domain, and its coding structure is comparatively easy to learn. Moreover, R is not only for coding: with R Markdown you can also prepare manuscripts and presentations in it. The major strength of the software is its packages. Packages are collections of R functions, data, and compiled code in a well-defined format, and the directory where packages are stored is called the library. At the time of writing, the CRAN package repository features 10964 available packages (https://cran.r-project.org/web/packages/). To install packages, type install.packages(pkgs, dependencies=TRUE) in the console, where pkgs is a character vector of package names.
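As a small illustration of the install step, the sketch below checks which packages are already in the library before installing anything. The two package names are the ones loaded further down this page (DescTools and e1071); the variable names needed and have are my own, and the install line is left commented out so that nothing is downloaded unless you choose to.

```r
# Packages loaded later on this page; adjust the vector to your own needs.
needed <- c("DescTools", "e1071")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
have <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(have)

# Uncomment to install only the missing packages (needs an internet connection).
# install.packages(needed[!have], dependencies = TRUE)

# The library directories that R searches for installed packages.
print(.libPaths())
```

Once a package is installed, library(DescTools) attaches it for the current session.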

I have used this software in most of my research work. If you run into any difficulty, do not hesitate to contact me; my contact details are provided on my home page. This page will be updated as per your needs.

Visualization of Statistics with the R software

12th February 2022 (11.00 a.m. - 11.30 a.m.)

In today's class we will learn how to handle a real-life data set and then go through the procedure of statistical analysis for it. Initially, Prof. Bhattacharya will demonstrate the techniques of sampling theory through the R software. Whatever you have learnt through the theoretical methods will be visualized with the R software.

####### Data


Marks = rnorm(1000, 65, 10) 

### rnorm() draws a random sample from the normal distribution:

## here 1000 values with mean 65 and standard deviation 10

Marks = round(Marks, 0)


Marks


#######graphical representation


plot(Marks, xlab="students no.", ylab="Marks")


######frequency distribution


par(mfrow=c(2, 2))


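hist(Marks, probability = FALSE)


# To plot densities instead of counts and overlay a kernel density estimate:

# hist(Marks, probability = TRUE)

# lines(density(Marks, kernel = "gaussian"))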


##########sample of size 100


sample100=sample(Marks, 100)


sample100

######sample frequency distribution


hist(sample100, col="red")


##########sample of size 500


sample500=sample(Marks, 500)


sample500


######sample frequency distribution


hist(sample500, col="blue")


##########100 samples, each of size 100


samplecomb=matrix(0, 100, 100)

## matrix() creates a 100 x 100 matrix of zeros; row i will hold the i-th sample.

for (i in 1:100)


{


samplecomb[i, ]=sample(Marks, 100)


}


samplecombrwmean=rowMeans(samplecomb[, 1:100])


hist(samplecombrwmean, col="green")
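The histogram of row means can also be checked numerically. The sketch below regenerates a population like Marks (with an arbitrary seed of my choosing, so the numbers differ from those shown in class) and compares the spread of 100 sample means with the theoretical standard error. Because sample() draws without replacement from a finite population of 1000, the usual sigma/sqrt(n) is multiplied by a finite population correction. Names such as sample_means and se_theory are illustrative only.

```r
set.seed(1)  # arbitrary seed, only so that this sketch is reproducible
Marks <- round(rnorm(1000, 65, 10), 0)

n <- 100
# Draw 100 samples of size n and keep the mean of each one.
sample_means <- replicate(100, mean(sample(Marks, n)))

# Population SD (divisor N, not N - 1), as used in sampling theory.
N <- length(Marks)
sigma <- sqrt(mean((Marks - mean(Marks))^2))

# Standard error of the mean under sampling without replacement.
se_theory <- sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(c(population_mean = mean(Marks),
        mean_of_sample_means = mean(sample_means)))
print(c(sd_of_sample_means = sd(sample_means),
        se_theory = se_theory))
```

The two means should be close, and the standard deviation of the sample means should be roughly one tenth of the population standard deviation, which is why the green histogram is much narrower than the histogram of Marks.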


############# Data

Marksmsc1 = rnorm(1000, 65, 10)



Marksmsc1 = round(Marksmsc1, 0)


############# Data


set.seed(897)


Marksmsc2 = rnorm(1000, 71, 10) + rnorm(1000, 0, 2)


Marksmsc2 = round(Marksmsc2, 0)


#######graphical representation


plot(Marksmsc1, Marksmsc2)


Marksmscdata=data.frame(Marksmsc1, Marksmsc2)


cormsc12=cor(Marksmsc1, Marksmsc2)


regmsc12=lm(Marksmsc2 ~ Marksmsc1, data = Marksmscdata)


abline(regmsc12, col = "red")
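To read the fitted line numerically, the sketch below prints the correlation and the regression coefficients and checks the textbook identity slope = r * sd(y) / sd(x). Note one assumption: here set.seed(897) is called before both vectors are generated, so the numbers will not match the session above exactly. Since the two sets of marks are simulated independently, the correlation should be close to zero and the red line nearly flat.

```r
set.seed(897)  # seed placed before both draws in this sketch
Marksmsc1 <- round(rnorm(1000, 65, 10), 0)
Marksmsc2 <- round(rnorm(1000, 71, 10) + rnorm(1000, 0, 2), 0)

# Pearson correlation and simple linear regression of Marksmsc2 on Marksmsc1.
r <- cor(Marksmsc1, Marksmsc2)
fit <- lm(Marksmsc2 ~ Marksmsc1)

print(r)
print(coef(fit))  # intercept and slope

# In simple regression the slope equals r * sd(y) / sd(x).
slope_check <- r * sd(Marksmsc2) / sd(Marksmsc1)
print(all.equal(unname(coef(fit)[2]), slope_check))
```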

Introduction to R Programming

12th November 2022 (11.30 a.m. - 12.30 p.m.)

Before diving into the statistical analysis of the anthropometric data, we should become a little familiar with the R software. The following lectures will help you build a basic understanding of R code.

x = 3

y = 4

print(x)


z = x + y

print(z)


z = x-y

print(z)


z = x*y

print(z)


z = x^2

print(z)
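The same pattern extends to the remaining arithmetic operators. The short sketch below reuses x = 3 and y = 4 from above; the results shown in the comments follow directly from the arithmetic.

```r
x <- 3
y <- 4

print(x / y)              # division: 0.75
print(y %/% x)            # integer division: 1
print(y %% x)             # remainder (modulus): 1
print(sqrt(x^2 + y^2))    # square root: 5
```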


# Idea of storing numbers in R

time = 1:12

print(time)

length(time)

size = c(3.4, 4.2, 4.6, 5.1, 5.8, 6.1, 6.3, 6.8,7.1,9.3, 9.5, 9.9)

length(size)

data = data.frame(time, size)

print(data)

View(data)
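View() only works in an interactive session, so it helps to know a few ways of inspecting a data frame from the console. The sketch below rebuilds the same time and size values under the illustrative name growth (my own choice, so that the objects above are left untouched).

```r
time <- 1:12
size <- c(3.4, 4.2, 4.6, 5.1, 5.8, 6.1, 6.3, 6.8, 7.1, 9.3, 9.5, 9.9)
growth <- data.frame(time, size)

str(growth)            # column names, types and first few values
print(head(growth, 3)) # first three rows
print(summary(growth$size))

# Select one value, and the rows satisfying a condition.
print(growth$size[3])              # third size value: 4.6
big <- growth[growth$size > 6, ]   # rows where size exceeds 6
print(big)
```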


plot(time, size)

plot(time, size, col = "red")

plot(time, size, col = "red", type = "l")

plot(time, size, col = "red", type = "b")

plot(time, size, col = "red", type = "b", pch = "*")

plot(time, size, col = "red", type = "b", pch = "*", cex = 3)

plot(time, size, col = "red", type = "b", pch = "*", cex = 3, lwd = 2)

plot(time, size, col = "red", type = "b", pch = "*", cex = 3, lwd = 2, xlab ="Time (weeks)")

plot(time, size, col = "red", type = "b", pch = "*", cex = 3, lwd = 2, xlab ="Time (weeks)", ylab = "Size (in cm)")


Fundamentals of Statistics with the R software

12th November 2022 (4.00 p.m. - 5.00 p.m.)

In this session we will learn the basic statistical methods through the R software, following the syllabus structure of PGDARSMA. The tools demonstrated below are: cumulative frequency (ogive), histogram and bar diagram, box plot, mean, median and mode, variance and standard deviation, skewness and kurtosis, and the correlation coefficient.

#### Cumulative Frequency/Ogive #####


# declaring data points

data_points= c(1, 2, 3, 5, 1, 1, 2,4, 5, 1, 2, 3, 3)

# declaring the break points

break_points = seq(0, 6, by=1)

# transforming the data

data_transform = cut(data_points, break_points,right=FALSE)

# creating the frequency table

freq_table = table(data_transform)

# printing the frequency table

print("Frequency Table")

print(freq_table)

# calculating cumulative frequency

cumulative_freq = c(0, cumsum(freq_table))

print("Cumulative Frequency")

print(cumulative_freq)

# plotting the data

plot(break_points, cumulative_freq,

     xlab="Data Points",

     ylab="Cumulative Frequency")

# creating line graph

lines(break_points, cumulative_freq)
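An ogive is often read on a relative scale, for example to locate the class that contains the median. The sketch below repeats the same data points and break points, divides the cumulative frequencies by the number of observations, and finds the first break point below which at least half of the data lie. The names rel_cumulative and median_class_upper are illustrative.

```r
data_points <- c(1, 2, 3, 5, 1, 1, 2, 4, 5, 1, 2, 3, 3)
break_points <- seq(0, 6, by = 1)

# Frequency table over the classes [0,1), [1,2), ..., [5,6).
freq_table <- table(cut(data_points, break_points, right = FALSE))
cumulative_freq <- c(0, cumsum(freq_table))

# Cumulative relative frequency: proportion of observations below each break.
rel_cumulative <- cumulative_freq / length(data_points)
print(round(rel_cumulative, 3))

# Upper limit of the median class: first break with at least 50% below it.
median_class_upper <- break_points[which(rel_cumulative >= 0.5)[1]]
print(median_class_upper)   # 3, so the median class is [2,3)
```

This agrees with median(data_points), which is 2 and therefore falls in the class [2,3).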


### Drawing a histogram and bar diagram ###

### from Excel data saved as a CSV file ###


setwd("C:/Users/user/Desktop/Giridih")

data = read.csv("agri.csv")

View(data)

age = data$age


hist(age, probability = TRUE)

# barplot(age) would draw one bar per row, so tabulate the ages first

barplot(table(age))
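Since agri.csv may not be on your machine, the sketch below shows the same read-and-plot steps on a small made-up file written to a temporary location. The ages and sbp values are hypothetical and only stand in for the real columns; the object name agri is also my own.

```r
# Write a tiny hypothetical data set to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
toy <- data.frame(age = c(34, 45, 34, 52, 45, 34),
                  sbp = c(118, 130, 121, 142, 135, 120))
write.csv(toy, csv_path, row.names = FALSE)

# Read it back exactly as one would read agri.csv.
agri <- read.csv(csv_path)
str(agri)

# Bar diagram: one bar per distinct age, height = number of people.
age_counts <- table(agri$age)
print(age_counts)
barplot(age_counts, xlab = "Age", ylab = "Frequency")

# Histogram on the density scale.
hist(agri$age, probability = TRUE, xlab = "Age", main = "Histogram of age")
```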


## Box plot

boxplot(age)



### Mean, median, mode (add na.rm = TRUE if age has missing values)


mean(age)

median(age)


library(DescTools)

Mode(age)
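If DescTools is not installed, the mode can also be found with base R, and missing values can be handled with na.rm = TRUE. The ages below are hypothetical (including one deliberate NA), and the name mode_age is illustrative; this is a minimal sketch, not a replacement for Mode().

```r
# Hypothetical ages with one missing value.
age <- c(23, 35, 35, 41, 29, 35, 52, NA, 41)

print(mean(age))                  # NA, because of the missing value
print(mean(age, na.rm = TRUE))    # 36.375
print(median(age, na.rm = TRUE))  # 35

# Base-R mode: tabulate (table() drops NA by default) and pick the
# value(s) with the highest count.
tab <- table(age)
mode_age <- as.numeric(names(tab)[tab == max(tab)])
print(mode_age)                   # 35
```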


## Variance, SD


var(age)

sd_age = sqrt(var(age))  # same as sd(age); named sd_age to avoid confusion with the built-in sd()
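It is worth seeing once what var() computes. R uses the sample variance with divisor n - 1, and the standard deviation is its square root. The sketch below verifies this on hypothetical ages; var_manual and sd_manual are illustrative names.

```r
# Hypothetical ages, standing in for the age column.
x <- c(23, 35, 35, 41, 29, 35, 52, 41)
n <- length(x)

# Sample variance: sum of squared deviations divided by (n - 1).
var_manual <- sum((x - mean(x))^2) / (n - 1)
sd_manual <- sqrt(var_manual)

print(c(var_manual = var_manual, var_builtin = var(x)))
print(c(sd_manual = sd_manual, sd_builtin = sd(x)))
```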


## Skewness and Kurtosis 

x = c(0.012,0.092,0.107,0.026,0, 0, 0, 0,0,0)

library(e1071)

skewness(x)

kurtosis(x)
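For reference, the plain moment-based coefficients can be computed in base R as sketched below. The helper name moment_coefs is hypothetical. To my understanding these correspond to type = 1 in e1071, while the default calls above use a slightly different small-sample variant, so the values need not match exactly; see ?skewness to confirm. Kurtosis is reported here as excess kurtosis (0 for a normal distribution).

```r
# Moment coefficients of skewness and excess kurtosis (illustrative helper).
moment_coefs <- function(v) {
  d <- v - mean(v)
  m2 <- mean(d^2)   # second central moment
  m3 <- mean(d^3)   # third central moment
  m4 <- mean(d^4)   # fourth central moment
  c(skewness = m3 / m2^1.5, excess_kurtosis = m4 / m2^2 - 3)
}

x <- c(0.012, 0.092, 0.107, 0.026, 0, 0, 0, 0, 0, 0)
print(moment_coefs(x))            # positive skewness: long right tail

# A symmetric data set has zero skewness.
print(moment_coefs(c(1, 2, 3, 4, 5)))
```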



### Correlation coefficient 

sbp = data$sbp


cor(age, sbp)
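The value returned by cor() is the Pearson correlation coefficient, which can be reproduced from its definition. The sketch below uses hypothetical age and sbp values with one missing reading, and shows the use = "complete.obs" argument that drops incomplete pairs; without it, cor() would return NA here.

```r
# Hypothetical ages and systolic blood pressures (one sbp missing).
age <- c(34, 45, 29, 52, 61, 38, 47)
sbp <- c(118, 130, 115, 142, 150, NA, 133)

# Built-in Pearson correlation, using only complete pairs.
r_builtin <- cor(age, sbp, use = "complete.obs")

# The same value from the definition.
ok <- complete.cases(age, sbp)
a <- age[ok]
s <- sbp[ok]
r_manual <- sum((a - mean(a)) * (s - mean(s))) /
  sqrt(sum((a - mean(a))^2) * sum((s - mean(s))^2))

print(c(r_builtin = r_builtin, r_manual = r_manual))
```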