POST GRADUATE DIPLOMA IN AGRICULTURAL AND RURAL MANAGEMENT WITH STATISTICAL METHODS AND ANALYTICS

Introduction to R software:

R was begun in the 1990s by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland. Nearly 20 senior statisticians make up the core development group of the R language, including John Chambers of Bell Labs, the primary developer of the original S language. R contains a large, coherent, integrated collection of intermediate tools for data analysis. The software is free and can be downloaded from http://www.r-project.org/. Its versatility lies in the fact that it can be used in every domain, and its coding structure is easier to pick up than that of many other packages. Moreover, R is not only for coding: you can also prepare a manuscript or a presentation in it with the help of R Markdown.

The major strength of the software is its packages. Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. At the time of writing, the CRAN package repository (https://cran.r-project.org/web/packages/) features 10964 available packages. To install a package, type install.packages(pkgs, dependencies=TRUE) in the console, where pkgs is the quoted name of the package.
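As a small illustration of installing and loading packages, the sketch below installs a package only when it is missing. The helper name ensure_package is my own and is not part of R; I use the built-in "stats" package so that nothing is actually downloaded. For a CRAN package you may first be asked to choose a mirror.

```r
# Hypothetical helper (not part of base R): install a package only when it is missing.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE)
  }
  # Return TRUE (invisibly) if the package is now available.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ensure_package("stats")  # "stats" ships with R, so no download happens here
library(stats)           # attach the package for the current session
.libPaths()              # the library directories where packages are stored
```

In your own work, replace "stats" by the name of the package you need.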

I have used this software in most of my research work. If you run into any difficulty, do not hesitate to contact me; my contact details are provided on my home page. This page will be updated as per your needs.

References:

An Introduction to R, Longhow Lam. (Link for the material: https://cran.r-project.org/doc/contrib/Lam-IntroductionToR_LHL.pdf)
Applied Statistical Inference: Likelihood and Bayes, Leonhard Held and Daniel Sabanés Bové, Springer-Verlag Berlin, 2014.
The R Student Companion, Brian Dennis, CRC Press, 2013.
An Introduction to Statistical Learning with Applications in R, James, Witten, Hastie and Tibshirani, Springer Texts in Statistics, 2013.
Using R for Numerical Analysis in Science and Engineering, Victor A. Bloomfield, CRC Press, Taylor and Francis Group, 2014.
A Primer of Ecology with R, M. Henry H. Stevens, Springer, 2009.
Statistical Modeling: The Two Cultures, Leo Breiman, Statistical Science, 2001, Vol. 16, No. 3, 199-231.
The Art of R Programming, Norman Matloff.
An R Companion for the Handbook of Biological Statistics, Salvatore S. Mangiafico: https://rcompanion.org/documents/RCompanionBioStatistics.pdf

Introduction to R Programming:

20th February 2021 (12.30 p.m. - 2.00 p.m.)

Before diving into the statistical analysis of the anthropometric data, we should become a little familiar with the R software. The following lectures will help you build a basic understanding of R code.

# Calculation

5+9

5-9

5*9

5/9

# Data and Data Vectors

grade = 90

grade

### A vector is created with c()

grade = c(95, 85, 100, 90, 100) ## Vector Creation

grade

length(grade)

name = c("Liili", "Priya", "John", "Bob", "Julia")

name

## CTRL + ENTER is the keyboard shortcut to execute the code.

## Data frame

mydata = data.frame(grade, name)

mydata

## Functions

dim(mydata)

nrow(mydata) ## nrow = number of row

ncol(mydata) ## ncol = number of column

str(mydata) ## str stands for the structure of my data set

summary(mydata) ## summary provides the basic statistical information

temp = mydata$grade

mean(temp)

sd(mydata$grade) ### sd stands for the standard deviation

var(temp)
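To see what var() and sd() compute, here is a minimal sketch that works out the sample variance by hand for the same grade vector. The intermediate names (grade_mean, squared_dev and so on) are my own; note that R divides by n - 1, not n.

```r
grade <- c(95, 85, 100, 90, 100)

n <- length(grade)
grade_mean <- sum(grade) / n               # 470 / 5 = 94
squared_dev <- (grade - grade_mean)^2      # squared deviations from the mean
grade_var <- sum(squared_dev) / (n - 1)    # 170 / 4 = 42.5
grade_sd <- sqrt(grade_var)                # standard deviation = sqrt(variance)

all.equal(grade_var, var(grade))           # agrees with the built-in var()
all.equal(grade_sd, sd(grade))             # agrees with the built-in sd()
```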

# Directory and Files

getwd() ## Command for getting the working directory

setwd("C:/Users/user/Documents") ### I want to set the working directory

write.csv(mydata, "myfile.csv") ## write.csv exports the data set to a file

## csv = comma separated values / comma delimited values

data = read.csv("myfile.csv", header = T)

View(data) ## Note the capital V in View

str(data)
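One detail worth knowing: by default write.csv also writes the row names, so the file read back has one extra column. A minimal sketch of a round trip without that column is given below. I write to a temporary file (tempfile()) instead of the working directory, and the name data_back is only for illustration.

```r
grade <- c(95, 85, 100, 90, 100)
name <- c("Liili", "Priya", "John", "Bob", "Julia")
mydata <- data.frame(grade, name)

path <- tempfile(fileext = ".csv")            # temporary file, assumed for this sketch
write.csv(mydata, path, row.names = FALSE)    # do not write the row names
data_back <- read.csv(path, header = TRUE)    # read the same file back

names(data_back)   # the same two columns as mydata
str(data_back)
```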

## Data with R

data()

data(iris)

str(iris)

summary(iris)

head(iris,2) ### checking the initial portion of the dataset

tail(iris) ## Checking the last six rows of the data set

iris[1:3, 1:2] ### iris[Row Number,Column Number]

iris[1:3, ]
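The same [Row, Column] indexing also accepts column names and logical conditions. The following is a short sketch on the built-in iris data; the object names first_rows, setosa and mean_by_species are my own.

```r
data(iris)

# Rows 1 to 3, columns selected by name instead of by number
first_rows <- iris[1:3, c("Sepal.Length", "Species")]

# Rows selected by a logical condition: only the setosa flowers
setosa <- iris[iris$Species == "setosa", ]
nrow(setosa)

# Mean sepal length within each species
mean_by_species <- tapply(iris$Sepal.Length, iris$Species, mean)
mean_by_species
```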

Second Class on the R software

19th February 2021 (3.30 p.m. - 4.30 p.m.)

In today's class we will learn how to handle a real-life data set, and then go through the procedure of statistical analysis on it. So, before the class begins, please download this Data. First I will demonstrate sampling theory through the R software. Note that sampling theory is covered in the lecture of Prof. Bhattacharya.

############# Data

Marks = rnorm(1000, 65, 10)

### rnorm draws a random sample from the normal distribution

## (here 1000 values with mean 65 and standard deviation 10)

Marks = round(Marks, 0)

Marks

#######graphical representation

plot(Marks, xlab="students no.", ylab="Marks")

######frequency distribution

par(mfrow=c(2, 2))

hist(Marks, probability = F)

## For a density-scale histogram with a smoothed curve, use instead:

#hist(Marks, probability = T)

#lines(density(Marks, kernel="gaussian"))

##########sample of size 100

sample100=sample(Marks, 100)

sample100

######sample frequency distribution

hist(sample100, col="red")

##########sample of size 500

sample500=sample(Marks, 500)

sample500

######sample frequency distribution

hist(sample500, col="blue")

##########100 samples, each of size 100

samplecomb=matrix(0, 100, 100)

## matrix command is used to create the matrix in R Software.

for (i in 1:100)

{

samplecomb[i, ]=sample(Marks, 100)

}

samplecombrwmean=rowMeans(samplecomb[, 1:100])

hist(samplecombrwmean, col="green")
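The histogram of the 100 sample means is much narrower than the histogram of Marks. As a rough check of the theory, the sketch below compares the spread of the sample means with the standard error for simple random sampling without replacement, sd/sqrt(n) times sqrt(1 - n/N). The seed and the names sample_means and theoretical_se are my own assumptions; replicate() is used here in place of the for loop.

```r
set.seed(1)                                   # assumed seed, only for reproducibility
Marks <- round(rnorm(1000, 65, 10), 0)        # population of N = 1000 marks

N <- length(Marks)
n <- 100

# Mean of each of 100 samples of size 100 (sample() draws without replacement)
sample_means <- replicate(100, mean(sample(Marks, n)))

mean(Marks)          # population mean
mean(sample_means)   # average of the sample means, close to the population mean

# Standard error under sampling without replacement
theoretical_se <- sd(Marks) / sqrt(n) * sqrt(1 - n / N)
theoretical_se
sd(sample_means)     # observed spread of the sample means
```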

############# Data

Marksmsc1 = rnorm(1000, 65, 10)

Marksmsc1 = round(Marksmsc1, 0)

############# Data

set.seed(897)

Marksmsc2 = rnorm(1000, 71, 10) + rnorm(1000, 0, 2)

Marksmsc2 = round(Marksmsc2, 0)

#######graphical representation

plot(Marksmsc1, Marksmsc2)

Marksmscdata=data.frame(Marksmsc1, Marksmsc2)

cormsc12=cor(Marksmsc1, Marksmsc2)

regmsc12=lm(Marksmsc2~Marksmsc1, Marksmscdata)

abline(regmsc12, col = "red")
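The correlation coefficient and the regression slope are linked: in simple linear regression the slope equals r times sd(y)/sd(x). A minimal sketch of this check follows. Here I set the seed before both vectors (an assumption, so the numbers will differ from the class run), and slope_from_cor is my own name. Since the two mark vectors are simulated independently, expect r to be close to zero.

```r
set.seed(897)   # assumed: seed set before both vectors
Marksmsc1 <- round(rnorm(1000, 65, 10), 0)
Marksmsc2 <- round(rnorm(1000, 71, 10) + rnorm(1000, 0, 2), 0)

fit <- lm(Marksmsc2 ~ Marksmsc1)

# Slope computed from the correlation coefficient
slope_from_cor <- cor(Marksmsc1, Marksmsc2) * sd(Marksmsc2) / sd(Marksmsc1)

slope_from_cor
coef(fit)[["Marksmsc1"]]   # slope reported by lm(); the two values agree
```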

############ Ayan's Lecture #############

mydata = read.csv(file.choose(), header = T)

head(mydata)

str(mydata)

## Using Packages

library(psych)

pairs.panels(mydata)

height = mydata$ht

weight = mydata$wt

plot(height, weight)

cor.test(height, weight)

barplot(height)

hist(height, probability = T)

hist(weight, probability = T)

boxplot(height)

boxplot(weight)

### Regression ####

fit.linear = lm(weight~height)

summary(fit.linear)

plot(height, weight)

lines(height, predict(fit.linear), col = "red", type = "l")
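Once the model is fitted, predict() can also give the expected weight for new heights. The sketch below uses simulated stand-in values for height and weight, because the class data file is not reproduced here; the seed, the simulated numbers and the name new_heights are all assumptions for illustration.

```r
set.seed(42)                                        # assumed seed
height <- rnorm(50, 160, 8)                         # hypothetical heights
weight <- -60 + 0.7 * height + rnorm(50, 0, 5)      # hypothetical weights

fit.linear <- lm(weight ~ height)

# New heights must be supplied in a data frame with the same column name
new_heights <- data.frame(height = c(150, 160, 170))
predicted <- predict(fit.linear, newdata = new_heights)
predicted

plot(height, weight)
abline(fit.linear, col = "red")   # another way to draw the fitted line
```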

Third Class on the R software

27th February 2021 (12.30 - 2.30 p.m.)

In today's class we will learn how to handle a real-life data set, and then go through the procedure of statistical analysis on it. So, before the class begins, please download this Data and Paired Data. The study material on hypothesis testing is available at the link Testing of Hypothesis.

data = read.csv(file.choose(), header = T)

head(data)

### Test of Normality ##

shapiro.test(data$ht)

### One sample t test

one.sample.test = t.test(x = data$bmi, mu = 17, alternative = "two.sided")

if(one.sample.test$p.value < 0.05)

print("The null hypothesis is rejected at the 5% level of significance")
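To see where the p-value comes from, the one sample t statistic can be computed by hand as (sample mean - mu) / (sd / sqrt(n)) and compared with the output of t.test. This is only a sketch on simulated BMI values; the seed, the simulated data and the names t_manual and p_manual are my assumptions.

```r
set.seed(123)                           # assumed seed
bmi <- rnorm(30, mean = 18, sd = 2)     # hypothetical BMI values
mu0 <- 17                               # value under the null hypothesis

n <- length(bmi)
t_manual <- (mean(bmi) - mu0) / (sd(bmi) / sqrt(n))
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)   # two-sided p-value

res <- t.test(x = bmi, mu = mu0, alternative = "two.sided")

t_manual; res$statistic   # same t value
p_manual; res$p.value     # same p-value
```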

##### Paired t test

paired.data = read.csv(file.choose(), header = T)

View(paired.data)

x = paired.data$Before.Diet; x

y = paired.data$After.Diet; y

t.test(x, y, paired = T)
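A paired t test is the same as a one sample t test on the differences. The sketch below shows this on simulated before and after values; the data, the seed and the object names are assumptions for illustration only.

```r
set.seed(7)                               # assumed seed
before <- rnorm(20, 70, 8)                # hypothetical values before the diet
after <- before - rnorm(20, 2, 3)         # hypothetical values after the diet

paired <- t.test(before, after, paired = TRUE)
one_sample <- t.test(before - after, mu = 0)   # test on the differences

paired$statistic; one_sample$statistic    # identical t values
paired$p.value; one_sample$p.value        # identical p-values
```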

##### Two sample t test #####

systolic_bp = data$sbp

diastolic_bp = data$dbp

t.test(systolic_bp,diastolic_bp, alternative = "two.sided")

## Variance Test

var.test(systolic_bp,diastolic_bp, alternative = "two.sided")
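The F statistic of var.test is simply the ratio of the two sample variances. Note also that t.test by default does not assume equal variances (Welch test); var.equal = TRUE gives the pooled version. A short sketch on simulated blood pressure values follows; the data, the seed and the object names are my assumptions.

```r
set.seed(11)                               # assumed seed
systolic_bp <- rnorm(40, 120, 12)          # hypothetical systolic values
diastolic_bp <- rnorm(40, 80, 8)           # hypothetical diastolic values

f_res <- var.test(systolic_bp, diastolic_bp, alternative = "two.sided")
f_manual <- var(systolic_bp) / var(diastolic_bp)
f_manual; f_res$statistic                  # same F value

welch <- t.test(systolic_bp, diastolic_bp)                      # default: unequal variances
pooled <- t.test(systolic_bp, diastolic_bp, var.equal = TRUE)   # pooled variance

welch$parameter    # Welch degrees of freedom (at most n1 + n2 - 2)
pooled$parameter   # n1 + n2 - 2 = 78
```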