Functions in R - Built-in Functions
In R, functions are a fundamental building block: they let you encapsulate logic, perform specific tasks, and reuse code. R ships with a wide range of built-in functions for basic operations such as mathematical calculations, statistical analysis, and data manipulation.
Let's explore some of the most commonly used built-in functions in R, organized by category:
These functions perform mathematical operations on numeric data.
sum(): Calculates the sum of elements in a vector or matrix.
R
x <- c(1, 2, 3, 4)
sum(x) # Output: 10
mean(): Calculates the mean (average) of a numeric vector.
R
mean(x) # Output: 2.5
median(): Finds the median value of a numeric vector.
R
median(x) # Output: 2.5
sd(): Computes the standard deviation of a numeric vector.
R
sd(x) # Output: 1.290994
var(): Computes the variance of a numeric vector.
R
var(x) # Output: 1.666667
abs(): Returns the absolute value of each element in a vector.
R
abs(c(-3, -4, 5)) # Output: 3 4 5
sqrt(): Calculates the square root of a number.
R
sqrt(16) # Output: 4
log(): Computes the natural logarithm.
R
log(10) # Output: 2.302585
exp(): Computes the exponential of a number.
R
exp(1) # Output: 2.718282
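As a quick check of how these functions relate to each other, the sketch below reuses the sample vector from the examples above. The vector `x_missing` is a hypothetical addition used only to show how a missing value affects `sum()` and `mean()`.

```r
# Sample vector, reusing the values from the examples above
x <- c(1, 2, 3, 4)

# The standard deviation is the square root of the variance
sd_from_var <- sqrt(var(x))
all.equal(sd_from_var, sd(x))   # TRUE

# log() and exp() are inverses of each other
all.equal(log(exp(1)), 1)       # TRUE

# A missing value propagates unless na.rm = TRUE is set
x_missing <- c(x, NA)           # hypothetical vector with one NA
sum(x_missing)                  # NA
sum(x_missing, na.rm = TRUE)    # 10
mean(x_missing, na.rm = TRUE)   # 2.5
```

Setting `na.rm = TRUE` drops missing values before the calculation; the same argument is accepted by `median()`, `sd()`, and `var()`.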
R provides a variety of functions for statistical analysis and hypothesis testing.
range(): Finds the minimum and maximum values in a vector.
R
range(x) # Output: 1 4
quantile(): Computes specified quantiles of a numeric vector.
R
quantile(x, 0.25) # Output: 1.75 (1st quartile)
cor(): Computes the correlation coefficient between two variables.
R
y <- c(2, 4, 6, 8)
cor(x, y) # Output: 1 (perfect positive correlation)
cov(): Computes the covariance between two variables.
R
cov(x, y) # Output: 3.333333
t.test(): Performs a t-test to compare the means of two groups.
R
t.test(x, y)
lm(): Fits a linear model to the data.
R
model <- lm(y ~ x)
summary(model) # Shows regression summary
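The objects returned by these functions can also be inspected piece by piece. The following is a minimal sketch using the same `x` and `y` values as above; the names `cov_xy` and `test_result` are introduced here only for illustration.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)

# Covariance equals the correlation scaled by both standard deviations
cov_xy <- cov(x, y)
all.equal(cov_xy, cor(x, y) * sd(x) * sd(y))   # TRUE

# t.test() returns a list-like object; single results can be extracted
test_result <- t.test(x, y)
test_result$p.value    # p-value of the (default) Welch two-sample t-test

# coef() extracts the fitted intercept and slope from a linear model
model <- lm(y ~ x)
coef(model)            # intercept close to 0, slope close to 2
```

Because `y` is exactly `2 * x` here, the fit is perfect, and `summary(model)` may warn that the summary is unreliable. Real data with some noise would not trigger that warning.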
These functions are useful for manipulating data structures like vectors, matrices, and data frames.
length(): Returns the number of elements in an object (vector, list, etc.).
R
length(x) # Output: 4
dim(): Returns the dimensions (rows and columns) of an object like a matrix or data frame.
R
dim(my_matrix) # Output: 2 3 (2 rows, 3 columns)
nrow(): Returns the number of rows in a matrix or data frame.
R
nrow(my_data) # Output: 4
ncol(): Returns the number of columns in a matrix or data frame.
R
ncol(my_data) # Output: 4
subset(): Subsets a data frame based on logical conditions.
R
subset(my_data, Age > 30)
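order(): Returns the indices that would sort a vector. Use x[order(x)] or sort(x) to get the sorted values, and my_data[order(my_data$Age), ] to sort the rows of a data frame.
R
order(x) # Output: 1 2 3 4 (indices that put x in ascending order)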
merge(): Merges two data frames by common columns or row names.
R
merge(df1, df2, by = "ID")
rbind(): Combines two data frames or matrices by adding rows.
R
rbind(df1, df2)
cbind(): Combines two data frames or matrices by adding columns.
R
cbind(df1, df2)
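The examples above refer to `my_matrix`, `my_data`, `df1`, and `df2` without defining them. The sketch below builds small hypothetical versions of these objects (all column names and values are made up for illustration) so that the calls can be run end to end.

```r
# Hypothetical objects standing in for my_matrix, my_data, df1 and df2
my_matrix <- matrix(1:6, nrow = 2, ncol = 3)
dim(my_matrix)                         # 2 3

my_data <- data.frame(
  ID   = 1:4,
  Name = c("Ana", "Ben", "Cara", "Dev"),
  Age  = c(28, 35, 42, 31),
  City = c("Oslo", "Lima", "Pune", "Kyiv")
)
nrow(my_data)                          # 4
ncol(my_data)                          # 4

older <- subset(my_data, Age > 30)     # keeps Ben, Cara and Dev

# order() returns positions; index with them to get a sorted data frame
by_age <- my_data[order(my_data$Age), ]

df1 <- data.frame(ID = 1:3, Score = c(90, 85, 70))
df2 <- data.frame(ID = 2:4, Grade = c("B", "C", "A"))
merged <- merge(df1, df2, by = "ID")   # inner join: only IDs 2 and 3 match

# rbind() needs matching column names; cbind() needs matching row counts
df3 <- data.frame(ID = 4:5, Score = c(60, 95))
stacked <- rbind(df1, df3)             # 5 rows, 2 columns
side_by_side <- cbind(df1, df2)        # 3 rows, 4 columns (ID appears twice)
```

Note that `cbind()` simply places the columns next to each other without matching rows by `ID`; when rows must be aligned on a key, `merge()` is usually the safer choice.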
These functions are used to manipulate and process strings (text data).
nchar(): Returns the number of characters in a string.
R
nchar("Hello") # Output: 5
substr(): Extracts a substring from a string.
R
substr("Hello", 1, 3) # Output: "Hel"
toupper(): Converts a string to uppercase.
R
toupper("hello") # Output: "HELLO"
tolower(): Converts a string to lowercase.
R
tolower("HELLO") # Output: "hello"
paste(): Concatenates strings together.
R
paste("Hello", "World") # Output: "Hello World"
gsub(): Replaces occurrences of a pattern in a string.
R
gsub("o", "0", "Hello") # Output: "Hell0"
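These string functions are often combined. As a minimal sketch, the example below capitalizes the first letter of a hypothetical input string (`raw_name` is an illustrative name, not part of the examples above) and contrasts `gsub()` with its sibling `sub()`.

```r
# Hypothetical input string for illustration
raw_name <- "hello world"

# Capitalize the first letter by combining substr(), toupper() and paste()
first <- toupper(substr(raw_name, 1, 1))
rest  <- substr(raw_name, 2, nchar(raw_name))
capitalized <- paste(first, rest, sep = "")   # "Hello world"

# gsub() replaces every match; sub() replaces only the first one
all_replaced   <- gsub("o", "0", raw_name)    # "hell0 w0rld"
first_replaced <- sub("o", "0", raw_name)     # "hell0 world"
```

By default `paste()` separates its arguments with a space, which is why `sep = ""` is set here to join the two pieces directly.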
These functions return logical (TRUE or FALSE) results based on conditions.
is.na(): Checks if elements are NA (missing values).
R
is.na(c(1, 2, NA)) # Output: FALSE FALSE TRUE
any(): Checks if any element of a vector is TRUE.
R
any(c(TRUE, FALSE, FALSE)) # Output: TRUE
all(): Checks if all elements of a vector are TRUE.
R
all(c(TRUE, TRUE, TRUE)) # Output: TRUE
which(): Returns the indices of elements that are TRUE.
R
which(c(TRUE, FALSE, TRUE)) # Output: 1 3
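A common use of these functions is locating missing values. The sketch below works on a hypothetical vector `values` and shows how the four functions fit together.

```r
# Hypothetical vector with two missing entries
values <- c(4, NA, 7, NA, 10)

is_missing <- is.na(values)   # FALSE TRUE FALSE TRUE FALSE
any(is_missing)               # TRUE: at least one value is missing
all(is_missing)               # FALSE: not every value is missing
which(is_missing)             # 2 4: positions of the missing values

# Logical vectors can also be used directly for indexing
complete_values <- values[!is_missing]   # 4 7 10
```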
R has several functions for working with dates and times.
Sys.Date(): Returns the current date.
R
Sys.Date() # Output: e.g. "2024-12-19" (the date on which the code is run)
as.Date(): Converts a character string to a Date object.
R
as.Date("2024-12-19") # Output: "2024-12-19"
difftime(): Computes the difference between two dates or times.
R
difftime(Sys.Date(), as.Date("2024-01-01"))
format(): Formats a date or time object.
R
format(Sys.Date(), "%B %d, %Y") # Output: e.g. "December 19, 2024" (depends on the current date and locale)
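Because `Sys.Date()` changes from day to day, examples built on it are not reproducible. The following sketch uses two fixed dates instead (`start_date` and `end_date` are illustrative names) so that the results are stable.

```r
# Fixed dates, chosen only for illustration
start_date <- as.Date("2024-01-01")
end_date   <- as.Date("2024-12-19")

# Ask difftime() for a specific unit and convert the result to a number
elapsed <- difftime(end_date, start_date, units = "days")
as.numeric(elapsed)                 # 353

# Numeric format codes do not depend on the locale
format(end_date, "%d/%m/%Y")        # "19/12/2024"

# A format string tells as.Date() how to parse non-ISO input
parsed <- as.Date("19/12/2024", format = "%d/%m/%Y")
parsed == end_date                  # TRUE
```

Month names produced by `%B` follow the system locale, so numeric codes such as `%m` are a safer choice when output must be identical across machines.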
These functions help you inspect R objects and their structure.
str(): Displays the structure of an R object (e.g., data frame, list).
R
str(my_data)
summary(): Provides summary statistics for an object (e.g., mean, min, max for numeric vectors).
R
summary(my_data)
class(): Returns the class of an object.
R
class(my_data) # Output: "data.frame"
dim(): Returns the dimensions of an object (e.g., rows and columns for a matrix or data frame).
R
dim(my_data)
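To see these inspection functions on a concrete object, the sketch below defines a small hypothetical `my_data` data frame (the columns and values are made up for illustration).

```r
# Hypothetical data frame standing in for my_data
my_data <- data.frame(
  Name = c("Ana", "Ben", "Cara"),
  Age  = c(28, 35, 42)
)

class(my_data)        # "data.frame"
dim(my_data)          # 3 2
str(my_data)          # prints each column's type and first values
summary(my_data$Age)  # minimum, quartiles, median, mean and maximum of Age

# class() works on individual columns as well
class(my_data$Age)    # "numeric"
```

Calling `str()` first is a quick way to confirm column types before running any analysis on a data frame.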
R provides a rich set of built-in functions for various tasks, including mathematical calculations, data manipulation, string processing, statistical analysis, and working with dates and times. These functions can help you perform most of the common tasks involved in data analysis and programming in R.