Data Types & Data Structures in R: Vectors
In R, understanding data types and data structures is crucial for efficient programming and data analysis. Vectors are one of the most fundamental data structures in R, and they are used to store sequences of data elements.
A vector in R is an ordered collection of elements that all share the same data type (numeric, character, logical, etc.). Even a single value such as 5 is a vector of length one.
Types of Vectors in R:
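Numeric Vectors: Contain numbers. By default these are stored as double-precision floating-point values; integers are a separate storage type, written with an L suffix (e.g., 5L).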
Character Vectors: Contain text (strings).
Logical Vectors: Contain Boolean values (TRUE or FALSE).
Complex Vectors: Contain complex numbers (numbers with a real and imaginary part).
Raw Vectors: Contain raw bytes, but are used less frequently.
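The types listed above can be inspected with typeof(). The following is a minimal sketch; the variable names are illustrative and not part of the original text, and c() (described next) is used only to build the sample vectors.

```r
# Illustrative values; all variable names are assumptions for this sketch.
dbl_vec <- c(1, 2.5, 3)       # plain numbers are stored as double
int_vec <- c(1L, 2L, 3L)      # the L suffix requests integer storage
chr_vec <- c("a", "b")
lgl_vec <- c(TRUE, FALSE)
cpl_vec <- c(1 + 2i, 3 - 1i)
raw_vec <- as.raw(c(1, 255))

typeof(dbl_vec)  # "double"
typeof(int_vec)  # "integer"
typeof(chr_vec)  # "character"
typeof(lgl_vec)  # "logical"
typeof(cpl_vec)  # "complex"
typeof(raw_vec)  # "raw"

# Both double and integer vectors count as numeric
is.numeric(dbl_vec)  # TRUE
is.numeric(int_vec)  # TRUE
```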
You can create a vector using the c() (combine) function in R, which combines individual elements into a single vector.
Examples:
Numeric Vector:
R
numeric_vector <- c(1, 2, 3, 4, 5)
print(numeric_vector)
Character Vector:
R
char_vector <- c("apple", "banana", "cherry")
print(char_vector)
Logical Vector:
R
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)
print(logical_vector)
Complex Vector:
R
complex_vector <- c(1 + 2i, 3 + 4i, 5 + 6i)
print(complex_vector)
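Because every element must share one type, c() converts mixed inputs to a single common type. The sketch below illustrates this coercion with made-up values; the variable names are hypothetical.

```r
# Mixing types inside c() coerces everything to one common type.
mixed_num_chr <- c(1, "two", 3)      # numbers become strings
mixed_lgl_num <- c(TRUE, FALSE, 2)   # TRUE/FALSE become 1/0

typeof(mixed_num_chr)  # "character"
print(mixed_num_chr)   # "1" "two" "3"
typeof(mixed_lgl_num)  # "double"
print(mixed_lgl_num)   # 1 0 2
```

This is worth keeping in mind when a single stray string turns a column of numbers into text.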
You can access the elements of a vector using indexing. In R, indexing starts from 1 (unlike some programming languages where indexing starts from 0).
Examples:
Accessing an element by index:
R
print(numeric_vector[3]) # Accesses the 3rd element (which is 3)
Accessing multiple elements (using a vector of indices):
R
print(numeric_vector[c(1, 4)]) # Accesses the 1st and 4th elements (1, 4)
Accessing elements with logical vectors:
R
print(numeric_vector[c(TRUE, FALSE, TRUE, FALSE, TRUE)]) # Keeps the elements where the index is TRUE (1, 3, 5)
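A few other indexing forms are common in practice: negative indices, ranges, and logical conditions. The following sketch uses a hypothetical vector named scores to show each one.

```r
scores <- c(10, 20, 30, 40, 50)  # hypothetical data for this sketch

scores[-1]            # drops the 1st element: 20 30 40 50
scores[-c(1, 2)]      # drops the 1st and 2nd elements: 30 40 50
scores[2:4]           # a range of positions: 20 30 40
scores[scores > 25]   # keeps elements where the condition is TRUE: 30 40 50
which(scores > 25)    # positions where the condition is TRUE: 3 4 5
```

Note that a negative index in R removes elements; it does not count from the end as in some other languages.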
R allows for vectorized operations, meaning operations can be applied directly to vectors without needing loops.
Examples:
Arithmetic Operations:
R
# Adding 2 to each element of the vector
numeric_vector + 2
Multiplying Each Element by a Constant:
R
numeric_vector * 3 # Multiplies each element by 3 (3, 6, 9, 12, 15)
Vector Concatenation:
R
combined_vector <- c(numeric_vector, c(6, 7, 8))
print(combined_vector)
Sum of All Elements:
R
sum(numeric_vector) # Sum of all elements in the vector
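To make the "no loops needed" point concrete, the sketch below computes the same result twice, once with an explicit loop and once with a vectorized expression. The names values, looped, and vectorized are illustrative only.

```r
values <- c(1, 2, 3, 4, 5)  # hypothetical data

# Loop version: preallocate the result, then fill it one element at a time
looped <- numeric(length(values))
for (i in seq_along(values)) {
  looped[i] <- values[i] + 2
}

# Vectorized version: a single expression
vectorized <- values + 2

identical(looped, vectorized)  # TRUE
```

The vectorized form is shorter and is generally the idiomatic choice in R.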
You can check the number of elements in a vector using the length() function.
Example:
R
length(numeric_vector) # Returns the length of the vector (5)
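When a vector's length is used to drive a loop, seq_along() is usually a safer choice than 1:length(x), because it handles empty vectors correctly. A small sketch with hypothetical vectors:

```r
items <- c("a", "b", "c")
empty <- character(0)

length(items)      # 3
length(empty)      # 0

seq_along(items)   # 1 2 3
seq_along(empty)   # integer(0), so a for loop over it runs zero times
1:length(empty)    # 1 0, which would make a loop run twice by mistake
```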
You can modify elements of a vector by directly referencing the index and assigning a new value.
Examples:
Changing a Single Element:
R
numeric_vector[2] <- 10 # Changes the second element to 10
print(numeric_vector)
Adding Elements:
R
numeric_vector <- c(numeric_vector, 6) # Appends 6 to the end of the vector
print(numeric_vector)
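Index assignment also works on several positions at once, and base R's append() can insert at a chosen position. The sketch below uses a hypothetical vector v to show these cases, along with removing an element.

```r
v <- c(1, 2, 3)                # hypothetical data

v[c(1, 3)] <- c(10, 30)        # replace several elements at once
print(v)                       # 10 2 30

v[5] <- 50                     # assigning past the end pads the gap with NA
print(v)                       # 10 2 30 NA 50

v <- append(v, 99, after = 1)  # insert 99 after the 1st position
print(v)                       # 10 99 2 30 NA 50

v <- v[-2]                     # remove the 2nd element
print(v)                       # 10 2 30 NA 50
```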
Beyond single-vector arithmetic, vectors support element-wise math between two vectors, logical comparisons, and sorting.
1. Mathematical Operations:
You can perform arithmetic operations on vectors directly, and they will be applied element-wise.
R
vec1 <- c(1, 2, 3)
vec2 <- c(4, 5, 6)
vec1 + vec2 # Adds the corresponding elements (1+4, 2+5, 3+6), giving 5 7 9
vec1 * vec2 # Multiplies the corresponding elements (1*4, 2*5, 3*6), giving 4 10 18
2. Logical Operations:
Comparison operators such as >, <, and == are applied element-wise and return a logical vector.
R
vec1 > 2 # Checks which elements are greater than 2 (FALSE FALSE TRUE)
3. Sorting:
You can sort a vector using the sort() function.
R
sorted_vec <- sort(c(5, 1, 3, 9, 7))
print(sorted_vec) # Sorted vector: 1, 3, 5, 7, 9
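Three related details are worth a short illustration: what happens when the two vectors differ in length (recycling), how logical results combine, and how to sort in descending order or obtain the sorting positions. The vectors below are hypothetical.

```r
long_vec  <- c(1, 2, 3, 4)
short_vec <- c(10, 20)

# The shorter vector is recycled: 1+10, 2+20, 3+10, 4+20
long_vec + short_vec               # 11 22 13 24

# Logical vectors can be combined element-wise with & and |
long_vec > 1 & long_vec < 4        # FALSE TRUE TRUE FALSE

unsorted <- c(5, 1, 3, 9, 7)
sort(unsorted, decreasing = TRUE)  # 9 7 5 3 1
order(unsorted)                    # 2 3 1 5 4 (positions that would sort the vector)
```

If the longer length is not a multiple of the shorter one, R still recycles but emits a warning.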
Many built-in functions take a whole vector as their argument, so common summary statistics need only a single call.
Examples:
Sum:
R
sum(numeric_vector)
Mean:
R
mean(numeric_vector)
Standard Deviation:
R
sd(numeric_vector)
Min and Max:
R
min(numeric_vector)
max(numeric_vector)
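It can help to distinguish functions that collapse a vector to a single value from those that return one result per element. A brief sketch with a hypothetical vector named measurements:

```r
measurements <- c(4, 9, 16, 25)   # hypothetical data

# Summary functions collapse the vector to a single value (or a pair, for range)
sum(measurements)      # 54
mean(measurements)     # 13.5
range(measurements)    # 4 25 (min and max together)

# Element-wise functions return a vector of the same length
sqrt(measurements)     # 2 3 4 5
cumsum(measurements)   # 4 13 29 54
```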
Vectors in R can contain missing values (NA), and you can handle them using functions like is.na().
Examples:
Checking for Missing Values:
R
vec_with_na <- c(1, 2, NA, 4, 5)
is.na(vec_with_na) # Returns TRUE for the missing value (NA)
Removing NA Values:
R
vec_without_na <- na.omit(vec_with_na) # Removes the NA value; the result also records the removed position in an "na.action" attribute
print(vec_without_na)
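Missing values also affect summary functions: a single NA makes sum() or mean() return NA unless na.rm = TRUE is passed. The sketch below uses a hypothetical vector named readings.

```r
readings <- c(1, 2, NA, 4, 5)   # hypothetical data with one missing value

sum(readings)                   # NA, because one element is unknown
sum(readings, na.rm = TRUE)     # 12
mean(readings, na.rm = TRUE)    # 3

sum(is.na(readings))            # 1 (number of missing values)
readings[!is.na(readings)]      # 1 2 4 5, a plain vector without extra attributes
```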
You can assign names to the elements of a vector for better clarity and access.
Example:
R
named_vector <- c(a = 1, b = 2, c = 3)
print(named_vector)
print(named_vector["a"]) # Access by name
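Names can also be attached after a vector has been created, using names(). The following sketch assumes a hypothetical prices vector and shows a few ways to read it back.

```r
prices <- c(2.5, 1.2, 4.0)                       # hypothetical data
names(prices) <- c("apple", "banana", "cherry")  # attach names afterwards

names(prices)                   # "apple" "banana" "cherry"
prices[c("apple", "cherry")]    # several elements by name
prices[["banana"]]              # 1.2, the bare value without its name
unname(prices)                  # 2.5 1.2 4.0, names dropped
```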
Summary:
Create: Combine elements with c()
Access: Use indices (e.g., vector[2])
Modify: Reassign values with indices (e.g., vector[1] <- 10)
Operations: Arithmetic and logical operations on vectors are element-wise.
Functions: Use functions like sum(), mean(), and sd() to summarize a whole vector in one call.