RRGS DEV - How to Calculate Conditional Probability in R?

Conditional Probability

Conditional probability is a measure of the likelihood of an event occurring, given that another event has already occurred. It allows us to update our beliefs about the probability of an event based on new information. Conditional probability is written as P(B | A), which is read as “the probability of event B given event A.” It can be calculated using the following formula:

P(B | A) = P(A and B) / P(A)

where:

P(B | A) is the conditional probability of event B occurring given event A has occurred.
P(A and B) is the joint probability of both events A and B occurring together.
P(A) is the probability of event A occurring, which must be greater than zero for the ratio to be defined.

To better illustrate the concept, let’s consider an example:

Suppose we have a deck of 52 playing cards. We know that there are 13 hearts and 12 face cards in the deck. Let’s calculate the conditional probability of drawing a face card, given that the card is a heart.

First, we need to determine the probability of drawing a heart (event A). There are 13 hearts in the deck, so: P(A) = 13 / 52 = 1/4
Next, we need to find the joint probability of drawing a card that is both a heart (event A) and a face card (event B). There are 3 face cards that are also hearts (the King, Queen, and Jack of hearts), so: P(A and B) = 3 / 52
Finally, we can calculate the conditional probability of drawing a face card given that the card is a heart (P(B | A)): P(B | A) = P(A and B) / P(A) = (3 / 52) / (1/4) = 3/13

So, the probability of drawing a face card given that the card is a heart is 3/13 or approximately 0.2308.
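The card calculation above can be checked in R by enumerating the deck. This is a minimal sketch: the object names (deck, is_heart, is_face) and the rank and suit labels are illustrative choices, not part of the original example.

```r
# Build a standard 52-card deck: 13 ranks crossed with 4 suits.
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("hearts", "diamonds", "clubs", "spades")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)

is_heart <- deck$suit == "hearts"            # event A
is_face  <- deck$rank %in% c("J", "Q", "K")  # event B

# The mean of a logical vector is the share of TRUE values,
# which equals the probability when every card is equally likely.
p_a       <- mean(is_heart)            # P(A)
p_a_and_b <- mean(is_heart & is_face)  # P(A and B)

p_b_given_a <- p_a_and_b / p_a         # P(B | A)
print(p_b_given_a)
```

The printed value should agree with the 3/13 obtained by hand.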

Calculate Conditional Probability in R

To calculate conditional probability in R, you can use the prop.table() function. Let’s assume you have a data frame with two variables (or columns) named A and B, and you want to find the conditional probability P(B | A). Here’s how to do it:

Create a contingency table (also known as a cross-tabulation or crosstab) using the table() function.
Convert the contingency table into a conditional probability table using the prop.table() function.

Here’s a step-by-step example:

# Sample data
data <- data.frame(
  A = c("a1", "a1", "a1", "a2", "a2", "a2"),
  B = c("b1", "b1", "b2", "b1", "b2", "b2")
)

# Create a contingency table
contingency_table <- table(data$A, data$B)

# Calculate the conditional probability table P(B | A)
conditional_probability_table <- prop.table(contingency_table, margin = 1)

# Print the conditional probability table
print(conditional_probability_table)

The conditional_probability_table variable will now contain the conditional probabilities P(B | A) for all combinations of A and B. The margin = 1 argument in the prop.table() function indicates that the probabilities should be calculated by dividing each cell by the row sums (i.e., the probabilities are conditioned on the first variable, A).
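To make the effect of margin = 1 concrete, the sketch below repeats the row normalization by hand and compares it with prop.table(). The sample data is re-created so the snippet runs on its own; the names manual and with_prop_table are illustrative.

```r
# Same sample data as in the example above
data <- data.frame(
  A = c("a1", "a1", "a1", "a2", "a2", "a2"),
  B = c("b1", "b1", "b2", "b1", "b2", "b2")
)
contingency_table <- table(data$A, data$B)

# Dividing by rowSums() recycles the row totals down each column,
# so every cell is divided by the total of its own row.
manual <- contingency_table / rowSums(contingency_table)
with_prop_table <- prop.table(contingency_table, margin = 1)

# The two approaches should give the same numbers
print(all.equal(manual, with_prop_table, check.attributes = FALSE))

# Each row of a table conditioned on A should sum to 1
print(rowSums(with_prop_table))
```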

If you want to find a specific conditional probability, like P(B=b1 | A=a1), you can access the corresponding cell in the conditional probability table:

probability_b1_given_a1 <- conditional_probability_table["a1", "b1"]
print(probability_b1_given_a1)

Remember to replace the sample data with your own dataset and variable names.
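If the conditioning runs the other way, that is P(A | B), the same approach should work with margin = 2, which divides each cell by its column total instead. The sketch below assumes the same sample data; naming the table dimensions A and B is an optional addition that only labels the printed output.

```r
# Same sample data as in the example above
data <- data.frame(
  A = c("a1", "a1", "a1", "a2", "a2", "a2"),
  B = c("b1", "b1", "b2", "b1", "b2", "b2")
)

# Naming the arguments labels the rows (A) and columns (B) in the output
contingency_table <- table(A = data$A, B = data$B)

# margin = 2 conditions on the column variable, giving P(A | B)
p_a_given_b <- prop.table(contingency_table, margin = 2)
print(p_a_given_b)

# P(A = a1 | B = b1): 2 of the 3 rows with B = b1 have A = a1
p_a1_given_b1 <- p_a_given_b["a1", "b1"]
print(p_a1_given_b1)
```

Note that P(A | B) and P(B | A) are in general different quantities, so the margin argument should match the variable being conditioned on.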

Example 2 – Cloudy Days

Let’s consider another example of calculating conditional probabilities using R. We’ll work with data related to the likelihood of rain given the presence of clouds.

First, let’s create a simple data frame with the information:

# Data frame with weather information
weather_data <- data.frame(
  Cloudy = c("Yes", "Yes", "No", "No"),
  Rain = c("Yes", "No", "Yes", "No"),
  Frequency = c(30, 20, 10, 40)
)

This table represents the frequency of different weather conditions in a particular region:

Cloudy  Rain  Frequency
Yes     Yes   30
Yes     No    20
No      Yes   10
No      No    40

Now, let’s calculate the conditional probability of rain given the presence of clouds (P(Rain | Cloudy)):

# Total frequency of cloudy days
total_cloudy <- sum(weather_data$Frequency[weather_data$Cloudy == "Yes"])

# Frequency of rainy days when it's cloudy
rainy_and_cloudy <- weather_data$Frequency[
  weather_data$Cloudy == "Yes" & weather_data$Rain == "Yes"
]

# Conditional probability of rain given clouds
P_rain_given_cloudy <- rainy_and_cloudy / total_cloudy
P_rain_given_cloudy

In this example, the total frequency of cloudy days is 50 (30 + 20), and the frequency of rainy days when it’s cloudy is 30. The conditional probability of rain given clouds is 30 / 50 = 0.6 or 60%.
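Because the data here is already aggregated into counts, an alternative is to build a weighted contingency table with xtabs() and then reuse prop.table() as in the first R example. This is a sketch of that alternative, not the method used above; the names weather_table and p_rain_table are illustrative.

```r
# Same weather data as above
weather_data <- data.frame(
  Cloudy = c("Yes", "Yes", "No", "No"),
  Rain = c("Yes", "No", "Yes", "No"),
  Frequency = c(30, 20, 10, 40)
)

# xtabs() sums Frequency within each Cloudy x Rain combination
# (rows = Cloudy, columns = Rain)
weather_table <- xtabs(Frequency ~ Cloudy + Rain, data = weather_data)

# Condition on the row variable (Cloudy) to get P(Rain | Cloudy)
p_rain_table <- prop.table(weather_table, margin = 1)
print(p_rain_table)

# P(Rain = Yes | Cloudy = Yes)
print(p_rain_table["Yes", "Yes"])
```

The table form also gives P(Rain | not Cloudy) in the same step, which the subsetting approach would need a second calculation for.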

Example 3 – Student Information

Let’s consider another example using conditional probabilities in R. This time, we’ll work with data on the likelihood of passing an exam given a student’s attendance level in a course.

First, let’s create a simple data frame with the information:

# Data frame with student information
student_data <- data.frame(
  Attendance = c("High", "High", "Low", "Low"),
  Pass = c("Yes", "No", "Yes", "No"),
  Frequency = c(80, 20, 30, 70)
)

This table represents the frequency of different student outcomes in a particular course:

Attendance  Pass  Frequency
High        Yes   80
High        No    20
Low         Yes   30
Low         No    70

Now, let’s calculate the conditional probability of passing the exam given high attendance (P(Pass | High Attendance)):

# Total frequency of students with high attendance
total_high_attendance <- sum(
  student_data$Frequency[student_data$Attendance == "High"]
)

# Frequency of students who pass the exam with high attendance
pass_and_high_attendance <- student_data$Frequency[
  student_data$Attendance == "High" & student_data$Pass == "Yes"
]

# Conditional probability of passing the exam given high attendance
P_pass_given_high_attendance <- pass_and_high_attendance / total_high_attendance
P_pass_given_high_attendance

In this example, the total frequency of students with high attendance is 100 (80 + 20), and the frequency of students who pass the exam with high attendance is 80. The conditional probability of passing the exam given high attendance is 80 / 100 = 0.8 or 80%.
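To compare attendance levels, the same calculation can be wrapped in a small function and applied to each level. The helper pass_rate() below is hypothetical, written only for this illustration, and it assumes the student_data layout shown above.

```r
# Same student data as above
student_data <- data.frame(
  Attendance = c("High", "High", "Low", "Low"),
  Pass = c("Yes", "No", "Yes", "No"),
  Frequency = c(80, 20, 30, 70)
)

# Hypothetical helper: P(Pass = "Yes" | Attendance = level)
pass_rate <- function(level) {
  in_level <- student_data$Attendance == level
  passed <- student_data$Pass == "Yes"
  sum(student_data$Frequency[in_level & passed]) /
    sum(student_data$Frequency[in_level])
}

# Apply the helper to both attendance levels
rates <- sapply(c(High = "High", Low = "Low"), pass_rate)
print(rates)
```

For the low-attendance group the data gives 30 passes out of 100 students, so the two conditional probabilities can be read side by side.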


@2024 RRGS Inc. All rights reserved.