Chess Breakdown
The code is listed below but take a look at some of the visuals I have included in the markdown:
The code is listed below but take a look at some of the visuals I have included in the markdown:
devtools::install_github("JaseZiv/chessR")
library(chessR)
library(dplyr)
library(ggplot2)
library(stringr)
library(lubridate)
```
# Pulling my data with the chessR package, username: rainybeater123
```{r}
rainybeater <- get_raw_chessdotcom(usernames = "rainybeater123")
rainybeater <- rainybeater %>%
mutate(
result = case_when(
White == "rainybeater123" & Result == "1-0" ~ "Win",
Black == "rainybeater123" & Result == "0-1" ~ "Win",
Result == "1/2-1/2" ~ "Draw",
TRUE ~ "Loss"
))
```
## Looking at total win rate and then breaking it down by ECO (openings)
```{r}
summary <- rainybeater %>%
group_by(result) %>%
summarise(count = n()) %>%
mutate(win_rate = count / sum(count) * 100)
ecowin <- rainybeater %>%
group_by(ECO) %>%
summarise(
total_games = n(),
wins = sum(result == "Win"),
draws = sum(result == "Draw"),
losses = sum(result == "Loss"),
win_rate = (wins / total_games) * 100
) %>%
arrange(desc(win_rate))
head(ecowin)
```
# Joining the top and bottom 10 eco positions in the same dataframe
* I set the minimum number of games to 10 for each opening
```{r}
top_eco <- ecowin %>%
filter(total_games > 10) %>%
slice_max(win_rate, n = 10)
bot_eco <- ecowin %>%
filter(total_games > 10) %>%
slice_min(win_rate, n = 10)
topbot_eco <- rbind.data.frame(top_eco, bot_eco)
topbot_eco$top_or_bot <- 0
topbot_eco_1 <- topbot_eco %>%
mutate(top_or_bot = ifelse(win_rate > 50, "Win", "Loss"))
print(topbot_eco_1)
```
* Clearly, there are some openings that I should avoid and others that I tend to do better with
```{r, echo = FALSE}
ggplot(topbot_eco_1, aes(x = ECO, y = win_rate, fill = top_or_bot)) +
geom_bar(stat = "identity") +
labs(
title = paste("Top and Bottom Openings (ECO) for rainybeater123"),
x = "ECO Code",
y = "Win Rate (%)"
) +
facet_wrap(~ top_or_bot, scales = "free_x") +
theme_bw() +
theme(legend.position = "none")
```
# Setting up some new variables
* Determining if I went first or second
* Assigning my ELO score to each game
* Game duration
```{r}
first_second <- rainybeater %>%
mutate(first_second = ifelse(White %in% "rainybeater123", "first", "second"),
my_color = ifelse(White %in% "rainybeater123", "white", "black"),
my_elo = as.numeric(ifelse(my_color == "white", WhiteElo, BlackElo)),
Date = as.Date(Date),
StartTime = hms(StartTime),
EndTime = hms(EndTime),
Duration = as.numeric(EndTime - StartTime, units = "minutes")) %>%
filter(TimeClass %in% "rapid") %>%
filter(Duration > 0)
ecocount <- first_second %>%
group_by(result, ECO) %>%
mutate(ECO_count = n())
```
# My ELO Scoring Over Time
* I was on a pretty steady increase from March '21 to March '22
* I stayed pretty stagnant for the next year or so...
* Let's forget about March '23 to March '24!
* I have been getting back on track recently...
```{r, echo = FALSE}
ggplot(first_second) +
geom_line(aes(x = Date, y = my_elo)) +
labs(
title = "My Chess Elo Since 2021",
subtitle = "Split by 6 month increments, March 2021 - October 2024",
x = "Date",
y = "ELO"
) +
theme(
legend.position = "none"
) +
theme_bw() +
scale_x_date(date_breaks = "6 month", date_labels = "%m-%y")
```
# Barplots for wins and lossess by type of ending
```{r, echo = FALSE}
wintype <- first_second %>%
filter(result == "Win") %>%
mutate(wonhow = str_extract(Termination, "won by checkmate|won by resignation|won on time")) %>%
filter(!is.na(wonhow)) %>%
mutate(wonhow = as.factor(wonhow))
ggplot(wintype) +
geom_bar(aes(x = wonhow, fill = wonhow), position = "dodge") +
labs(
title = "Distribution of Chess Wins",
subtitle = "For all games that were not aborted",
x = "How I Won",
y = "Count",
fill = "How I Won"
) +
theme_bw() +
theme(
legend.position = "top"
)
losstype <- first_second %>%
filter(result == "Loss") %>%
mutate(losshow = str_extract(Termination, "won by checkmate|won by resignation|won on time")) %>%
filter(!is.na(losshow)) %>%
mutate(losshow = as.factor(losshow)) %>%
mutate(losshow = str_replace(losshow, "won by checkmate", "lost by checkmate"),
losshow = str_replace(losshow, "won by resignation", "lost by resignation"),
losshow = str_replace(losshow, "won on time", "lost on time"))
ggplot(losstype) +
geom_bar(aes(x = losshow, fill = losshow), position = "dodge") +
labs(
title = "Distribution of Chess Losses",
subtitle = "For all games that were not aborted. I have always been too stubborn to resign.",
x = "How I Lost",
y = "Count",
fill = "How I Lost"
) +
theme_bw() +
theme(
legend.position = "top"
)
```
# This is a distribution of results based on how long the game goes on
* The longer the games go, the more likely a draw becomes
* Some quick wins are usually due to opponent resignations
```{r, echo = FALSE}
ggplot(first_second) +
geom_density(aes(x = Duration, fill = result), alpha = .5) +
scale_fill_manual(values = c("Win" = "gold", "Loss" = "red", "Draw" = "blue")) + labs(
title = "Distribution of Time for Wins/Losses/Draws",
x = "Duration of games in minutes",
y = "Frequency"
) +
theme_bw()
# + facet_wrap(~result)
The big finding...the time series plot showing my ELO over time. It's all about consistency (Fall/Winter of '23 was brutal!)
This is a view at some of the best and worst openings based on win percentage. Clearly I have performed better with some openings over others!
I am probably not considered a traditional player who will resign if the outlook looks grim. Gotta fight to the end!
A lot of games end early once the queen is taken, but I good number of checkmate finishes in there.
Distribution of game duration based on outcome. Nothing too crazy surprising here.