Lab 1: Raft

Leader Election

Lab 1 Raft Consensus Protocol - Leader Election

Due: Oct 3, 2024, 23:59 PDT

Rule of Thumb

Always feel free to seek help by posting questions or concerns on Piazza or coming to the office hours. For complex problems, we encourage you to come during office hours so that you have dedicated time to help you solve issues.

Always start your lab as early as possible. This class's labs are relatively easy to make the first prototype, but it can be quite challenging and time-consuming to make it correct and bug-free.

Introduction

In the following series of labs, you’ll implement Raft, a replicated state machine protocol. This is the first lab in the series, and you will get familiar with the skeleton code and implement basic leader election, which works even with network disconnection. In the next lab, you will implement basic fault-tolerant log replication.

A replicated service achieves fault tolerance by storing complete copies of its state (i.e., data) on multiple replica servers. Replication allows the service to continue operating even if some of its servers experience failures (crashes or a broken or flaky network). The challenge is that failures may cause the replicas to hold differing copies of the data.

Raft manages a service’s state replicas, and in particular, it helps the service sort out what the correct state is after failures. Raft implements a replicated state machine. It organizes client requests into a sequence, called the log, and ensures that all the replicas agree on the contents of the log. Each replica executes the client requests in the log in the order they appear in the log, applying those requests to the replica’s local copy of the service’s state. Since all the live replicas see the same log contents, they all execute the same requests in the same order, and thus continue to have identical service states. If a server fails but later recovers, Raft takes care of bringing its log up to date. Raft will continue to operate as long as at least a majority of the servers are alive and can talk to each other. If there is no such majority, Raft will make no progress but will pick up where it left off as soon as a majority can communicate again.

In this lab, you’ll start to implement Raft as a C++ class with associated methods, meant to be used as a module in a larger service. A set of Raft instances talk to each other with RPC to maintain replicated logs. Your Raft interface will support an indefinite sequence of numbered commands, also called log entries. The entries are numbered with index numbers. The log entry with a given index will eventually be committed. At that point, your Raft should send the log entry to the larger service for it to execute.

Your Raft instances are only allowed to interact using gRPC. For example, different Raft instances are not allowed to share variables. Your code should not use files at all.

Objective

In this lab, you’ll implement the leader election part of the Raft design described in the extended paper. You WILL NOT implement: log replication, saving persistent state, cluster membership changes (Section 6), log compaction/snapshotting (Section 7).

Some general tips:

Start early. Although the amount of code isn’t large, getting it to work correctly will be challenging.
Read and understand the Raft paper and the Raft lecture notes before you start. Your implementation should follow the paper’s description closely, particularly Figure 2, since that’s what the tests expect.

Getting Started

Checking out code

Use the following link to get the assignment from GitHub Classroom.

https://classroom.github.com/a/rLMN1Msm

Initial Setup

First, make sure you are working on the right branch. Make sure you use lab-1-solution as the branch name.

$ git checkout -b lab-1-solution main
$ git push -u origin lab-1-solution

Before compiling, you need to prepare submodules and dependencies for the lab.

$ ./setup.sh

This setup script will clone googletest and spdlog as submodules into the labs. It assumes you already correctly installed gRPC in lab0. Therefore, it will skip the gRPC installation.

Toolchain Requirement

cmake: version 3.22.1
g++: version (Ubuntu 13.1.0-8ubuntu1~22.04) 13.1.0

Note:

The version is not a hard requirement, but your lab will be graded on a machine (Ubuntu 22.04.4 LTS) with the above toolchain.
The lab will be built with C++20 standards and language extension disabled. (-std=c++20 is used). Please avoid incompatible APIs. For more details, refer to the cmake files in the project. Please post online if you have any concerns or questions.
g++ >= 13.1.0 may be necessary, as some c++20 features are only available after this version. If you don't have g++ >= 13.1.0 on your machine, you can run ./install_gcc-13.sh script to install it. The script is only tested under Ubuntu 22.04 LTS, and it will make the default g++ file link point to g++-13. Keep this in mind in case it conflicts with your own settings.

Build the lab

This lab uses cmake to build the lab. All cmake configuration files are provided. You can build the lab by doing:

# cd to the project folder

$ mkdir -p build
$ cd build

$ cmake .. # configure

$ make -j$(nproc) # build

WARNING: You should not modify any build (cmake or make) files. You can do that for your own testing purposes, but the grader will use the unmodified build files. Make sure your code compiles correctly with the provided build files.

ARM Mac users (unsupported by staff):

This lab is tested on the x86_64 version of Linux (Ubuntu 22.04 LTS), but you may also try Ubuntu ARM. Please take a look at the Multipass setup on this website, which installs aarch64 Ubuntu for ARM Mac computers.

Quick Overview of the Lab Codebase Structure

app: the application to bootstrap the raft node
build: the build folder (all built binaries go here)
cmake: some cmake scripts to locate installed gRPC
generated: contains soft links to generated protobuf and gRPC files
inc: all header files
- common: some common files/utilities shared by raft and tester
- rafty: header files for raft implementation
  - impl: inline functions
- toolings: toolings for graders to use
integration_tests: integration tester
libs: 3rd-party libraries
proto: protobuf definitions
src: source code
unittests: any unit test goes here

All pink bold folders marked above are those in which you will add code files. Only files and subdirectories inside these folders will be copied to the grading program. Thus, you should only add/modify files inside these folders. You may add/modify other files for testing purposes, but they will not be used during grading.

You may add test cases into unittests folder if you'd like to implement your own test cases. The build script will build each .cpp file as a standalone binary. googletest and generated gRPC-related files will be automatically linked. As mentioned before, these files/codes won't be included during the grading.

The Code

Your Raft implementation will primarily go to folders src/ and inc/rafty. You will also need to modify proto/raft.proto file to define the gRPC communication interface. More specifically:

In inc/rafty/
- Put all header files you need into this folder. You may add any additional files or folders you deem desired.
- The raft instance interface is provided in the file inc/rafty/raft.hpp. This header file contains all the required definitions and signatures for the testing script to work correctly. Therefore, DO NOT change any signature/definition of provided functions or properties. Some functions are implemented, and properties are populated for you. Beyond that, you can add any additional properties or functions needed to complete this lab. For more detailed instructions, refer to the later section.
- Some inlinable implementations are provided in inc/rafty/impl/raft.ipp.
In src/
- Put all source files you need into this folder. You may add any additional files or folders you deem desired.
- Complete src/raft.cpp. DO NOT change any signature of the provided functions. Beyond that, you can add any additional codes needed to complete this lab. For more detailed instructions, refer to the later section.
In proto/
- Complete raft.proto file.
- DO NOT add any additional files to this directory.

Logging

A structured logger is supplied for you to use. The implementation is based on spdlog and in file inc/common/logger.hpp if you are interested in taking a look.

A logger instance has already been created in the Raft class; you can use it directly. DO NOT modify the logger declaration and initialization in the skeleton code.
- Declaration: std::unique_ptr<rafty::utils::logger> logger;
By default, the logger we provide will have both a console sink and a file sink. The logger will output logs to files (if not intentionally silenced) to the file logs/rafty_node_<node_number>.log.
When running the node or multinode application (details are discussed in the later section), the logger can be silenced or adjusted by setting the verbosity flag or directly tuning the spdlog's log level programmatically in the code.
If you want to use the logger beyond the Raft class, you should supply the node id so that the logs go to the right file. Take a look at the provided Raft class's constructor regarding how to create a logger with node id.

Required Functions

Now, we shall go over (as a quick overview) the expected functionalities of all functions that you need to implement or complete. More detailed step-by-step instructions are listed in the later section.

There are a few specific functions in raft.hpp and raft.cpp you need to implement.

void run();

Implement this member function.
Conceptually, it should start the clock and begin the timer. Before calling run(), the raft instance should stay put without actively sending out any message to peers. After calling run(), the raft instance should start operating, sending heartbeats regularly, and starting elections (when needed).
This function should be non-blocking. Namely, this function should immediately return instead of blocking the function.

State get_state() const;

Implement this member function
It returns a State (defined in inc/common/common.hpp) representing the current state of this raft instance.
Populate the field term with the current term of the raft instance.
Populate the field is_leader with the value of whether the raft instance considers itself the leader.

ProposalResult propose(const std::string& data);

We will cover this function in lab 2. You don't need to worry about it in lab 1.

You need to complete a line in Raft::start_server() in the file inc/rafty/impl/raft.ipp.

builder.RegisterService(nullptr); /* (replace nullptr) */

Complete this member function.
This function creates an RPC server on the machine by binding to the listening_addr. However, it looks for a pointer points to the gRPC handler class, which implements the handlers, e.g. AppendEntries, RequestVote.
You need to replace the nullptr with an actual pointer to the handler class.

You need to complete the proto files in the file proto/raft.proto.

syntax = "proto3";

package raftpb;

service RaftService {

// TODO: implement AppendEntries and RequestVote

}

Finish the protobuf definition in this file according to the raft extended paper.
Essentially, you need to define two RPC functions --- AppendEntries and RequestVote.
You also need to define the request and response for both functions.

Your first implementation may not be clean enough that you can easily reason about its correctness. Give yourself enough time to rewrite your implementation so that you can easily reason about its correctness. Subsequent labs will build on this lab, so it is important to do a good job on your implementation.

Other Requirements/Notes

All peer stubs are created for you in the std::unordered_map<uint64_t, RaftServiceStub> peers_, as a mapping from raft instance id to its stub. You can invoke a peer stub by using its ID, for example:

this->peers_[target_id]->AppendEntries(/* args */)

When invoking a remote RPC function, you MUST create a context using the provided create_context function and pass it into the RPC call.

Function signature (defined in the class Raft):

std::unique_ptr<grpc::ClientContext> create_context(uint64_t to) const;

Your remote RPC invocation should be something like this:

auto context = this->create_context(target_id);

grpc::Status status = this->peers_[target_id]->AppendEntries(&*context /* other args */);

The tester program will rely on this context to evaluate your implementation. Failing to use create_context to generate the context will result in failed test cases.

A mutex is defined in the Raft class as a protected property. You are free to use this mutex to protect concurrent data access. DO NOT delete this mutex. The mutex is defined as:

mutable std::mutex mtx;

For simulated network errors, all RPC requests will be intercepted and returned immediately with an error status. For each RPC request, you should check the grpc::Status it returned. For network errors, grpc::Status::ok() will be evaluated as false.
DO NOT use gRPC streaming in this series of labs.

Step-by-Step

You are recommended to do this lab in the following steps:

Part A - gRPC

proto

We need to communicate between raft instances via gRPC. Therefore, we first need to define the RPC interface in the proto file. In the proto/raft.proto, define AppendEntries and RequestVote handlers as described in the paper's Figure 2. You shall also add data structure (or message in protobuf's terms) for both the response and reply of both RPC handlers.

Hint: You may want to define the Entry message in the proto file and use this data structure in your code as the log entry to store.

After this, you can build the project; all generated files are listed in the folder generated.

gRPC handler

Implement a gRPC handler class that implements RaftService (defined in proto/raft.proto) and modify the builder.RegisterService(nullptr); line in the function Raft::start_server() in the file inc/rafty/impl/raft.ipp. You don't necessarily need to write any concrete codes for each handler function at this step. Overriding all required functions with proper signature should be sufficient.

After this, you should be able to run the node application correctly. Namely, you should successfully listen to the specified address and connect to all peers.

Part B - Heatbeart

You need to find a way to achieve the timer behavior to kick off the election and send out heartbeats periodically. The heartbeat interval and election timeout are two important values to set. They are crucial to the liveness and effectiveness of the entire raft cluster. Please design this carefully by following the paper's discussion.

In this part, you need to implement void run() function to start the main loop of your raft instance.

During the process, you can start to add properties and functions to the Raft class. You may modify the constructor's implementation (not the signature).

Part C - Election

You may have already touched upon the leader election in the last part. In this part, You need to implement the RequestVote RPC handler according to the paper's discussion. It is very important to correctly update the current raft state when the instance is elected as leader. After this, you need to implement State get_state() const function to report the state of the raft instance.

Hint: having a dedicated data structure to keep track of the voting process might be helpful.

Your implementation should tolerate network issues, such as disconnection or partition.

Expectations

Election should be done fairly fast when no network failure is presented. Ideally, when there is no network failure, the raft cluster should elect the leader successfully in 0.5-1s.
Election should resist network failure, such as long delays, random drops, disconnections, partitions, etc...
Election should be stable. When there is no network failure, the elected leader should stay as elected indefinitely. Leader change in the stable network is not a hard-error, but problematic. It will cause extra rounds of RPC to elect a new leader and may significantly delay the log replication process in the following labs. If your implementation has re-election in a stable network, please investigate and fix it.

Be Caution & Hints

The tests for upcoming labs may fail your code if it runs too slowly. Try to write efficient code.
Do not write loops that execute continuously without pausing; it will slow your implementation enough to fail tests. Sleep the coroutine/thread when possible to avoid spinning and burning all CPU resources.
If you constantly see multiple elections happen almost simultaneously or elections trigger frequently, even when servers are all live and stable, you may want to adjust the random number for the timer and heartbeat frequency. (Think about why?)
Be aware of outdated PRC replies. Due to (simulated) network failure, the packet might be delayed. Handling delayed replies is crucial to ensure all server states remain correct.

Run and Test

We offer two helper applications to help you run and test each raft instance as a standalone process, which is how a real distributed system should be deployed. Two applications are node and multinode (built in the folder build/app/). node helps you to run a single raft instance, and multinode starts multiple nodes. multinode should be sufficient for your testing purpose, but if you want a finer control, you may find node helpful. More details:

node

A node is a wrapper app that uses your raft implementation and starts the raft instance with the provided listening IP address and peer addresses. You can start and test your implementation by running this node app. The node app won't start correctly until you implement the logic of creating a gRPC server.

The node app takes several arguments. The detailed descriptions of the arguments are shown below:

- - Type: unsigned int
  - Required: yes
  - Description:
    - The unique identifier for the underlying raft instance. This id should be unique across the cluster. Otherwise, the cluster won't be able to tell who is who.

port

- - Type: unsigned int
  - Required: yes
  - Description:
    - The port is used to listen to incoming network messages (as a server). The node program listens to 0.0.0.0:<PORT>.

peers

- - Type: string
  - Required: yes
  - Description:
    - The peers string represents all IP addresses that are reachable from the current machine. The raft instance will connect all peers according to the provided addresses. Each peer is separated by a comma. The id for the corresponding raft peer should also be supplied. The format will be like --peers 1+localhost:50051,2+localhost:50052, which indicates two peers with id 1 on localhost:50051 and id 2 on localhost:50052.
    - NO SPACE CAN BE USED WITHIN THE STRING.

fail_type

- - Type: unsigned int
  - Required: no
  - Description: The type of network failure to simulate when using dis and conn commands. Possible values are 0 and 1.
    - 0: network disconnection. Disconnected nodes will not be able to receive or send any message to other peers.
    - 1: network partition. Disconnect nodes will not be able to receive or send any message to connected nodes. However, all disconnected nodes can talk to each other without errors.
  - Default: 0

verbosity

- - Type: unsigned int
  - Required: no
  - Description: The verbose level. Possible values are 0, 1, and 2. Refer to the logging section regarding the built-in logger.
    - 0: silent
    - 1: disable all console logging, with file logging enabled
    - 2: enable both console logging and file logging
  - Default: 1

A typical three-node setup on the local machine would be like:

# first terminal

$ ./node --id 0 --port 50050 --peers 1+localhost:50051,2+localhost:50052

# second terminal

$ ./node --id 1 --port 50051 --peers 0+localhost:50050,2+localhost:50052

# third terminal

$ ./node --id 2 --port 50052 --peers 0+localhost:50050,1+localhost:50051

After starting up the node app, you will have four commands to control the raft instance. They are

r (run raft),
dis <id1{, id2}> (disconnect a node)
- conn <id1{, id2}> (connect a node)
k (kill raft)

Invoke r to run the raft node. Repeat the command for all nodes in the cluster. You can disconnect or connect a node by using dis 1 or dis 1 2 (multiple nodes at a time). To disconnect or connect nodes, you must invoke the command on all nodes in the cluster.

multinode

A multinode is a wrapper app that uses node to start multiple raft instances in a row with automatically generated IPs. This app is suitable for testing on local machines. You can start and test your implementation by running this multinode app. The node app won't start correctly until you implement the logic of creating a gRPC server.

The multinode app takes several arguments. The detailed descriptions of the arguments are shown below:

bin

- - Type: string
  - Required: no
  - Description:
    - The path to the node app.
  - Default: "./node"

num

- - Type: unsigned int
  - Required: no
  - Description:
    - The number of raft instances (>=3) to start.
  - Default: 3

fail_type

- - Type: unsigned int
  - Required: no
  - Description: The type of network failure to simulate when using dis and conn commands. Possible values are 0 and 1.
    - 0: network disconnection. Disconnected nodes will not be able to receive or send any message to other peers.
    - 1: network partition. Disconnect nodes will not be able to receive or send any message to connected nodes. However, all disconnected nodes can talk to each other without errors.
  - Default: 0

verbosity

- - Type: unsigned int
  - Required: no
  - Description: The verbose level. Possible values are 0 and 1. You cannot use console logging here. Refer to the logging section regarding the built-in logger.
    - 0: silent
    - 1: disable all console logging, with file logging enabled
  - Default: 1

A typical three-node setup on the local machine would be like:

$ ./multinode --num 3

After starting up the node app, you will have the same four commands (same semantics and usage) as the node app. However, you don't need to manually input commands to all raft instances, as multinode will forward commands for you.

Pre-Grade

We will not provide the test cases for the raft labs. You will need to rely on logs to verify the correctness of your implementation. However, you may be unsure about your implementation. Thus, we offer limited pre-grading for each group.

You will have 5 trials for this lab.

Pre-grade will be open via the announcement. When pre-grade is open, you will receive a secret key via email to all your group members. You need to populate the submitter/env file with your secret key.

For example:

# in file: submitter/env

SECRET=your_actual_secret_key

PROJ_ROOT=../

After correctly setting up the secret, you can run ./submit.sh to kick off the auto-grader. You will see a feedback like this:

Uploading /Users/ybyan/src/raft-lab-solution/pack.tar.gz

Using secret: xxxxxx

After this one, you will have 2 trial(s) left.

Setting up testbed...

Setting up repository...

Building codes...

Running testers...

Note: Google Test filter = *A

[==========] Running 3 tests from 1 test suite.

[----------] Global test environment set-up.

[----------] 3 tests from RaftTest

[ RUN ] RaftTest.InitialElectionA

[ OK ] RaftTest.InitialElectionA (3094 ms)

[ RUN ] RaftTest.ReElectionA

[ OK ] RaftTest.ReElectionA (4984 ms)

[ RUN ] RaftTest.ManyElectionA

[ OK ] RaftTest.ManyElectionA (4081 ms)

[----------] 3 tests from RaftTest (12160 ms total)

[----------] Global test environment tear-down

[==========] 3 tests from 1 test suite ran. (12160 ms total)

[ PASSED ] 3 tests.

You will only have a limited number of trials, so use them carefully and wisely.

Submit

Leave your latest modification and implementation on the "lab-1-solution" branch. Please do not push any commits or changes after the official deadline. The grader will check out the latest commit before the deadline for the grading.

DO NOT

Make sure you do not do any of the following; otherwise, you will fail the labs, even if the code may pass the testing scripts.

Change/create files other than the files allowed.
Use shared variables between Raft instances.
Use file/network/IPC interface other than the gRPC interface.
Use external libraries other than googletest and spdlog.

Acknowledgment
This lab is inspired by MIT 6.824. The codebase is re-designed to a C++ version.

Page updated

Report abuse