Introduction
I've always wanted to understand how Redis works internally and how its features are built. So, I decided to create my own version of Redis, but in Go (Redis is actually built in C). The name "godisDB" came naturally when merging Go and Redis. You can find the source code here. I'm breaking these blogs into multiple parts since there are many features, and I can keep publishing blogs as I build them one at a time. This blog (Part 1) will cover the following things:
Creating a basic TCP server in Go
Parsing Redis Serialization Protocol (RESP)
We need to create a TCP server because Redis is essentially a TCP server running on a port and accepting requests from clients. Redis uses the "Redis Serialization Protocol (RESP)" to communicate with its clients, and clients should use the same protocol to talk to Redis. If we use the same protocol for our database, all existing Redis clients can communicate with godisDB without any issues.
Creating a basic TCP server in Go
Creating a TCP server in Go is fairly simple because there is a package called "net" that can do it for us in a few lines of code. Like all TCP servers, ours should listen for connections on a specific port. Since Redis runs on port 6379, let's use port 6369. Here's the code using the "net" package:
func main() {
listener, err := net.Listen("tcp", ":6369")
if err != nil {
fmt.Println("error listening")
}
for {
c, err := listener.Accept()
if err != nil {
fmt.Println("error accepting connection")
}
fmt.Println("client connected")
go handleConnection(c)
}
}
net.Listen
allows the server to listen on the port, and listener.Accept
is used to accept connections. This is called in a for loop so we can accept multiple clients. Then, I call the function handleConnection
in a goroutine. A goroutine runs independently of the calling function. This is where godisDB differs from Redis because Redis is single-threaded and uses I/O Multiplexing and an event loop like Node.js to handle multiple clients. GodisDB handles each client in a goroutine, which runs independently. Now, let's look at how the handleConnection
function works.
func handleConnection(conn net.Conn) {
defer conn.Close()
fmt.Printf("Client connected: %s\n", conn.RemoteAddr().String())
for {
var buf []byte = make([]byte, 50*1024)
n, err := conn.Read(buf[:])
if err != nil {
conn.Close()
fmt.Print("client disconnected with ", err.Error())
break
}
commands, err := ParseRESP(conn, buf[:n])
fmt.Print(commands)
executeCommands(conn, commands, kv)
}
}
This function reads input from the client in a connection, uses the ParseRESP
function to parse the input, and then executes the commands. Let's look at what Redis Serialization Protocol (RESP) is and how it is parsed in the following sections.
Redis Serialization Protocol (RESP)
Redis Serialization Protocol (RESP) is a protocol used by Redis and its clients to communicate. It is simple to implement, fast to parse, and human-readable. In an RESP-serialized payload, the first byte identifies the type, and the content follows. \r\n
(CRLF) is the terminator in RESP. Here's a picture from the Redis documentation showing the types and their identifying bytes..
A simple string looks like +OK\r\n
.
A bulk string looks like $<length>\r\n<data>\r\n
. For example, $5\r\nhello\r\n
(where $
indicates a Bulk string and 5
is the length of the string "hello").
An array looks like *<number-of-elements>\r\n<element-1>...<element-n>
. For example, *2\r\n$5\r\nhello\r\n$5\r\nworld\r\n
(where *
indicates an Array, 2
is the number of elements, followed by 2 bulk strings as elements).
Understanding Bulk strings and Arrays is essential for now. You can read more about RESP in the Redis docs here. Now, let's build our parser for RESP.
Parsing RESP
Remember that in the handleConnection
function, we stored the client's input in a byte array (inputBuf
). We copy it into a bytes.Buffer
because the functions we use to parse this input will update the pointer internally. This way, we can keep reading the buffer without losing track of how much has already been read.
var buf *bytes.Buffer = bytes.NewBuffer([]byte)
buf.Write(inputBuf)
We can start by reading the first byte to identify the type, then write separate functions to read each type. Here's the code for this function named parseSingle
.
func (rp *RESPParser) parseSingle() (interface{}, error) {
b, err := rp.buf.ReadByte()
if err != nil {
return nil, err
}
switch b {
case '+':
return readSimpleString(rp.c, rp.buf)
case '-':
return readError(rp.c, rp.buf)
case ':':
return readInt64(rp.c, rp.buf)
case '$':
return readBulkString(rp.c, rp.buf)
case '*':
return readArray(rp.c, rp.buf, rp)
}
return nil, errors.New("Invalid input")
}
As mentioned earlier, the ReadByte
function reads and returns the next byte from the buffer, moving the internal pointer forward by one byte. Now, let's focus on the readArray
function because Redis clients send requests to the Redis server as an array of strings.
func readArray(buf *bytes.Buffer, rp *RESPParser) (interface{}, error) {
count, err := readLength(buf)
if err != nil {
return nil, err
}
var elems []interface{} = make([]interface{}, count)
for i := range elems {
elem, err := rp.parseSingle()
if err != nil {
return nil, err
}
elems[i] = elem
}
return elems, nil
}
In this function, we first read the length of the array (The readLength
function handles the first CRLF by moving the pointer to the start of the array) and then iterate over this length to read the contents inside the array. In the for loop, we call the parseSingle
function again, but with an updated pointer since some bytes have already been read.
Now, if the input is something like *2\r\n$5\r\nhello\r\n$5\r\nworld\r\n
, the parseSingle
function first detects the type and calls the readArray
function. Then, the readArray
function calls the parseSingle
function twice to read the two bulk strings. Let's look at the readBulkString function:
func readBulkString(buf *bytes.Buffer) (string, error) {
len, err := readLength(buf)
if err != nil {
return "", err
}
bulkStr := make([]byte, len)
_, err = buf.Read(bulkStr)
if err != nil {
return "", err
}
// moving buffer pointer by 2 for \r and \n
buf.ReadByte()
buf.ReadByte()
return string(bulkStr), nil
}
In this function, we first read the length of the bulk string and then read the buffer up to that length (Again, the readLength
function handles the first CRLF by moving the pointer to the start of the BulkString). Finally, we move the buffer pointer by 2 at the end to account for the final CRLF.
So, for the above example, the parseSingle
function will return an array ["hello", "world"]. In the array of strings sent by the client, the first element is the command, and the following elements are the arguments. So here, "hello" is the command, and "world" is an argument. Let's define a struct to manage this.
type Command struct {
Name string
Args []string
}
The following code will help us pack the result from parseSingle
function into this struct
value, err := rp.parseSingle()
if err != nil {
return nil, err
}
var cmds = make([]Command, 0)
tokens, err := toArrayString(value.([]interface{}))
if err != nil {
return nil, err
}
cmd := strings.ToUpper(tokens[0])
cmds = append(cmds, Command{
Name: cmd,
Args: tokens[1:],
})
This code takes the first argument and stores it as the Name, while the other elements are stored as Args. The final result for our example will be [{HELLO [world]}
.
That's how we parse RESP-serialized inputs and store them as Redis commands to be executed.
Conclusion
In Part 2 of this blog series, I will discuss how the commands are executed and sent to the client. I was inspired by Dhravya's Radish and Arpit Bhayani's DiceDB before building my own Redis. You can check out their implementations if you are interested. Happy coding!