Building my own Redis in Go - Part 1

Building my own Redis in Go - Part 1

Introduction

I've always wanted to understand how Redis works internally and how its features are built. So, I decided to create my own version of Redis, but in Go (Redis is actually built in C). The name "godisDB" came naturally when merging Go and Redis. You can find the source code here. I'm breaking these blogs into multiple parts since there are many features, and I can keep publishing blogs as I build them one at a time. This blog (Part 1) will cover the following things:

  1. Creating a basic TCP server in Go

  2. Parsing Redis Serialization Protocol (RESP)

We need to create a TCP server because Redis is essentially a TCP server running on a port and accepting requests from clients. Redis uses the "Redis Serialization Protocol (RESP)" to communicate with its clients, and clients should use the same protocol to talk to Redis. If we use the same protocol for our database, all existing Redis clients can communicate with godisDB without any issues.

Creating a basic TCP server in Go

Creating a TCP server in Go is fairly simple because there is a package called "net" that can do it for us in a few lines of code. Like all TCP servers, ours should listen for connections on a specific port. Since Redis runs on port 6379, let's use port 6369. Here's the code using the "net" package:

func main() {
    listener, err := net.Listen("tcp", ":6369")
    if err != nil {
        fmt.Println("error listening")
    }
    for {
        c, err := listener.Accept()
        if err != nil {
            fmt.Println("error accepting connection")
        }
        fmt.Println("client connected")
        go handleConnection(c)
    }
}

net.Listen allows the server to listen on the port, and listener.Accept is used to accept connections. This is called in a for loop so we can accept multiple clients. Then, I call the function handleConnection in a goroutine. A goroutine runs independently of the calling function. This is where godisDB differs from Redis because Redis is single-threaded and uses I/O Multiplexing and an event loop like Node.js to handle multiple clients. GodisDB handles each client in a goroutine, which runs independently. Now, let's look at how the handleConnection function works.

func handleConnection(conn net.Conn) {
    defer conn.Close()
    fmt.Printf("Client connected: %s\n", conn.RemoteAddr().String())
    for {
        var buf []byte = make([]byte, 50*1024)
        n, err := conn.Read(buf[:])
        if err != nil {
            conn.Close()
            fmt.Print("client disconnected with ", err.Error())
            break
        }
        commands, err := ParseRESP(conn, buf[:n])
        fmt.Print(commands)
        executeCommands(conn, commands, kv)
    }
}

This function reads input from the client in a connection, uses the ParseRESP function to parse the input, and then executes the commands. Let's look at what Redis Serialization Protocol (RESP) is and how it is parsed in the following sections.

Redis Serialization Protocol (RESP)

Redis Serialization Protocol (RESP) is a protocol used by Redis and its clients to communicate. It is simple to implement, fast to parse, and human-readable. In an RESP-serialized payload, the first byte identifies the type, and the content follows. \r\n (CRLF) is the terminator in RESP. Here's a picture from the Redis documentation showing the types and their identifying bytes..

A simple string looks like +OK\r\n.

A bulk string looks like $<length>\r\n<data>\r\n. For example, $5\r\nhello\r\n (where $ indicates a Bulk string and 5 is the length of the string "hello").

An array looks like *<number-of-elements>\r\n<element-1>...<element-n>. For example, *2\r\n$5\r\nhello\r\n$5\r\nworld\r\n (where * indicates an Array, 2 is the number of elements, followed by 2 bulk strings as elements).

Understanding Bulk strings and Arrays is essential for now. You can read more about RESP in the Redis docs here. Now, let's build our parser for RESP.

Parsing RESP

Remember that in the handleConnection function, we stored the client's input in a byte array (inputBuf). We copy it into a bytes.Buffer because the functions we use to parse this input will update the pointer internally. This way, we can keep reading the buffer without losing track of how much has already been read.

var buf *bytes.Buffer = bytes.NewBuffer([]byte)
buf.Write(inputBuf)

We can start by reading the first byte to identify the type, then write separate functions to read each type. Here's the code for this function named parseSingle.

func (rp *RESPParser) parseSingle() (interface{}, error) {
    b, err := rp.buf.ReadByte()
    if err != nil {
        return nil, err
    }
    switch b {
    case '+':
        return readSimpleString(rp.c, rp.buf)
    case '-':
        return readError(rp.c, rp.buf)
    case ':':
        return readInt64(rp.c, rp.buf)
    case '$':
        return readBulkString(rp.c, rp.buf)
    case '*':
        return readArray(rp.c, rp.buf, rp)
    }
    return nil, errors.New("Invalid input")
}

As mentioned earlier, the ReadByte function reads and returns the next byte from the buffer, moving the internal pointer forward by one byte. Now, let's focus on the readArray function because Redis clients send requests to the Redis server as an array of strings.

func readArray(buf *bytes.Buffer, rp *RESPParser) (interface{}, error) {
    count, err := readLength(buf)
    if err != nil {
        return nil, err
    }
    var elems []interface{} = make([]interface{}, count)
    for i := range elems {
        elem, err := rp.parseSingle()
        if err != nil {
            return nil, err
        }
        elems[i] = elem
    }
    return elems, nil
}

In this function, we first read the length of the array (The readLength function handles the first CRLF by moving the pointer to the start of the array) and then iterate over this length to read the contents inside the array. In the for loop, we call the parseSingle function again, but with an updated pointer since some bytes have already been read.

Now, if the input is something like *2\r\n$5\r\nhello\r\n$5\r\nworld\r\n, the parseSingle function first detects the type and calls the readArray function. Then, the readArray function calls the parseSingle function twice to read the two bulk strings. Let's look at the readBulkString function:

func readBulkString(buf *bytes.Buffer) (string, error) {
    len, err := readLength(buf)
    if err != nil {
        return "", err
    }

    bulkStr := make([]byte, len)
    _, err = buf.Read(bulkStr)
    if err != nil {
        return "", err
    }

    // moving buffer pointer by 2 for \r and \n
    buf.ReadByte()
    buf.ReadByte()

    return string(bulkStr), nil
}

In this function, we first read the length of the bulk string and then read the buffer up to that length (Again, the readLength function handles the first CRLF by moving the pointer to the start of the BulkString). Finally, we move the buffer pointer by 2 at the end to account for the final CRLF.

So, for the above example, the parseSingle function will return an array ["hello", "world"]. In the array of strings sent by the client, the first element is the command, and the following elements are the arguments. So here, "hello" is the command, and "world" is an argument. Let's define a struct to manage this.

type Command struct {
    Name string
    Args []string
}

The following code will help us pack the result from parseSingle function into this struct

value, err := rp.parseSingle()
if err != nil {
    return nil, err
}

var cmds = make([]Command, 0)
tokens, err := toArrayString(value.([]interface{}))
if err != nil {
    return nil, err
}
cmd := strings.ToUpper(tokens[0])
cmds = append(cmds, Command{
    Name: cmd,
    Args: tokens[1:],
})

This code takes the first argument and stores it as the Name, while the other elements are stored as Args. The final result for our example will be [{HELLO [world]}.

That's how we parse RESP-serialized inputs and store them as Redis commands to be executed.

Conclusion

In Part 2 of this blog series, I will discuss how the commands are executed and sent to the client. I was inspired by Dhravya's Radish and Arpit Bhayani's DiceDB before building my own Redis. You can check out their implementations if you are interested. Happy coding!