mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00


Meir Shpilraien (Spielrein) 885f6b5ceb Redis Function Libraries (#10004)	2022-01-06 13:39:38 +02:00

# Redis Function Libraries

This PR implements Redis Function Libraries as described in https://github.com/redis/redis/issues/9906.

The purpose of libraries is to provide better code sharing between functions by allowing multiple functions to be created in a single command. Functions that were created together can safely share code with each other without worrying about compatibility issues and versioning.

A new library is created using the `FUNCTION LOAD` command (the full API is described below).

This PR introduces a new struct called libraryInfo, which holds information about a library:

* name - name of the library
* engine - engine used to create the library
* code - library code
* description - library description
* functions - the functions exposed by the library

When Redis gets the `FUNCTION LOAD` command it creates a new empty libraryInfo. Redis passes the `CODE` to the relevant engine alongside the empty libraryInfo. As a result, the engine will create one or more functions by calling 'libraryCreateFunction'. Each new function is added to the newly created libraryInfo. So far everything happens locally on the libraryInfo, so it is easy to abort the operation (in case of an error) by simply freeing the libraryInfo. After the libraryInfo is fully constructed we start the joining phase, in which we join the new library to the other libraries that currently exist on Redis. The joining phase makes sure there is no function collision and adds the library to the librariesCtx (renamed from functionCtx). LibrariesCtx is used all around the code in exactly the same way as functionCtx was used (with respect to RDB loading, replication, ...). The only difference is that, apart from the function dictionary (which maps function name to functionInfo object), the librariesCtx also contains a libraries dictionary that maps library name to libraryInfo object.

## New API

### FUNCTION LOAD

`FUNCTION LOAD <ENGINE> <LIBRARY NAME> [REPLACE] [DESCRIPTION <DESCRIPTION>] <CODE>`

Create a new library with the given parameters:

* ENGINE - Engine name to use to create the library.
* LIBRARY NAME - The new library name.
* REPLACE - If the library already exists, replace it.
* DESCRIPTION - Library description.
* CODE - Library code.

Returns "OK" on success, or an error in the following cases:

* Library name already taken and REPLACE was not used
* Name collision with another existing library (even if REPLACE was used)
* Library registration failed by the engine (usually a compilation error)

## Changed API

### FUNCTION LIST

`FUNCTION LIST [LIBRARYNAME <LIBRARY NAME PATTERN>] [WITHCODE]`

The command was modified to also allow getting the libraries' code (so the `FUNCTION INFO` command is no longer needed and was removed). In addition, the command gets an optional argument, `LIBRARYNAME`, which allows you to get only the libraries that match the given `LIBRARYNAME` pattern. By default, it returns all libraries.

### INFO MEMORY

Added the number of libraries to `INFO MEMORY`.

### Commands flags

The `DENYOOM` flag was set on `FUNCTION LOAD` and `FUNCTION RESTORE`. We consider those commands as commands that add new data to the dataset (functions are data), so we want to disallow running them on OOM.

## Removed API

* FUNCTION CREATE - Decided on https://github.com/redis/redis/issues/9906
* FUNCTION INFO - Decided on https://github.com/redis/redis/issues/9899

## Lua engine changes

When the Lua engine gets the code given to the `FUNCTION LOAD` command, it immediately runs it; we call this run the loading run. The loading run is not a usual script run: it is not possible to invoke any Redis command from within the load run. Instead there is a new API provided by the `library` object. The new APIs:

* `redis.log` - behaves the same as `redis.log`
* `redis.register_function` - registers a new function to the library

The purpose of the loading run is to register functions using the new `redis.register_function` API. Any attempt to use any other API will result in an error. In addition, the load run has a time limit of 500ms; an error is raised on timeout and the entire operation is aborted.

### `redis.register_function`

`redis.register_function(<function_name>, <callback>, [<description>])`

This new API allows users to register a new function that will be linked to the newly created library. This API can only be called during the load run (see definition above). Any attempt to use it outside of the load run will result in an error. The parameters passed to the API are:

* function_name - Function name (must be a Lua string)
* callback - Lua function object that will be called when the function is invoked using fcall/fcall_ro
* description - Function description, optional (must be a Lua string).

### Example

The following example creates a library called `lib` with 2 functions, `f1` and `f2`, which return 1 and 2 respectively:

```
local function f1(keys, args)
    return 1
end

local function f2(keys, args)
    return 2
end

redis.register_function('f1', f1)
redis.register_function('f2', f2)
```

Notice: Unlike `eval`, functions inside a library get the KEYS and ARGV as arguments to the functions and not as globals.

### Technical Details

On the load run we only want the user to be able to call a white list of APIs. This way, in the future, if new APIs are added, they will not be available to the load run unless specifically added to this white list. We put the white list on the `library` object and make sure the `library` object is only available to the load run by using the [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv) API. This API allows us to set the `globals` of a function (and all the functions it creates).

Before starting the load run we create a fresh Lua table (call it `g`) that only contains the `library` API (we make sure to set global protection on this table, just like the general global protection that already exists today), then we use [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv) to set `g` as the global table of the load run. After the load run has finished we update the metatable of `g` and set its `__index` and `__newindex` functions to be `_G` (the Lua default globals); we also pop out the `library` object as we do not need it anymore. This way, any function that was created on the load run (and will be invoked using `fcall`) will see the default globals as it expects to see them and will not have the `library` API anymore.

An important outcome of this new approach is that we can now achieve a distinct global table for each library (it is not yet like that, but it is very easy to achieve now). In the future we can decide to remove global protection, because globals in different libraries will not collide, or we can choose to give different APIs to different libraries based on some configuration or input.

Notice that this technique was meant to prevent errors and was not meant to prevent a malicious user from exploiting it. For example, the load run can still save the `library` object in some local variable and then use it in the `fcall` context. To prevent such malicious use, the C code also makes sure it is running in the right context and raises an error if not.
.codespell	Setup dependabot for github-actions and codespell (#9857 )	2022-01-04 16:19:28 +02:00
.github	Setup dependabot for github-actions and codespell (#9857 )	2022-01-04 16:19:28 +02:00
deps	Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies (#9462 )	2022-01-05 14:01:05 +02:00
src	Redis Function Libraries (#10004 )	2022-01-06 13:39:38 +02:00
tests	Redis Function Libraries (#10004 )	2022-01-06 13:39:38 +02:00
utils	Adds utils/gen-commands-json.py (#9958 )	2021-12-27 19:31:13 +02:00
.gitignore	Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies (#9462 )	2022-01-05 14:01:05 +02:00
00-RELEASENOTES	Changes http to https in texts (#8495 )	2021-03-10 19:11:16 +02:00
BUGS	change references to the github repo location (#7479 )	2020-07-10 08:25:26 +03:00
CONDUCT	Adds code of conduct (#8471 )	2021-02-09 14:38:09 +02:00
CONTRIBUTING	Fixed some typos, add a spell check ci and others minor fix (#8890 )	2021-06-10 15:39:33 +03:00
COPYING	updated copyright year	2020-06-23 09:51:12 -07:00
INSTALL	INSTALL now redirects the user to README	2012-02-05 09:38:41 +01:00
Makefile	Fix `install` target on OSX (see #495 )	2012-05-15 11:18:50 +02:00
MANIFESTO	MANIFESTO: simplicity and lock-in.	2019-03-18 15:49:52 +01:00
README.md	Command table: Sorted subcommands (#9951 )	2021-12-16 12:54:40 +02:00
redis.conf	Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies (#9462 )	2022-01-05 14:01:05 +02:00
runtest	Support tclsh 8.7 (#9500 )	2021-09-15 13:04:31 +03:00
runtest-cluster	Support tclsh 8.7 (#9500 )	2021-09-15 13:04:31 +03:00
runtest-moduleapi	Auto-generate the command table from JSON files (#9656 )	2021-12-15 21:23:15 +02:00
runtest-sentinel	Support tclsh 8.7 (#9500 )	2021-09-15 13:04:31 +03:00
SECURITY.md	Moved security bugs and vulnerability policy to SECURITY.md (#8938 )	2021-05-13 21:16:27 -07:00
sentinel.conf	Fix outdated protected-mode documentation in sentinel.conf (#9896 )	2021-12-08 11:25:56 +02:00
TLS.md	TLS: Session caching configuration support. (#7420 )	2020-07-10 11:33:47 +03:00

README.md

This README is just a quick start document. You can find more detailed documentation at redis.io.

What is Redis?

Redis is often referred to as a data structures server. What this means is that Redis provides access to mutable data structures via a set of commands, which are sent using a server-client model with TCP sockets and a simple protocol. So different processes can query and modify the same data structures in a shared way.

Data structures implemented into Redis have a few special properties:

Redis takes care to store them on disk, even though they are always served from and modified in server memory. This means that Redis is fast, but also non-volatile.
The implementation of data structures emphasizes memory efficiency, so data structures inside Redis will likely use less memory compared to the same data structure modelled using a high-level programming language.
Redis offers a number of features that are natural to find in a database, like replication, tunable levels of durability, clustering, and high availability.

Another good example is to think of Redis as a more complex version of memcached, where the operations are not just SETs and GETs, but operations that work with complex data types like Lists, Sets, ordered data structures, and so forth.

If you want to know more, this is a list of selected starting points:

Introduction to Redis data types. https://redis.io/topics/data-types-intro
Try Redis directly inside your browser. https://try.redis.io
The full list of Redis commands. https://redis.io/commands
There is much more inside the official Redis documentation. https://redis.io/documentation

Building Redis

Redis can be compiled and used on Linux, OSX, OpenBSD, NetBSD, FreeBSD. We support big endian and little endian architectures, and both 32 bit and 64 bit systems.

It may compile on Solaris derived systems (for instance SmartOS) but our support for this platform is best effort and Redis is not guaranteed to work as well as in Linux, OSX, and *BSD.

It is as simple as:

% make

To build with TLS support, you'll need OpenSSL development libraries (e.g. libssl-dev on Debian/Ubuntu) and run:

% make BUILD_TLS=yes

To build with systemd support, you'll need systemd development libraries (such as libsystemd-dev on Debian/Ubuntu or systemd-devel on CentOS) and run:

% make USE_SYSTEMD=yes

To append a suffix to Redis program names, use:

% make PROG_SUFFIX="-alt"

You can build a 32 bit Redis binary using:

% make 32bit

After building Redis, it is a good idea to test it using:

% make test

If TLS is built, run the tests with TLS enabled (you will need tcl-tls installed):

% ./utils/gen-test-certs.sh
% ./runtest --tls

Fixing build problems with dependencies or cached build options

Redis has some dependencies which are included in the deps directory. make does not automatically rebuild dependencies even if something in the source code of dependencies changes.

When you update the source code with git pull or when code inside the dependencies tree is modified in any other way, make sure to use the following command in order to really clean everything and rebuild from scratch:

make distclean

This will clean: jemalloc, lua, hiredis, linenoise.

Also if you force certain build options like 32bit target, no C compiler optimizations (for debugging purposes), and other similar build time options, those options are cached indefinitely until you issue a make distclean command.

Fixing problems building 32 bit binaries

If after building Redis with a 32 bit target you need to rebuild it with a 64 bit target, or the other way around, you need to perform a make distclean in the root directory of the Redis distribution.

In case of build errors when trying to build a 32 bit binary of Redis, try the following steps:

Install the package libc6-dev-i386 (also try g++-multilib).
Try using the following command line instead of make 32bit: make CFLAGS="-m32 -march=native" LDFLAGS="-m32"

Allocator

Selecting a non-default memory allocator when building Redis is done by setting the MALLOC environment variable. Redis is compiled and linked against libc malloc by default, with the exception of jemalloc being the default on Linux systems. This default was picked because jemalloc has proven to have fewer fragmentation problems than libc malloc.

To force compiling against libc malloc, use:

% make MALLOC=libc

To compile against jemalloc on Mac OS X systems, use:

% make MALLOC=jemalloc

Monotonic clock

By default, Redis will build using the POSIX clock_gettime function as the monotonic clock source. On most modern systems, the internal processor clock can be used to improve performance. Cautions can be found here: http://oliveryang.net/2015/09/pitfalls-of-TSC-usage/

To build with support for the processor's internal instruction clock, use:

% make CFLAGS="-DUSE_PROCESSOR_CLOCK"

Verbose build

Redis will build with a user-friendly colorized output by default. If you want to see a more verbose output, use the following:

% make V=1

Running Redis

To run Redis with the default configuration, just type:

% cd src
% ./redis-server

If you want to provide your redis.conf, you have to run it using an additional parameter (the path of the configuration file):

% cd src
% ./redis-server /path/to/redis.conf

It is possible to alter the Redis configuration by passing parameters directly as options using the command line. Examples:

% ./redis-server --port 9999 --replicaof 127.0.0.1 6379
% ./redis-server /etc/redis/6379.conf --loglevel debug

All the options in redis.conf are also supported as options using the command line, with exactly the same name.

Running Redis with TLS:

Please consult the TLS.md file for more information on how to use Redis with TLS.

Playing with Redis

You can use redis-cli to play with Redis. Start a redis-server instance, then in another terminal try the following:

% cd src
% ./redis-cli
redis> ping
PONG
redis> set foo bar
OK
redis> get foo
"bar"
redis> incr mycounter
(integer) 1
redis> incr mycounter
(integer) 2
redis>

You can find the list of all the available commands at https://redis.io/commands.

Installing Redis

In order to install Redis binaries into /usr/local/bin, just use:

% make install

You can use make PREFIX=/some/other/directory install if you wish to use a different destination.

Make install will just install binaries in your system, but will not configure init scripts and configuration files in the appropriate place. This is not needed if you just want to play a bit with Redis, but if you are installing it the proper way for a production system, we have a script that does this for Ubuntu and Debian systems:

% cd utils
% ./install_server.sh

Note: install_server.sh will not work on Mac OSX; it is built for Linux only.

The script will ask you a few questions and will set up everything you need to run Redis properly as a background daemon that will start again on system reboots.

You'll be able to stop and start Redis using the script named /etc/init.d/redis_<portnumber>, for instance /etc/init.d/redis_6379.

Code contributions

Note: By contributing code to the Redis project in any form, including sending a pull request via Github, a code fragment or patch via private email or public discussion groups, you agree to release your code under the terms of the BSD license that you can find in the COPYING file included in the Redis source distribution.

Please see the CONTRIBUTING file in this source distribution for more information. For security bugs and vulnerabilities, please see SECURITY.md.

Redis internals

If you are reading this README you are likely in front of a Github page or you just untarred the Redis distribution tar ball. In both the cases you are basically one step away from the source code, so here we explain the Redis source code layout, what is in each file as a general idea, the most important functions and structures inside the Redis server and so forth. We keep all the discussion at a high level without digging into the details since this document would be huge otherwise and our code base changes continuously, but a general idea should be a good starting point to understand more. Moreover most of the code is heavily commented and easy to follow.

Source code layout

The Redis root directory just contains this README, the Makefile which calls the real Makefile inside the src directory and an example configuration for Redis and Sentinel. You can find a few shell scripts that are used in order to execute the Redis, Redis Cluster and Redis Sentinel unit tests, which are implemented inside the tests directory.

Inside the root are the following important directories:

src: contains the Redis implementation, written in C.
tests: contains the unit tests, implemented in Tcl.
deps: contains libraries Redis uses. Everything needed to compile Redis is inside this directory; your system just needs to provide libc, a POSIX compatible interface and a C compiler. Notably deps contains a copy of jemalloc, which is the default allocator of Redis under Linux. Note that under deps there are also things which started with the Redis project, but for which the main repository is not redis/redis.

There are a few more directories but they are not very important for our goals here. We'll focus mostly on src, where the Redis implementation is contained, exploring what there is inside each file. The order in which files are exposed is the logical one to follow in order to disclose different layers of complexity incrementally.

Note: lately Redis was refactored quite a bit. Function names and file names have been changed, so you may find that this documentation reflects the unstable branch more closely. For instance, in Redis 3.0 the server.c and server.h files were named redis.c and redis.h. However the overall structure is the same. Keep in mind that all the new developments and pull requests should be performed against the unstable branch.

server.h

The simplest way to understand how a program works is to understand the data structures it uses. So we'll start from the main header file of Redis, which is server.h.

All the server configuration and in general all the shared state is defined in a global structure called server, of type struct redisServer. A few important fields in this structure are:

server.db is an array of Redis databases, where data is stored.
server.commands is the command table.
server.clients is a linked list of clients connected to the server.
server.master is a special client, the master, if the instance is a replica.

There are tons of other fields. Most fields are commented directly inside the structure definition.

Another important Redis data structure is the one defining a client. In the past it was called redisClient, now just client. The structure has many fields, here we'll just show the main ones:

struct client {
    int fd;
    sds querybuf;
    int argc;
    robj **argv;
    redisDb *db;
    int flags;
    list *reply;
    // ... many other fields ...
    char buf[PROTO_REPLY_CHUNK_BYTES];
};

The client structure defines a connected client:

The fd field is the client socket file descriptor.
argc and argv are populated with the command the client is executing, so that functions implementing a given Redis command can read the arguments.
querybuf accumulates the requests from the client, which are parsed by the Redis server according to the Redis protocol and executed by calling the implementations of the commands the client is executing.
reply and buf are dynamic and static buffers that accumulate the replies the server sends to the client. These buffers are incrementally written to the socket as soon as the file descriptor is writeable.
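
The split between a static and a dynamic reply buffer can be illustrated with a minimal sketch. Everything here is hypothetical and invented for the example (toyReplyClient, toyAddReply, the 16 byte TOY_REPLY_CHUNK, and a single growable array standing in for the reply list); it is not the real client structure or the real addReply() logic, only the general idea of filling a fixed buffer first and spilling to heap memory afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_REPLY_CHUNK 16 /* tiny on purpose, so the spill is easy to see */

/* Hypothetical stand-in for the reply-related fields of a client. */
typedef struct toyReplyClient {
    char buf[TOY_REPLY_CHUNK]; /* static buffer, used first */
    size_t bufpos;             /* bytes used in buf */
    char *overflow;            /* dynamic buffer, stands in for the reply list */
    size_t overflow_len;       /* bytes used in overflow */
} toyReplyClient;

/* Queue a reply string. Returns 0 on success, -1 on allocation failure. */
int toyAddReply(toyReplyClient *c, const char *s) {
    size_t len = strlen(s);
    if (len == 0) return 0;
    /* Use the static buffer only while nothing has spilled, to keep ordering. */
    if (c->overflow_len == 0 && len <= sizeof(c->buf) - c->bufpos) {
        memcpy(c->buf + c->bufpos, s, len);
        c->bufpos += len;
        return 0;
    }
    char *grown = realloc(c->overflow, c->overflow_len + len);
    if (grown == NULL) return -1;
    memcpy(grown + c->overflow_len, s, len);
    c->overflow = grown;
    c->overflow_len += len;
    return 0;
}

/* Queues three replies and reports how many bytes landed in each buffer. */
int toyReplyDemo(size_t *in_static, size_t *in_dynamic) {
    toyReplyClient c = {0};
    if (toyAddReply(&c, "+OK\r\n") != 0) return -1;         /* 5 bytes, fits */
    if (toyAddReply(&c, "$5\r\nhello\r\n") != 0) return -1; /* 11 bytes, fills buf */
    if (toyAddReply(&c, ":1\r\n") != 0) return -1;          /* 4 bytes, spills */
    *in_static = c.bufpos;
    *in_dynamic = c.overflow_len;
    free(c.overflow);
    return 0;
}
```

In this sketch, once anything has been placed in the dynamic buffer, later replies also go there, so the bytes reach the socket in the order they were queued.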

As you can see in the client structure above, arguments in a command are described as robj structures. The following is the full robj structure, which defines a Redis object:

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* lru time (relative to server.lruclock) */
    int refcount;
    void *ptr;
} robj;

Basically this structure can represent all the basic Redis data types like strings, lists, sets, sorted sets and so forth. The interesting thing is that it has a type field, so that it is possible to know what type a given object has, and a refcount, so that the same object can be referenced in multiple places without allocating it multiple times. Finally the ptr field points to the actual representation of the object, which might vary even for the same type, depending on the encoding used.

Redis objects are used extensively in the Redis internals, however in order to avoid the overhead of indirect accesses, recently in many places we just use plain dynamic strings not wrapped inside a Redis object.

server.c

This is the entry point of the Redis server, where the main() function is defined. The following are the most important steps in order to startup the Redis server.

initServerConfig() sets up the default values of the server structure.
initServer() allocates the data structures needed to operate, sets up the listening socket, and so forth.
aeMain() starts the event loop which listens for new connections.

There are two special functions called periodically by the event loop:

serverCron() is called periodically (according to server.hz frequency), and performs tasks that must be performed from time to time, like checking for timed out clients.
beforeSleep() is called every time the event loop has fired and Redis has served a few requests, just before it returns back into the event loop.
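
The relationship between the two hooks can be sketched with a simulated loop. This is a toy model under stated assumptions, not the ae.c implementation: the names toyLoop, toyServerCron, toyBeforeSleep and toyRunLoop are hypothetical, the clock is simulated rather than read from the system, and waiting on sockets is replaced by advancing that clock.

```c
#include <assert.h>

/* Hypothetical loop state; the real event loop lives in ae.c. */
typedef struct toyLoop {
    long long now_ms;       /* simulated clock */
    long long next_cron_ms; /* when the cron callback is next due */
    int hz;                 /* cron calls per simulated second */
    int cron_calls;
    int before_sleep_calls;
} toyLoop;

static void toyServerCron(toyLoop *l) { l->cron_calls++; }
static void toyBeforeSleep(toyLoop *l) { l->before_sleep_calls++; }

/* Runs `iterations` cycles, each one "sleeping" for step_ms of simulated time. */
toyLoop toyRunLoop(int hz, int iterations, int step_ms) {
    assert(hz > 0 && hz <= 1000); /* keeps the cron period at 1 ms or more */
    toyLoop l = { 0, 1000 / hz, hz, 0, 0 };
    for (int i = 0; i < iterations; i++) {
        toyBeforeSleep(&l);   /* once per iteration, before waiting */
        l.now_ms += step_ms;  /* stands in for blocking on sockets */
        while (l.now_ms >= l.next_cron_ms) { /* timer due: run the cron */
            toyServerCron(&l);
            l.next_cron_ms += 1000 / l.hz;
        }
    }
    return l;
}
```

With hz set to 10 and fifty iterations of 20 ms each (one simulated second), the before-sleep hook runs fifty times while the cron runs ten times, which is the point of the distinction: one hook is tied to loop iterations, the other to elapsed time.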

Inside server.c you can find code that handles other vital things of the Redis server:

call() is used in order to call a given command in the context of a given client.
activeExpireCycle() handles expiration of keys with a time to live set via the EXPIRE command.
performEvictions() is called when a new write command should be performed but Redis is out of memory according to the maxmemory directive.
The global variable redisCommandTable defines all the Redis commands, specifying the name of the command, the function implementing the command, the number of arguments required, and other properties of each command.

commands.c

This file is auto generated by utils/generate-command-code.py; its content is based on the JSON files in the src/commands folder. These are meant to be the single source of truth about the Redis commands, and all the metadata about them. These JSON files are not meant to be used directly by anyone; instead, that metadata can be obtained via the COMMAND command.

networking.c

This file defines all the I/O functions with clients, masters and replicas (which in Redis are just special clients):

createClient() allocates and initializes a new client.
The addReply*() family of functions is used by command implementations in order to append data to the client structure, which will be transmitted to the client as a reply for a given command executed.
writeToClient() transmits the data pending in the output buffers to the client and is called by the writable event handler sendReplyToClient().
readQueryFromClient() is the readable event handler and accumulates data read from the client into the query buffer.
processInputBuffer() is the entry point in order to parse the client query buffer according to the Redis protocol. Once commands are ready to be processed, it calls processCommand() which is defined inside server.c in order to actually execute the command.
freeClient() deallocates, disconnects and removes a client.
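
As a rough illustration of what parsing the query buffer involves, the following sketch parses one request in the Redis multibulk format (for example `*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n`) out of a byte buffer that may hold only part of a request. It is an assumption-laden toy, not the code in networking.c: toyQuery, toyReadLen, toyParseMultibulk and the fixed size limits are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ARGS 8
#define TOY_MAX_ARG_LEN 64

/* Hypothetical parsed request: a fixed-size argv for simplicity. */
typedef struct toyQuery {
    int argc;
    char argv[TOY_MAX_ARGS][TOY_MAX_ARG_LEN];
} toyQuery;

/* Reads "<prefix><digits>\r\n" at buf[*pos].
 * Returns 1 on success, 0 if more data is needed, -1 on malformed input. */
static int toyReadLen(const char *buf, size_t len, size_t *pos, char prefix, long *out) {
    size_t p = *pos, digits = 0;
    long v = 0;
    if (p >= len) return 0;
    if (buf[p] != prefix) return -1;
    p++;
    while (p < len && buf[p] >= '0' && buf[p] <= '9') {
        if (digits >= 9) return -1; /* cap the length to avoid overflow */
        v = v * 10 + (buf[p] - '0');
        p++;
        digits++;
    }
    if (p == len) return 0;                   /* ran out of data mid-number */
    if (digits == 0 || buf[p] != '\r') return -1;
    if (p + 1 == len) return 0;               /* "\r" seen, "\n" still pending */
    if (buf[p + 1] != '\n') return -1;
    *pos = p + 2;
    *out = v;
    return 1;
}

/* Returns bytes consumed (>0), 0 if the request is incomplete, -1 on error. */
long toyParseMultibulk(const char *buf, size_t len, toyQuery *q) {
    size_t pos = 0;
    long count = 0, blen = 0;
    int r = toyReadLen(buf, len, &pos, '*', &count);
    if (r <= 0) return r;
    if (count < 1 || count > TOY_MAX_ARGS) return -1;
    for (long i = 0; i < count; i++) {
        r = toyReadLen(buf, len, &pos, '$', &blen);
        if (r <= 0) return r;
        if (blen >= TOY_MAX_ARG_LEN) return -1;
        if (len - pos < (size_t)blen + 2) return 0; /* payload + CRLF not here yet */
        if (buf[pos + blen] != '\r' || buf[pos + blen + 1] != '\n') return -1;
        memcpy(q->argv[i], buf + pos, (size_t)blen);
        q->argv[i][blen] = '\0';
        pos += (size_t)blen + 2;
    }
    q->argc = (int)count;
    return (long)pos;
}
```

Returning 0 for an incomplete request mirrors the idea that the readable handler keeps accumulating bytes and the parser is simply tried again when more data arrives.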

aof.c and rdb.c

As you can guess from the names, these files implement the RDB and AOF persistence for Redis. Redis uses a persistence model based on the fork() system call in order to create a process with the same (shared) memory content of the main Redis process. This secondary process dumps the content of the memory on disk. This is used by rdb.c to create the snapshots on disk and by aof.c in order to perform the AOF rewrite when the append only file gets too big.
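
The property that makes fork() useful here is that the child keeps seeing memory as it was at the moment of the fork, even if the parent goes on modifying its own copy. A minimal POSIX sketch of just that property follows; toySnapshotSeenByChild is a hypothetical helper, the "dataset" is a single int, and the pipes exist only to make the ordering deterministic for the demonstration. It says nothing about how rdb.c or aof.c actually write files.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, changes the parent's copy of the data, then asks the child what it
 * sees. Returns the value reported by the child, or -1 on error. */
int toySnapshotSeenByChild(int before, int after) {
    int dataset = before;
    int go[2], result[2];
    if (pipe(go) == -1) return -1;
    if (pipe(result) == -1) { close(go[0]); close(go[1]); return -1; }

    pid_t pid = fork();
    if (pid == -1) {
        close(go[0]); close(go[1]); close(result[0]); close(result[1]);
        return -1;
    }
    if (pid == 0) { /* child: plays the role of the saving process */
        char c;
        close(go[1]);
        close(result[0]);
        /* Wait until the parent has modified its own copy. */
        if (read(go[0], &c, 1) != 1) _exit(1);
        /* "Dump" the dataset: the child still holds the fork-time value. */
        if (write(result[1], &dataset, sizeof(dataset)) != (ssize_t)sizeof(dataset))
            _exit(1);
        _exit(0);
    }

    /* parent: keeps serving writes while the child saves */
    close(go[0]);
    close(result[1]);
    dataset = after;
    (void)dataset; /* the parent's new value is not used further in this demo */
    int seen = -1, status = 0;
    int ok = write(go[1], "x", 1) == 1 &&
             read(result[0], &seen, sizeof(seen)) == (ssize_t)sizeof(seen);
    close(go[1]);
    close(result[0]);
    waitpid(pid, &status, 0);
    return ok ? seen : -1;
}
```

Calling toySnapshotSeenByChild(1, 2) is expected to return 1: the parent's write of 2 happens before the child reads, yet the child reports the value from the time of the fork.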

The implementation inside aof.c has additional functions in order to implement an API that allows commands to append new commands into the AOF file as clients execute them.

The call() function defined inside server.c is responsible for calling the functions that in turn will write the commands into the AOF.

db.c

Certain Redis commands operate on specific data types; others are general. Examples of generic commands are DEL and EXPIRE. They operate on keys and not on their values specifically. All those generic commands are defined inside db.c.

Moreover db.c implements an API in order to perform certain operations on the Redis dataset without directly accessing the internal data structures.

The most important functions inside db.c which are used in many command implementations are the following:

lookupKeyRead() and lookupKeyWrite() are used in order to get a pointer to the value associated to a given key, or NULL if the key does not exist.
dbAdd() and its higher level counterpart setKey() create a new key in a Redis database.
dbDelete() removes a key and its associated value.
emptyDb() removes an entire single database or all the databases defined.

The rest of the file implements the generic commands exposed to the client.

object.c

The robj structure defining Redis objects was already described. Inside object.c there are all the functions that operate with Redis objects at a basic level, like functions to allocate new objects, handle the reference counting and so forth. Notable functions inside this file:

incrRefCount() and decrRefCount() are used in order to increment or decrement an object reference count. When it drops to 0 the object is finally freed.
createObject() allocates a new object. There are also specialized functions to allocate string objects having a specific content, like createStringObjectFromLongLong() and similar functions.

This file also implements the OBJECT command.
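
Reference counting as described above can be sketched in a few lines. The types and functions below (toyObject, toyCreateString, toyIncrRefCount, toyDecrRefCount) are hypothetical simplifications written for this example and are not the real object.c code; in particular the live-object counter exists only so the demo can observe when the object is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy object, loosely modelled on robj. */
typedef struct toyObject {
    int refcount; /* number of owners currently holding this object */
    char *ptr;    /* heap-allocated payload */
} toyObject;

static int toy_live_objects = 0; /* objects not yet freed, for the demo only */

toyObject *toyCreateString(const char *s) {
    toyObject *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    size_t len = strlen(s) + 1;
    o->ptr = malloc(len);
    if (o->ptr == NULL) { free(o); return NULL; }
    memcpy(o->ptr, s, len);
    o->refcount = 1; /* the creator is the first owner */
    toy_live_objects++;
    return o;
}

void toyIncrRefCount(toyObject *o) { o->refcount++; }

void toyDecrRefCount(toyObject *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) { /* last owner gone: release the memory */
        free(o->ptr);
        free(o);
        toy_live_objects--;
    }
}

/* Create, share, release twice; returns the number of live objects left,
 * or -1 if the object was not alive when it should have been. */
int toyRefcountDemo(void) {
    toyObject *o = toyCreateString("hello");
    if (o == NULL) return -1;
    toyIncrRefCount(o); /* a second owner shares the same allocation */
    toyDecrRefCount(o); /* first owner done: 2 -> 1, still alive */
    int alive_mid = toy_live_objects;
    toyDecrRefCount(o); /* last owner done: 1 -> 0, freed */
    return alive_mid == 1 ? toy_live_objects : -1;
}
```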

replication.c

This is one of the most complex files inside Redis; it is recommended to approach it only after getting a bit familiar with the rest of the code base. This file contains the implementation of both the master and replica roles of Redis.

One of the most important functions inside this file is replicationFeedSlaves() that writes commands to the clients representing replica instances connected to our master, so that the replicas can get the writes performed by the clients: this way their data set will remain synchronized with the one in the master.

This file also implements both the SYNC and PSYNC commands that are used in order to perform the first synchronization between masters and replicas, or to continue the replication after a disconnection.

Script

The script unit is composed of the following units:

script.c - integration of scripts with Redis (command execution, setting replication/RESP, ...)
script_lua.c - responsible for executing Lua code, uses script.c to interact with Redis from within the Lua code.
function_lua.c - contains the Lua engine implementation, uses script_lua.c to execute the Lua code.
functions.c - contains the Redis Functions implementation (FUNCTION command), uses function_lua.c if the function it wants to invoke needs the Lua engine.
eval.c - contains the eval implementation using script_lua.c to invoke the Lua code.

Other C files

t_hash.c, t_list.c, t_set.c, t_string.c, t_zset.c and t_stream.c contain the implementation of the Redis data types. They implement both an API to access a given data type, and the client command implementations for these data types.
ae.c implements the Redis event loop; it's a self-contained library which is simple to read and understand.
sds.c is the Redis string library; check https://github.com/antirez/sds for more information.
anet.c is a library to use POSIX networking in a simpler way compared to the raw interface exposed by the kernel.
dict.c is an implementation of a non-blocking hash table which rehashes incrementally.
cluster.c implements the Redis Cluster. Probably a good read only after being very familiar with the rest of the Redis code base. If you want to read cluster.c make sure to read the Redis Cluster specification.
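
To give a feel for what "rehashes incrementally" means for dict.c, here is a small sketch of the general technique: keep two tables, and on every operation move one bucket from the old table to the new one, so that no single call pays for the whole resize. It is a toy written for this README under simplifying assumptions (int keys, a trivial hash, growth at load factor 1, hypothetical names such as toyDict and toyRehashStep) and does not reproduce the real dict.c API or its policies.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toyEntry { int key, val; struct toyEntry *next; } toyEntry;

typedef struct toyDict {
    toyEntry **table[2]; /* [0] is the main table, [1] the rehash target */
    size_t size[2], used[2];
    long rehashidx;      /* next bucket of table[0] to migrate, -1 when idle */
} toyDict;

static size_t toyBucket(int key, size_t size) { return (unsigned)key % size; }

toyDict *toyDictCreate(void) {
    toyDict *d = calloc(1, sizeof(*d));
    if (!d) return NULL;
    d->size[0] = 4;
    d->rehashidx = -1;
    d->table[0] = calloc(d->size[0], sizeof(toyEntry *));
    if (!d->table[0]) { free(d); return NULL; }
    return d;
}

/* If rehashing, migrate one non-empty bucket from table[0] to table[1]. */
static void toyRehashStep(toyDict *d) {
    if (d->rehashidx < 0) return;
    assert(d->used[0] > 0); /* otherwise the rehash would already be over */
    while (d->table[0][d->rehashidx] == NULL) d->rehashidx++; /* skip empty */
    toyEntry *e = d->table[0][d->rehashidx];
    d->table[0][d->rehashidx++] = NULL;
    while (e) {
        toyEntry *next = e->next;
        size_t b = toyBucket(e->key, d->size[1]);
        e->next = d->table[1][b];
        d->table[1][b] = e;
        d->used[0]--; d->used[1]++;
        e = next;
    }
    if (d->used[0] == 0) { /* finished: table[1] becomes the main table */
        free(d->table[0]);
        d->table[0] = d->table[1]; d->size[0] = d->size[1]; d->used[0] = d->used[1];
        d->table[1] = NULL; d->size[1] = d->used[1] = 0;
        d->rehashidx = -1;
    }
}

/* Lookups must check both tables while a rehash is in progress. */
static toyEntry *toyFind(toyDict *d, int key) {
    for (int t = 0; t < 2; t++) {
        if (!d->table[t]) continue;
        for (toyEntry *e = d->table[t][toyBucket(key, d->size[t])]; e; e = e->next)
            if (e->key == key) return e;
    }
    return NULL;
}

int toyDictSet(toyDict *d, int key, int val) {
    toyRehashStep(d);
    toyEntry *e = toyFind(d, key);
    if (e) { e->val = val; return 0; }
    if (d->rehashidx < 0 && d->used[0] >= d->size[0]) { /* full: start growing */
        toyEntry **bigger = calloc(d->size[0] * 2, sizeof(toyEntry *));
        if (bigger) { d->table[1] = bigger; d->size[1] = d->size[0] * 2; d->rehashidx = 0; }
    }
    int t = d->rehashidx >= 0 ? 1 : 0; /* while rehashing, new keys go to table[1] */
    if (!(e = malloc(sizeof(*e)))) return -1;
    size_t b = toyBucket(key, d->size[t]);
    e->key = key; e->val = val; e->next = d->table[t][b];
    d->table[t][b] = e;
    d->used[t]++;
    return 0;
}

int toyDictGet(toyDict *d, int key, int *val) {
    toyRehashStep(d);
    toyEntry *e = toyFind(d, key);
    if (e) *val = e->val;
    return e != NULL;
}

void toyDictFree(toyDict *d) {
    for (int t = 0; t < 2; t++) {
        if (!d->table[t]) continue;
        for (size_t b = 0; b < d->size[t]; b++)
            for (toyEntry *e = d->table[t][b], *n; e; e = n) { n = e->next; free(e); }
        free(d->table[t]);
    }
    free(d);
}
```

A caller would create the dictionary with toyDictCreate(), insert with toyDictSet(), read with toyDictGet() (which returns 1 when the key exists) and release everything with toyDictFree(). The design choice worth noticing is that both reads and writes contribute a small, bounded amount of migration work.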

Anatomy of a Redis command

All the Redis commands are defined in the following way:

void foobarCommand(client *c) {
    printf("%s",(char*)c->argv[1]->ptr); /* Do something with the argument. */
    addReply(c,shared.ok); /* Reply something to the client. */
}

The command is then referenced inside server.c in the command table:

{"foobar",foobarCommand,2,"rtF",0,NULL,0,0,0,0,0},

In the above example 2 is the number of arguments the command takes, while "rtF" are the command flags, as documented in the command table top comment inside server.c.

After the command operates in some way, it returns a reply to the client, usually using addReply() or a similar function defined inside networking.c.
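
The idea of a table that maps a command name to its implementation and argument count can be shown with a standalone toy. None of the names below exist in Redis (toyClient, toyCommand, toyCommandTable, toyDispatch are hypothetical), the table has only three fields instead of the full entry shown above, and replies are plain C strings rather than calls to addReply().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical, heavily reduced client: just arguments and one reply. */
typedef struct toyClient {
    int argc;
    const char **argv;
    const char *reply;
} toyClient;

typedef void toyCommandProc(toyClient *c);

typedef struct toyCommand {
    const char *name;
    toyCommandProc *proc;
    int arity; /* expected argc, counting the command name itself */
} toyCommand;

static void toyPingCommand(toyClient *c) { c->reply = "+PONG"; }
static void toyEchoCommand(toyClient *c) { c->reply = c->argv[1]; }

static const toyCommand toyCommandTable[] = {
    {"ping", toyPingCommand, 1},
    {"echo", toyEchoCommand, 2},
};

/* Looks the command up by name, checks the arity, then calls it. */
const char *toyDispatch(toyClient *c) {
    size_t n = sizeof(toyCommandTable) / sizeof(toyCommandTable[0]);
    if (c->argc < 1) return "-ERR empty command";
    for (size_t i = 0; i < n; i++) {
        const toyCommand *cmd = &toyCommandTable[i];
        if (strcasecmp(c->argv[0], cmd->name) != 0) continue;
        if (c->argc != cmd->arity) return "-ERR wrong number of arguments";
        cmd->proc(c);
        return c->reply;
    }
    return "-ERR unknown command";
}

/* Convenience wrapper: build a client around argv and dispatch it. */
const char *toyRun(int argc, const char **argv) {
    toyClient c = { argc, argv, NULL };
    return toyDispatch(&c);
}
```

Checking the argument count in the dispatcher, before the command function runs, is what lets an implementation such as the echo handler read argv[1] without validating it again.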

There are tons of command implementations inside the Redis source code that can serve as examples of actual command implementations. Writing a few toy commands can be a good exercise to get familiar with the code base.

There are also many other files not described here, but it is useless to cover everything. We just want to help you with the first steps. Eventually you'll find your way inside the Redis code base :-)

Enjoy!