redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 08:38:27 -05:00

Author	SHA1	Message	Date
antirez	c1c99e9f4e	PSYNC2: Fix the way replication info is saved/loaded from RDB. This commit attempts to fix a number of bugs reported in #4316. They are related to the way replication info like replication ID, offsets, and currently selected DB in the master client, are stored and loaded by Redis. In order to avoid inconsistencies the changes in this commit try to enforce that: 1. Replication information are only stored when the RDB file is generated by a slave that has a valid 'master' client, so that we can always extract the currently selected DB. 2. When replication informations are persisted in the RDB file, all the info for a successful PSYNC or nothing is persisted. 3. The RDB replication informations are only loaded if the instance is configured as a slave, otherwise a master can start with IDs that relate to a different history of the data set, and stil retain such IDs in the future while receiving unrelated writes.	2017-09-19 23:03:39 +02:00
antirez	365dd037dc	RDB modules values serialization format version 2. The original RDB serialization format was not parsable without the module loaded, becuase the structure was managed only by the module itself. Moreover RDB is a streaming protocol in the sense that it is both produce di an append-only fashion, and is also sometimes directly sent to the socket (in the case of diskless replication). The fact that modules values cannot be parsed without the relevant module loaded is a problem in many ways: RDB checking tools must have loaded modules even for doing things not involving the value at all, like splitting an RDB into N RDBs by key or alike, or just checking the RDB for sanity. In theory module values could be just a blob of data with a prefixed length in order for us to be able to skip it. However prefixing the values with a length would mean one of the following: 1. To be able to write some data at a previous offset. This breaks stremaing. 2. To bufferize values before outputting them. This breaks performances. 3. To have some chunked RDB output format. This breaks simplicity. Moreover, the above solution, still makes module values a totally opaque matter, with the fowllowing problems: 1. The RDB check tool can just skip the value without being able to at least check the general structure. For datasets composed mostly of modules values this means to just check the outer level of the RDB not actually doing any checko on most of the data itself. 2. It is not possible to do any recovering or processing of data for which a module no longer exists in the future, or is unknown. So this commit implements a different solution. The modules RDB serialization API is composed if well defined calls to store integers, floats, doubles or strings. After this commit, the parts generated by the module API have a one-byte prefix for each of the above emitted parts, and there is a final EOF byte as well. So even if we don't know exactly how to interpret a module value, we can always parse it at an high level, check the overall structure, understand the types used to store the information, and easily skip the whole value. The change is backward compatible: older RDB files can be still loaded since the new encoding has a new RDB type: MODULE_2 (of value 7). The commit also implements the ability to check RDB files for sanity taking advantage of the new feature.	2017-06-27 13:19:16 +02:00
antirez	2669fb8364	PSYNC2: different improvements to Redis replication. The gist of the changes is that now, partial resynchronizations between slaves and masters (without the need of a full resync with RDB transfer and so forth), work in a number of cases when it was impossible in the past. For instance: 1. When a slave is promoted to mastrer, the slaves of the old master can partially resynchronize with the new master. 2. Chained slalves (slaves of slaves) can be moved to replicate to other slaves or the master itsef, without requiring a full resync. 3. The master itself, after being turned into a slave, is able to partially resynchronize with the new master, when it joins replication again. In order to obtain this, the following main changes were operated: * Slaves also take a replication backlog, not just masters. * Same stream replication for all the slaves and sub slaves. The replication stream is identical from the top level master to its slaves and is also the same from the slaves to their sub-slaves and so forth. This means that if a slave is later promoted to master, it has the same replication backlong, and can partially resynchronize with its slaves (that were previously slaves of the old master). * A given replication history is no longer identified by the `runid` of a Redis node. There is instead a `replication ID` which changes every time the instance has a new history no longer coherent with the past one. So, for example, slaves publish the same replication history of their master, however when they are turned into masters, they publish a new replication ID, but still remember the old ID, so that they are able to partially resynchronize with slaves of the old master (up to a given offset). * The replication protocol was slightly modified so that a new extended +CONTINUE reply from the master is able to inform the slave of a replication ID change. * REPLCONF CAPA is used in order to notify masters that a slave is able to understand the new +CONTINUE reply. * The RDB file was extended with an auxiliary field that is able to select a given DB after loading in the slave, so that the slave can continue receiving the replication stream from the point it was disconnected without requiring the master to insert "SELECT" statements. This is useful in order to guarantee the "same stream" property, because the slave must be able to accumulate an identical backlog. * Slave pings to sub-slaves are now sent in a special form, when the top-level master is disconnected, in order to don't interfer with the replication stream. We just use out of band "\n" bytes as in other parts of the Redis protocol. An old design document is available here: https://gist.github.com/antirez/ae068f95c0d084891305 However the implementation is not identical to the description because during the work to implement it, different changes were needed in order to make things working well.	2016-11-09 15:37:15 +01:00
antirez	3dc84c5300	Modules: API to save/load single precision floating point numbers. When double precision is not needed, to take 2x space in the serialization is not good.	2016-10-03 00:08:35 +02:00
antirez	543e25efa6	RDB AOF preamble: WIP 4 (Mixed RDB/AOF loading).	2016-08-11 15:42:28 +02:00
antirez	4426cb11e2	RDB AOF preamble: WIP 1.	2016-08-09 11:07:32 +02:00
antirez	8ec28002be	Modules: support for modules native data types.	2016-06-03 18:14:04 +02:00
antirez	27e5f385c1	RDB v8: fix rdbLoadLen() return value.	2016-06-01 20:18:28 +02:00
antirez	e6554bed92	RDB v8: new ZSET storage format with binary doubles.	2016-06-01 12:12:26 +02:00
antirez	4aae4f7d35	RDB v8: ability to save uint64_t lengths.	2016-06-01 11:35:47 +02:00
antirez	32f80e2f1b	RDMF: More consistent define names.	2015-07-27 14:37:58 +02:00
antirez	cef054e868	RDMF (Redis/Disque merge friendlyness) refactoring WIP 1.	2015-07-26 15:17:18 +02:00
Matt Stancliff	f704360462	Improve RDB type correctness It's possible large objects could be larger than 'int', so let's upgrade all size counters to ssize_t. This also fixes rdbSaveObject serialized bytes calculation. Since entire serializations of data structures can be large, so we don't want to limit their calculated size to a 32 bit signed max. This commit increases object size calculation and cascades the change back up to serializedlength printing. Before: 127.0.0.1:6379> debug object hihihi ... encoding:quicklist serializedlength:-2147483559 ... After: 127.0.0.1:6379> debug object hihihi ... encoding:quicklist serializedlength:2147483737 ...	2015-01-19 14:10:12 -05:00
antirez	206cd219b6	RDB AUX fields support. This commit introduces a new RDB data type called 'aux'. It is used in order to insert inside an RDB file key-value pairs that may serve different needs, without breaking backward compatibility when new informations are embedded inside an RDB file. The contract between Redis versions is to ignore unknown aux fields when encountered. Aux fields can be used in order to: 1. Augment the RDB file with info like version of Redis that created the RDB file, creation time, used memory while the RDB was created, and so forth. 2. Add state about Redis inside the RDB file that we need to reload later: replication offset, previos master run ID, in order to improve failovers safety and allow partial resynchronization after a slave restart. 3. Anything that we may want to add to RDB files without breaking the ability of past versions of Redis to load the file.	2015-01-08 09:52:55 +01:00
antirez	e8614a1a77	New RDB v7 opcode: RESIZEDB. The new opcode is an hint about the size of the dataset (keys and number of expires) we are going to load for a given Redis database inside the RDB file. Since hash tables are resized accordingly ASAP, useless rehashing is avoided, speeding up load times significantly, in the order of ~ 20% or more for larger data sets. Related issue: #1719	2015-01-08 09:52:47 +01:00
Matt Stancliff	101b3a6e42	Convert quicklist RDB to store ziplist nodes Turns out it's a huge improvement during save/reload/migrate/restore because, with compression enabled, we're compressing 4k or 8k chunks of data consisting of multiple elements in one ziplist instead of compressing series of smaller individual elements.	2015-01-02 11:16:09 -05:00
antirez	75f0cd6520	Diskless replication: RDB -> slaves transfer draft implementation.	2014-10-14 10:11:29 +02:00
guiquanz	9d09ce3981	Fixed many typos.	2013-01-19 10:59:44 +01:00
antirez	4365e5b2d3	BSD license added to every C source and header file.	2012-11-08 18:31:32 +01:00
antirez	b16e423430	No longer used macro rdbIsOpcode() removed.	2012-10-30 19:10:46 +01:00
Alex Mitrofanov	51857c7e5c	Fixed RESTORE hash failure (Issue #532 ) (additional commit notes by antirez@gmail.com): The rdbIsObjectType() macro was not updated when the new RDB object type of ziplist encoded hashes was added. As a result RESTORE, that uses rdbLoadObjectType(), failed when a ziplist encoded hash was loaded. This does not affected normal RDB loading because in that case we use the lower-level function rdbLoadType(). The commit also adds a regression test.	2012-06-02 10:24:27 +02:00
Grisha Trubetskoy	5a86ab4799	Add a 24bit integer to ziplists to save one byte for ints that can fit in 24 bits (thanks to antirez for catching and solving the two's compliment bug). Increment REDIS_RDB_VERSION to 6	2012-04-24 12:02:19 +02:00
antirez	82e32055d8	RDB files now embed a crc64 checksum. Version of RDB bumped to 5.	2012-04-09 22:40:41 +02:00
antirez	d0ace5a314	Write RDB magic using a REDIS_RDB_VERSION define that is defined inside rdb.h	2012-03-31 17:08:40 +02:00
Pieter Noordhuis	ebd85e9a45	Encode small hashes with a ziplist	2012-01-02 22:14:10 -08:00
antirez	dab5332f95	Fixed a few typos	2011-11-09 21:59:27 +01:00
antirez	7dcc10b65e	Initial support for key expire times with millisecond resolution. RDB version is now 3, new opcoded added for high resolution times. Redis is still able to correctly load RDB version 2. Tests passing but still a work in progress. API to specify milliseconds expires still missing, but the precision of normal expires is now already improved and working.	2011-11-09 16:51:19 +01:00
Pieter Noordhuis	221782ccc6	Move rdbLoad* to top; update comments	2011-05-13 23:24:19 +02:00
Pieter Noordhuis	f1d8e4968e	Make RDB types/opcodes explicit; load/save object type	2011-05-13 22:14:39 +02:00
Pieter Noordhuis	2e4b0e7727	Abstract file/buffer I/O to support in-memory serialization	2011-05-13 17:31:00 +02:00

30 Commits