redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-23 16:48:27 -05:00

Author	SHA1	Message	Date
antirez	db7a5f23b4	CG: RDB saving part 2, consumers.	2018-03-15 12:54:10 +01:00
antirez	8fb6048ed0	CG: RDB saving part 1, metadata and PEL.	2018-03-15 12:54:10 +01:00
charsyam	76386c48b8	refactoring-make-condition-clear-for-rdb	2018-02-27 21:55:20 +09:00
Salvatore Sanfilippo	d8830200b4	Merge pull request #3828 from oranagra/sdsnewlen_pr add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.	2018-02-27 04:04:32 -08:00
Oran Agra	60a4f12f8b	fix processing of large bulks (above 2GB) - protocol parsing (processMultibulkBuffer) was limited to 32bit positions in the buffer; readQueryFromClient potential overflow - rioWriteBulkCount used int, although rioWriteBulkString gave it size_t - several places in sds.c that used int for string length or index. - bugfix in RM_SaveAuxField (return was 1 or -1 and not length) - RM_SaveStringBuffer was limited to 32bit length	2017-12-29 12:24:19 +02:00
antirez	60d26acfc8	Refactoring: improve luaCreateFunction() API. The function in its initial form, and after the fixes for the PSYNC2 bugs, required code duplication in multiple spots. This commit modifies it in order to always compute the script name independently, and to return the SDS of the SHA of the body: this way it can be used in all the places, including for SCRIPT LOAD, without duplicating the code to create the Lua function name. Note that this requires to re-compute the body SHA1 in the case of EVAL seeing a script for the first time, but this should not change scripting performance in any way because new script definition is a rare event happening the first time a script is seen, and the SHA1 computation is anyway not a very slow process against the typical Redis script and compared to the actual Lua byte compiling of the body. Note that the function used to assert() if a duplicated script was loaded, however actually now two times over three, we want the function to handle duplicated scripts just fine: this happens in SCRIPT LOAD and in RDB AUX "lua" loading. Moreover the assert was not defending against some obvious failure mode, so now the function always tests against already defined functions at start.	2017-12-04 11:25:20 +01:00
antirez	65a9740fa8	Fix loading of RDB files lua AUX fields when the script is defined. In the case of slaves loading the RDB from master, or in other similar cases, the script is already defined, and the function registering the script should not fail in the assert() call.	2017-12-01 16:01:10 +01:00
antirez	f24d3a7de0	Streams: delta encode IDs based on key. Add count + deleted fields. We used to have the master ID stored at the start of the listpack, however using the key directly makes more sense in order to create a space efficient representation: anyway the key at the radix tree is very unlikely to change because of how the stream is implemented. Moreover on nodes merging, to rewrite the merged listpacks is anyway the most sensible operation, and we can use the iterator and the append-to-stream function in order to avoid re-implementing the code needed for merging. This commit also adds two items at the start of the listpack: the number of valid items inside the listpack, and the number of items marked as deleted. This means that there is no need to scan a listpack in order to understand if it's a good candidate for garbage collection, if the ratio between valid/deleted items triggers the GC.	2017-12-01 10:24:24 +01:00
antirez	98d184db12	Streams: Save stream->length in RDB.	2017-12-01 10:24:24 +01:00
antirez	edd70c1993	Streams: RDB loading. RDB saving modified. After a few attempts it looked quite saner to just add the last item ID at the end of the serialized listpacks, instead of scanning the last listpack loaded from head to tail just to fetch it. It's a disk space VS CPU-and-simplicity tradeoff basically.	2017-12-01 10:24:24 +01:00
antirez	485014cc74	Streams: RDB saving.	2017-12-01 10:24:24 +01:00
antirez	452ad2e928	PSYNC2: just store script bodies into RDB. Related to #4483. As suggested by @soloestoy, we can retrieve the SHA1 from the body. Given that in the new implementation using AUX fields we ended copying around a lot to create new objects and strings, extremize such concept and trade CPU for space inside the RDB file.	2017-11-30 18:38:26 +01:00
antirez	f11a7585a8	PSYNC2: Save Lua scripts state into RDB file. This is currently needed in order to fix #4483, but this can be useful in other contexts, so maybe later we may want to remove the conditionals and always save/load scripts. Note that we are using the "lua" AUX field here, in order to guarantee backward compatibility of the RDB file. The unknown AUX fields must be discarded by past versions of Redis.	2017-11-30 18:37:52 +01:00
antirez	4d063bb6ba	PSYNC2: reorganize comments related to recent fixes. Related to PR #4412 and issue #4407.	2017-11-24 11:08:29 +01:00
Salvatore Sanfilippo	9d86ae4597	Merge pull request #4412 from soloestoy/bugfix-psync2 PSYNC2: safe free backlog when reach the time limit and others	2017-11-24 10:56:18 +01:00
zhaozhao.zz	ea2e51c630	PSYNC2: persist cached_master's dbid inside the RDB	2017-11-22 12:11:26 +08:00
zhaozhao.zz	93037f7642	PSYNC2: make repl_stream_db never be -1. It means that after this change all the replication info in RDB is valid, and it can distinguish us from the older version.	2017-11-22 12:05:34 +08:00
antirez	a1944c3e4d	Fix saving of zero-length lists. Normally in modern Redis you can't create zero-len lists, however it's possible to load them from old RDB files generated, for instance, using Redis 2.8 (see issue #4409). The "Right Thing" would be not loading such lists at all, but this requires to hook in rdb.c random places in a not great way, for a problem that is at this point, at best, minor. Here in this commit instead I just fix the fact that zero length lists, materialized as quicklists with the first node set to NULL, were iterated in the wrong way while they are saved, leading to a crash. The other parts of the list implementation are apparently able to deal with empty lists correctly, even if they are no longer a thing.	2017-11-06 12:37:03 +01:00
zhaozhao.zz	b8579c225c	PSYNC2: clarify the scenario when repl_stream_db can be -1	2017-11-02 10:45:33 +08:00
zhaozhao.zz	885c4f856e	PSYNC2 & RDB: fix the missing rdbSaveInfo for BGSAVE	2017-11-01 17:52:43 +08:00
antirez	bb3b5ddd19	PSYNC2: More refinements related to #4316 .	2017-09-20 11:28:13 +02:00
zhaozhao.zz	b541ccef25	PSYNC2: make persisting replication info more solid This commit is a reinforcement of commit `c1c99e9`. 1. Replication information can be stored when the RDB file is generated by a master using server.slaveseldb when server.repl_backlog is not NULL, or set repl_stream_db be -1. That's safe, because NULL server.repl_backlog will trigger full synchronization, then master will send SELECT command to replication stream. 2. Only do rdbSave* when rsiptr is not NULL, if we do rdbSave* without rdbSaveInfo, slave will miss repl-stream-db. 3. Save the replication information also in the case of SAVE command, FLUSHALL command and DEBUG reload.	2017-09-20 11:18:10 +02:00
antirez	c1c99e9f4e	PSYNC2: Fix the way replication info is saved/loaded from RDB. This commit attempts to fix a number of bugs reported in #4316. They are related to the way replication info like replication ID, offsets, and currently selected DB in the master client, are stored and loaded by Redis. In order to avoid inconsistencies the changes in this commit try to enforce that: 1. Replication information is only stored when the RDB file is generated by a slave that has a valid 'master' client, so that we can always extract the currently selected DB. 2. When replication information is persisted in the RDB file, all the info for a successful PSYNC or nothing is persisted. 3. The RDB replication information is only loaded if the instance is configured as a slave, otherwise a master can start with IDs that relate to a different history of the data set, and still retain such IDs in the future while receiving unrelated writes.	2017-09-19 23:03:39 +02:00
antirez	fc7ecd8d35	AOF check utility: ability to check files with RDB preamble.	2017-07-10 13:38:23 +02:00
antirez	2b36950e9b	Free IO context if any in RDB loading code. Thanks to @oranagra for spotting this bug.	2017-07-06 11:20:49 +02:00
antirez	365dd037dc	RDB modules values serialization format version 2. The original RDB serialization format was not parsable without the module loaded, because the structure was managed only by the module itself. Moreover RDB is a streaming protocol in the sense that it is both produced in an append-only fashion, and is also sometimes directly sent to the socket (in the case of diskless replication). The fact that modules values cannot be parsed without the relevant module loaded is a problem in many ways: RDB checking tools must have loaded modules even for doing things not involving the value at all, like splitting an RDB into N RDBs by key or alike, or just checking the RDB for sanity. In theory module values could be just a blob of data with a prefixed length in order for us to be able to skip it. However prefixing the values with a length would mean one of the following: 1. To be able to write some data at a previous offset. This breaks streaming. 2. To bufferize values before outputting them. This breaks performance. 3. To have some chunked RDB output format. This breaks simplicity. Moreover, the above solution still makes module values a totally opaque matter, with the following problems: 1. The RDB check tool can just skip the value without being able to at least check the general structure. For datasets composed mostly of modules values this means to just check the outer level of the RDB not actually doing any check on most of the data itself. 2. It is not possible to do any recovering or processing of data for which a module no longer exists in the future, or is unknown. So this commit implements a different solution. The modules RDB serialization API is composed of well-defined calls to store integers, floats, doubles or strings. After this commit, the parts generated by the module API have a one-byte prefix for each of the above emitted parts, and there is a final EOF byte as well. 
So even if we don't know exactly how to interpret a module value, we can always parse it at a high level, check the overall structure, understand the types used to store the information, and easily skip the whole value. The change is backward compatible: older RDB files can be still loaded since the new encoding has a new RDB type: MODULE_2 (of value 7). The commit also implements the ability to check RDB files for sanity taking advantage of the new feature.	2017-06-27 13:19:16 +02:00
antirez	e498d9ee3e	Collect fork() timing info only if fork succeeded.	2017-05-19 11:10:36 +02:00
antirez	c33493277a	Clarify why we save ziplist elements in reverse order. Also get rid of variables that are now kinda redundant, since the dictionary iterator was removed. This is related to PR #3949.	2017-04-18 11:01:47 +02:00
spinlock	23ec36909e	rdb: saving skiplist in reversed order to accelerate the deserialisation process	2017-04-17 13:22:34 +08:00
oranagra	f86df924b0	add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets. this commit also contains a small bugfix in rdbLoadLzfStringObject, a bug that currently has no implications.	2017-02-23 03:04:08 -08:00
antirez	04542cff92	Replication: fix the infamous key leakage of writable slaves + EXPIRE. BACKGROUND AND USE CASE Redis slaves are normally read only, however they support a "writable" mode which is very handy when scaling reads on slaves, that actually need write operations in order to access data. For instance imagine having slaves replicating certain Sets keys from the master. When accessing the data on the slave, we want to perform intersections between such Sets values. However we don't want to intersect each time: to cache the intersection for some time often is a good idea. To do so, it is possible to set up a slave as a writable slave, and perform the intersection on the slave side, perhaps setting a TTL on the resulting key so that it will expire after some time. THE BUG Problem: in order to have a consistent replication, expiring of keys in Redis replication is up to the master, which synthesizes DEL operations to send in the replication stream. However slaves logically expire keys by hiding them from read attempts from clients so that if the master did not promptly send a DEL, clients still see logically expired keys as non existing. Because slaves don't actively expire keys by actually evicting them but just masking from the POV of read operations, if a key is created in a writable slave, and an expire is set, the key will be leaked forever: 1. No DEL will be received from the master, which does not know about such a key at all. 2. No eviction will be performed by the slave, since it needs to disable eviction because it's up to masters, otherwise consistency of data is lost. THE FIX In order to fix the problem, the slave should be able to tag keys that were created in the slave side and have an expire set in some way. My solution involved using a unique additional dictionary created by the writable slave only if needed. 
The dictionary is obviously keyed by the key name that we need to track: all the keys that are set with an expire directly by a client writing to the slave are tracked. The value in the dictionary is a bitmap of all the DBs where such a key name needs to be tracked, so that we can use a single dictionary to track keys in all the DBs used by the slave (actually this limits the solution to the first 64 DBs, but the default with Redis is to use 16 DBs). This solution incurs only a small complexity and CPU penalty, which is actually zero when the feature is not used. The slave-side eviction is encapsulated in code which is not coupled with the rest of the Redis core, if not for the hook to track the keys. TODO I'm doing the first smoke tests to see if the feature works as expected: so far so good. Unit tests should be added before merging into the 4.0 branch.	2016-12-13 10:59:54 +01:00
Chris Lamb	6eb0c52d4c	src/rdb.c: Correct "whenver" -> "whenever" typo.	2016-12-01 13:16:30 +01:00
antirez	28c96d73b2	PSYNC2: Save replication ID/offset on RDB file. This means that stopping a slave and restarting it will still make it able to PSYNC with the master. Moreover the master itself will retain its ID/offset, in case it gets turned into a slave, or if a slave will try to PSYNC with it with an exactly updated offset (otherwise there is no backlog). This change was possible thanks to PSYNC v2 that makes saving the current replication state much simpler.	2016-11-10 12:35:29 +01:00
antirez	2669fb8364	PSYNC2: different improvements to Redis replication. The gist of the changes is that now, partial resynchronizations between slaves and masters (without the need of a full resync with RDB transfer and so forth), work in a number of cases when it was impossible in the past. For instance: 1. When a slave is promoted to master, the slaves of the old master can partially resynchronize with the new master. 2. Chained slaves (slaves of slaves) can be moved to replicate to other slaves or the master itself, without requiring a full resync. 3. The master itself, after being turned into a slave, is able to partially resynchronize with the new master, when it joins replication again. In order to obtain this, the following main changes were made: * Slaves also take a replication backlog, not just masters. * Same stream replication for all the slaves and sub slaves. The replication stream is identical from the top level master to its slaves and is also the same from the slaves to their sub-slaves and so forth. This means that if a slave is later promoted to master, it has the same replication backlog, and can partially resynchronize with its slaves (that were previously slaves of the old master). * A given replication history is no longer identified by the `runid` of a Redis node. There is instead a `replication ID` which changes every time the instance has a new history no longer coherent with the past one. So, for example, slaves publish the same replication history of their master, however when they are turned into masters, they publish a new replication ID, but still remember the old ID, so that they are able to partially resynchronize with slaves of the old master (up to a given offset). * The replication protocol was slightly modified so that a new extended +CONTINUE reply from the master is able to inform the slave of a replication ID change. 
* REPLCONF CAPA is used in order to notify masters that a slave is able to understand the new +CONTINUE reply. * The RDB file was extended with an auxiliary field that is able to select a given DB after loading in the slave, so that the slave can continue receiving the replication stream from the point it was disconnected without requiring the master to insert "SELECT" statements. This is useful in order to guarantee the "same stream" property, because the slave must be able to accumulate an identical backlog. * Slave pings to sub-slaves are now sent in a special form, when the top-level master is disconnected, in order not to interfere with the replication stream. We just use out of band "\n" bytes as in other parts of the Redis protocol. An old design document is available here: https://gist.github.com/antirez/ae068f95c0d084891305 However the implementation is not identical to the description because during the work to implement it, different changes were needed in order to make things work well.	2016-11-09 15:37:15 +01:00
antirez	152c1b6802	Module: Ability to get context from IO context. It was noted by @dvirsky that it is not possible to use string functions when writing the AOF file. This sometimes is critical since the command rewriting may need to be built in the context of the AOF callback, and without access to the context, and the limited types that the AOF production functions will accept, this can be an issue. Moreover there are other needs that we can't anticipate regarding the ability to use Redis Modules APIs using the context in order to build representations to emit AOF / RDB. Because of this a new API was added that allows the user to get a temporary context from the IO context. The context is auto released if obtained when the RDB / AOF callback returns. Calling the function multiple times to get the context always returns the same one, since it is invalid to have more than a single context.	2016-10-06 17:09:26 +02:00
antirez	3dc84c5300	Modules: API to save/load single precision floating point numbers. When double precision is not needed, to take 2x space in the serialization is not good.	2016-10-03 00:08:35 +02:00
antirez	e565632e59	Child -> Parent pipe for COW info transferring.	2016-09-19 13:45:20 +02:00
antirez	945a2f948e	zmalloc: zmalloc_get_smap_bytes_by_field() modified to work for any PID. The goal is to get copy-on-write amount of the child from the parent.	2016-09-19 10:28:42 +02:00
antirez	3793afa0ba	Merge branch 'aofrdb' into unstable	2016-09-09 15:03:21 +02:00
antirez	57a0db9495	Fix rdb.c var types when calling rdbLoadLen(). Technically as soon as Redis 64 bit gets proper support for loading collections and/or DBs with more than 2^32 elements, the 32 bit version should be modified in order to check if what we read from rdbLoadLen() overflows. This would only apply to huge RDB files created with a 64 bit instance and later loaded into a 32 bit instance.	2016-09-01 11:08:44 +02:00
antirez	f1c32f0dcb	RDB AOF preamble: WIP 3 (RDB loading refactoring).	2016-08-11 15:27:29 +02:00
antirez	feda52381d	RDB AOF preamble: WIP 2.	2016-08-09 16:41:40 +02:00
antirez	4426cb11e2	RDB AOF preamble: WIP 1.	2016-08-09 11:07:32 +02:00
antirez	0a628e5102	Avoid simultaneous RDB and AOF child process. This patch, written in collaboration with Oran Agra (@oranagra) is a companion to `780a8b1`. Together the two patches should avoid that the AOF and RDB saving processes can be spawned at the same time. Previously conditions that could lead to two saving processes at the same time were: 1. When AOF is enabled via CONFIG SET and an RDB saving process is already active. 2. When the SYNC command decides to start an RDB saving process ASAP in order to serve a new slave that cannot partially resynchronize (but only if we have a disk target for replication, for diskless replication there is no such problem). Condition "1" is not very severe but "2" can happen often and is definitely good at degrading Redis performance in an unexpected way. The two commits have the effect of always spawning RDB savings for replication in replicationCron() instead of attempting to start an RDB save synchronously. Moreover when a BGSAVE or AOF rewrite must be performed, they are instead just postponed using flags that will try to perform such operations ASAP. Finally the BGSAVE command was modified in order to accept a SCHEDULE option so that if an AOF rewrite is in progress, when this option is given, the command no longer returns an error, but instead schedules an RDB rewrite operation for when it will be possible to start it.	2016-07-21 18:35:01 +02:00
antirez	7e220a964a	In Redis RDB check: more details in error reporting.	2016-07-01 15:26:55 +02:00
antirez	e697153d18	In Redis RDB check: log decompression errors.	2016-07-01 11:59:25 +02:00
antirez	e9f31ba9c2	In Redis RDB check: better error reporting.	2016-07-01 09:36:52 +02:00
Pierre Chapuis	188d90fc87	fix some compiler warnings	2016-06-05 16:48:45 +02:00
antirez	8ec28002be	Modules: support for modules native data types.	2016-06-03 18:14:04 +02:00
antirez	27e5f385c1	RDB v8: fix rdbLoadLen() return value.	2016-06-01 20:18:28 +02:00
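
The typed-prefix layout described in commit 365dd037dc (RDB modules values serialization format version 2) can be illustrated with a minimal Python sketch. It is not the Redis implementation: the marker values, the fixed-width integer and length encodings, and the names `encode_value` and `skip_value` are all assumptions made for illustration. It only shows why a one-byte prefix per part plus a final EOF byte lets a reader walk over a value, and report the types it contains, without knowing which module produced it.

```python
import struct

# Hypothetical one-byte markers; the values Redis actually uses may differ.
OP_EOF, OP_INT, OP_FLOAT, OP_DOUBLE, OP_STRING = 0, 1, 2, 3, 4

# Payload sizes for the fixed-width parts (an assumption of this sketch).
FIXED_SIZES = {OP_INT: 8, OP_FLOAT: 4, OP_DOUBLE: 8}
NAMES = {OP_INT: "int", OP_FLOAT: "float", OP_DOUBLE: "double", OP_STRING: "string"}


def encode_value(parts):
    """Serialize (kind, value) pairs, prefixing each part with its marker."""
    out = bytearray()
    for kind, value in parts:
        if kind == "int":
            out += bytes([OP_INT]) + struct.pack("<q", value)
        elif kind == "float":
            out += bytes([OP_FLOAT]) + struct.pack("<f", value)
        elif kind == "double":
            out += bytes([OP_DOUBLE]) + struct.pack("<d", value)
        elif kind == "string":
            # Strings carry their own 4-byte length in this sketch.
            out += bytes([OP_STRING]) + struct.pack("<I", len(value)) + value
        else:
            raise ValueError(f"unknown part kind: {kind}")
    out.append(OP_EOF)  # final EOF byte closes the value
    return bytes(out)


def skip_value(buf, pos=0):
    """Walk one value without interpreting it.

    Returns (list of part kinds, offset just past the EOF marker).
    """
    kinds = []
    while True:
        if pos >= len(buf):
            raise ValueError("truncated value: missing EOF marker")
        op = buf[pos]
        pos += 1
        if op == OP_EOF:
            return kinds, pos
        if op in FIXED_SIZES:
            size = FIXED_SIZES[op]
        elif op == OP_STRING:
            if pos + 4 > len(buf):
                raise ValueError("truncated value: incomplete string length")
            (size,) = struct.unpack_from("<I", buf, pos)
            pos += 4
        else:
            raise ValueError(f"unknown marker {op} at offset {pos - 1}")
        if pos + size > len(buf):
            raise ValueError("truncated value: part runs past end of buffer")
        kinds.append(NAMES[op])
        pos += size


if __name__ == "__main__":
    blob = encode_value([("int", 42), ("string", b"hi"), ("double", 1.5)])
    kinds, end = skip_value(blob + b"next-key")
    print(kinds, end)
```

Because each part announces its own type, a checking tool can validate the overall structure and find where the next key starts while streaming, which is the property the commit message contrasts with a single length-prefixed blob.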


220 Commits