redict

mirror of https://codeberg.org/redict/redict.git synced 2025-01-22 16:18:28 -05:00

Author	SHA1	Message	Date
antirez	32efd9adf8	Client side caching: config option for table fill rate.	2019-07-24 11:35:01 +02:00
antirez	c98e7717bb	Client side caching: show tracking slots usage in INFO.	2019-07-23 11:02:14 +02:00
antirez	9268493e8d	Client side caching: implement full slot limit function.	2019-07-23 10:57:22 +02:00
Madelyn Olson	7d21754710	Hide HELLO and AUTH from slowlog and monitor	2019-07-22 22:53:15 -07:00
Oran Agra	3b6aeea44c	Implement module api for aux data in rdb Other changes: * fix memory leak in error handling of rdb loading of type OBJ_MODULE	2019-07-22 21:15:33 +03:00
antirez	c41f94d2a3	Client side caching: split invalidation into key / slot.	2019-07-22 18:59:53 +02:00
Oran Agra	d7d028a7a7	Allow modules to handle RDB loading errors. This is especially needed in diskless loading, where a short read could have caused redis to exit. now the module can handle the error and return to the caller gracefully. this fixes #5326	2019-07-21 18:19:32 +03:00
Oran Agra	56258c6b7d	Module API for Forking * create module API for forking child processes. * refactor duplicate code around creating and tracking forks by AOF and RDB. * child processes listen to SIGUSR1 and dies exitFromChild in order to eliminate a valgrind warning of unhandled signal. * note that BGSAVE error reply has changed. valgrind error is: Process terminating with default action of signal 10 (SIGUSR1)	2019-07-17 16:40:24 +03:00
zhaozhao.zz	6191ea90a1	Client side caching: implement trackingInvalidateKeysOnFlush()	2019-07-17 20:33:52 +08:00
Oran Agra	2de544cfcc	diskless replication on slave side (don't store rdb to file), plus some other related fixes The implementation of the diskless replication was currently diskless only on the master side. The slave side was still storing the received rdb file to the disk before loading it back in and parsing it. This commit adds two modes to load rdb directly from socket: 1) when-empty 2) using "swapdb" the third mode of using diskless slave by flushdb is risky and currently not included. other changes: -------------- distinguish between aof configuration and state so that we can re-enable aof only when sync eventually succeeds (and not when exiting from readSyncBulkPayload after a failed attempt) also a CONFIG GET and INFO during rdb loading would have lied When loading rdb from the network, don't kill the server on short read (that can be a network error) Fix rdb check when performed on preamble AOF tests: run replication tests for diskless slave too make replication test a bit more aggressive Add test for diskless load swapdb	2019-07-08 15:37:48 +03:00
Salvatore Sanfilippo	722446510f	Merge pull request #6116 from AngusP/scan-types SCAN: New Feature `SCAN cursor [TYPE type]` modifier suggested in issue #6107	2019-07-08 12:53:34 +02:00
Angus Pearson	6eb52e200c	Change typeNameCanonicalize -> getObjectTypeName, and other style changes	2019-07-08 11:04:37 +01:00
antirez	6b29f2d83d	Client side caching: RESP2 support.	2019-07-05 12:24:28 +02:00
antirez	46edb55de9	Client side caching: implement trackingInvalidateKey().	2019-07-03 19:16:20 +02:00
antirez	506764b3f8	Client side caching: hook inside call() for tracking.	2019-07-03 12:42:16 +02:00
antirez	c29f3bcf2e	Client side caching: enable tracking mode.	2019-06-30 06:19:08 -04:00
antirez	45d64f229e	Client side caching: fields and flags for tracking mode.	2019-06-29 20:08:41 -04:00
Angus Pearson	38cd5fd9f6	Spelling cannonical -> canonical	2019-06-13 17:49:33 +01:00
Angus Pearson	e2adea2188	Add char* typeNameCanonicalize(robj) to remove duplicate code between SCAN and TYPE commands, and to keep OBJ_ enum to string canonicalization in one place.	2019-06-10 17:41:44 +01:00
Oran Agra	09f99c2a92	make redis purge jemalloc after flush, and enable background purging thread jemalloc 5 doesn't immediately release memory back to the OS, instead there's a decaying mechanism, which doesn't work when there's no traffic (no allocations). this is most evident if there's no traffic after flushdb, the RSS will remain high. 1) enable jemalloc background purging 2) explicitly purge in flushdb	2019-06-02 15:33:14 +03:00
Salvatore Sanfilippo	e633254ccf	Merge pull request #6053 from soloestoy/enhance-aof-fsync-everysec aof: enhance AOF_FSYNC_EVERYSEC, more details in #5985	2019-05-10 18:06:40 +02:00
antirez	1c0c436757	Threaded IO: ability to disable reads from threaded path.	2019-05-06 18:02:51 +02:00
antirez	6ab6a97fe6	Threaded IO: parsing WIP 2: refactoring to parse from thread.	2019-05-06 18:02:51 +02:00
antirez	63a0ffd36a	Threaded IO: read side WIP 3.	2019-05-06 18:02:51 +02:00
antirez	dd5b105c73	Threaded IO: read side WIP.	2019-05-06 18:02:51 +02:00
antirez	9814b2a5f3	Threaded IO: make num of I/O threads configurable.	2019-05-06 18:02:51 +02:00
Ubuntu	9bf7f302a7	Threaded IO: stop threads when no longer needed + C11 in Makefile. Now threads are stopped even when the connections drop immediately to zero, not allowing the networking code to detect the condition and stop the threads. serverCron() will handle that.	2019-05-06 18:02:51 +02:00
antirez	f468e653b5	Threaded IO: implement handleClientsWithPendingWritesUsingThreads(). This is just an experiment for now, there are a couple of race conditions, mostly harmless for the performance gain experiment that this commit represents so far. The general idea here is to take Redis single threaded and instead fan-out on expensive kernel calls: write(2) in this case, but the same concept could be easily implemented for read(2) and protocol parsing. However just threading writes like in this commit, is enough to evaluate if the approach is sound.	2019-05-06 18:02:51 +02:00
zhaozhao.zz	bcac165fab	aof: enhance AOF_FSYNC_EVERYSEC, more details in #5985	2019-04-29 14:38:28 +08:00
Itamar Haber	52686f4866	Adds a "Modules" section to `INFO` Fixes #6012. As long as "INFO is broken", this should be adequate IMO. Once we rework `INFO`, perhaps into RESP3, this implementation should be revisited.	2019-04-16 22:16:12 +03:00
Oran Agra	acba2fc9b4	slave corrupts replication stream when module blocked client uses large reply (or POSTPONED_ARRAY) when redis appends the blocked client reply list to the real client, it didn't bother to check if it is in fact the master client. so a slave executing that module command will send replies to the master, causing the master to send the slave error responses, which will mess up the replication offset (slave will advance its replication offset, and the master does not)	2019-03-24 14:17:37 +02:00
Salvatore Sanfilippo	5e8caca036	Merge pull request #5944 from yossigo/command-filtering Command Filtering API	2019-03-22 17:43:49 +01:00
Salvatore Sanfilippo	122f42844a	Merge pull request #5945 from dvirsky/miss_notification Added keyspace miss notifications support	2019-03-22 17:41:00 +01:00
Dvir Volk	bc269c85e1	remove extra linebreak	2019-03-21 12:48:37 +02:00
Dvir Volk	99c2fe0bcf	added special flag for keyspace miss notifications	2019-03-21 11:47:14 +02:00
Yossi Gottlieb	c3e187190b	Initial command filter experiment.	2019-03-18 13:50:34 +02:00
Yossi Gottlieb	a88264d934	Add RedisModule_GetKeyNameFromIO().	2019-03-15 10:23:27 +02:00
Salvatore Sanfilippo	0cce98f2f9	Merge pull request #5834 from guybe7/trim_module_sds Trim SDS free space of retained module strings	2019-03-14 12:41:31 +01:00
antirez	6fd4af1f12	Merge branch 'sharedapi' into unstable	2019-03-14 11:24:48 +01:00
Yuan Zhou	feb4ebff45	server.h: remove dead code hashTypeTryObjectEncoding() is not used now	2019-03-07 18:38:27 +08:00
Salvatore Sanfilippo	88b720672c	Merge pull request #5877 from vattezhang/unstable_sentinel_cmd fix: fix sentinel command table and new flags format	2019-02-27 15:45:03 +01:00
vattezhang	9d632230b6	fix: fix sentinel command table and new flags format	2019-02-27 21:35:58 +08:00
antirez	a7780f716e	Merge branch 'gopher' into unstable	2019-02-25 18:16:58 +01:00
antirez	3b420034bb	RESP3: allow HELLO to be used with version = 2.	2019-02-25 16:41:00 +01:00
antirez	e00b22e090	Gopher: initial request handling.	2019-02-21 23:13:08 +01:00
antirez	3de9ccf190	Gopher: config setting to turn support on/off.	2019-02-21 17:28:53 +01:00
zhaozhao.zz	14507457a0	ACL: show categories in COMMAND reply Adding another new field categories at the end of command reply, it's easy to read and distinguish flags and categories, also compatible with old format.	2019-02-14 00:13:01 +08:00
Guy Benoish	bdd9a8002a	Trim SDS free space of retained module strings In some cases processMultibulkBuffer uses sdsMakeRoomFor to expand the querybuf, but later in some cases it uses that query buffer as is for an argv element (see "Optimization"), which means that the sds in argv may have a lot of wasted space, and then in case modules keep that argv RedisString inside their data structure, this space waste will remain for long (until restarted from rdb).	2019-02-12 14:21:21 +01:00
zhaozhao.zz	ea9d3aefec	ACL: add masteruser configuration for replication In mostly production environment, normal user's behavior should be limited. Now in redis ACL mechanism we can do it like that: user default on +@all ~* -@dangerous nopass user admin on +@all ~* >someSeriousPassword Then the default normal user can not execute dangerous commands like FLUSHALL/KEYS. But some admin commands are in dangerous category too like PSYNC, and the configurations above will forbid replica from sync with master. Finally I think we could add a new configuration for replication, it is masteruser option, like this: masteruser admin masterauth someSeriousPassword Then replica will try AUTH admin someSeriousPassword and get privilege to execute PSYNC. If masteruser is NULL, replica would AUTH with only masterauth like before.	2019-02-12 17:12:37 +08:00
antirez	80f987726d	ACL: load ACL file at startup. Prevent silly configurations.	2019-02-07 17:20:03 +01:00
antirez	775bf6193d	ACL: implement rewriting of users in redis.conf.	2019-02-05 10:48:17 +01:00
antirez	500b3e128f	ACL: implement ACLLoadConfiguredUsers().	2019-02-04 16:35:15 +01:00
antirez	68fd4a97fa	ACL: better error reporting in users configuration errors.	2019-02-04 13:04:35 +01:00
antirez	b166c41edd	ACL: make ACLAppendUserForLoading() able to report bad argument.	2019-02-04 13:00:58 +01:00
antirez	21e84cdae2	ACL: initial appending of users in user loading list.	2019-02-04 12:55:48 +01:00
antirez	c7cd10dfe9	ACL: flags refactoring, function to describe user.	2019-01-31 16:49:22 +01:00
antirez	f99e0f59ef	ACL: populate category flags from command table.	2019-01-23 16:59:09 +01:00
antirez	91ec53ed13	ACL: define category names and flags.	2019-01-23 16:47:29 +01:00
antirez	70e541b7bc	ACL: better define name, and the idea of reserved ID.	2019-01-23 08:10:57 +01:00
antirez	711e514ea4	ACL: update comments in command flags.	2019-01-22 19:02:50 +01:00
antirez	4dc69497f5	Refactoring: always kill AOF/RDB child via helper functions.	2019-01-21 11:28:44 +01:00
antirez	c8391388c2	ACL: remove server.requirepass + some refactoring.	2019-01-18 11:49:30 +01:00
antirez	7b65605ab2	ACL: reimplement requirepass option in term of ACLs.	2019-01-17 18:05:43 +01:00
antirez	4a3419acfc	ACL: fix and improve ACL key checking.	2019-01-16 18:31:05 +01:00
antirez	cca64672f4	ACL: AUTH uses users. ACL WHOAMI implemented.	2019-01-15 18:16:20 +01:00
antirez	b39409bcf8	ACL: nopass user setting. This is needed in order to model the current behavior of authenticating the connection directly when no password is set. Now with ACLs this will be obtained by setting the default user as "nopass" user. Moreover this flag can be used in order to create other users that do not require any password but will work with "AUTH username <any-password>".	2019-01-15 13:16:31 +01:00
antirez	7aea02fa87	ACL: initial implementation of the ACL command.	2019-01-15 09:36:12 +01:00
antirez	a2e376ba52	ACL: ACLCheckCommandPerm() implementation WIP.	2019-01-14 18:35:21 +01:00
antirez	a0a4fb85ff	ACL: Fix compilation by adding prototype and c->cmd fix.	2019-01-14 13:22:56 +01:00
antirez	2da2e452ab	ACL: ACLLCOMMAND flags.	2019-01-14 13:21:21 +01:00
antirez	aced0328e3	ACL: avoid a radix tree lookup for the default user.	2019-01-11 11:32:41 +01:00
antirez	6bb6a6d3a8	ACL: implement ACLCreateUser().	2019-01-10 17:01:12 +01:00
antirez	29c88a9ce5	ACL: initialization function.	2019-01-10 16:39:32 +01:00
antirez	4278104acc	ACL: add a reference to the user in each client.	2019-01-10 16:34:13 +01:00
antirez	4729f71495	ACL: improved version of the user structure.	2019-01-10 12:47:52 +01:00
antirez	7fc882c578	ACL: use a fixed table for command IDs.	2019-01-09 21:31:29 +01:00
antirez	91f1d8026b	ACL: introduce the concept of command ID.	2019-01-09 17:20:47 +01:00
antirez	b43d70df56	ACL: refactoring of the original authentication code.	2019-01-09 17:00:30 +01:00
antirez	709a6612eb	RESP3: addReplyString() -> addReplyProto(). The function naming was totally nuts. Let's fix it as we break PRs anyway with RESP3 refactoring and changes.	2019-01-09 17:00:30 +01:00
antirez	e291170385	RESP3: verbatim reply API + DEBUG PROTOCOL support.	2019-01-09 17:00:30 +01:00
antirez	809e3a44a7	RESP3: addReplyBool() implemented.	2019-01-09 17:00:29 +01:00
antirez	4f0860cbfd	RESP3: initial implementation of the HELLO command.	2019-01-09 17:00:29 +01:00
antirez	3fd78f41e8	RESP3: restore the concept of null array for RESP2 compat.	2019-01-09 17:00:29 +01:00
antirez	2ad6e875ba	RESP3: add shared.nullarray for better RESP2 compat.	2019-01-09 17:00:29 +01:00
antirez	1a17cdfadf	RESP3: addReplyNullArray() added for better RESP2 compat.	2019-01-09 17:00:29 +01:00
antirez	317f8b9d38	RESP3: most null replies converted.	2019-01-09 17:00:29 +01:00
antirez	1b7298e66a	RESP3: addReplyNull() added.	2019-01-09 17:00:29 +01:00
antirez	fc9a3de97d	RESP3: remove other pointless shared object.	2019-01-09 17:00:29 +01:00
antirez	b7e8b734c9	RESP3: remove certain constants to spot places to fix.	2019-01-09 17:00:29 +01:00
antirez	13966522ea	RESP3: bring RESP2 compatibility to previous changes.	2019-01-09 17:00:29 +01:00
antirez	e14aabf936	RESP3: addReply*Len() support for RESP2 backward comp.	2019-01-09 17:00:29 +01:00
antirez	1ac6926647	RESP3: put RESP version in the client structure.	2019-01-09 17:00:29 +01:00
antirez	073293693e	RESP3: Use new deferred len API in server.c.	2019-01-09 17:00:29 +01:00
antirez	57c5a766a2	RESP3: Aggregate deferred lengths functions.	2019-01-09 17:00:29 +01:00
antirez	27f6e9bb9b	Modules shared API: initial core functions. Based on ideas and code in PR #5560 by @MeirShpilraien.	2018-12-20 17:57:35 +01:00
antirez	850b64c116	Revert shared APIs to modify the design.	2018-12-20 17:56:38 +01:00
MeirShpilraien	ab37289fa6	added module ability to register api to be used by other modules	2018-12-20 17:55:18 +01:00
antirez	129f2d2746	freeMemoryIfNeeded() small refactoring. Related to issue #5686 and PR #5689.	2018-12-12 11:37:15 +01:00
antirez	274531396c	Reject EXEC containing write commands against RO replica. Thanks to @soloestoy for discovering this issue in #5667. This is an alternative fix in order to avoid both cycling the clients and also disconnecting clients just having valid read-only transactions pending.	2018-12-11 11:39:21 +01:00
Oran Agra	b587c54c24	fix #5580 , display fragmentation and rss overhead bytes as signed these metrics become negative when RSS is smaller than the used_memory. This can easily happen when the program allocated a lot of memory and haven't written to it yet, in which case the kernel doesn't allocate any pages to the process	2018-12-02 15:29:20 +02:00
antirez	e3446fea9e	Streams: XSTREAM SETID -> XSETID. Keep vanilla stream commands at toplevel, see #5426.	2018-10-16 13:17:14 +02:00
Salvatore Sanfilippo	af09df08d7	Merge pull request #5426 from soloestoy/feature-xstream Bugfix data inconsistency after aof rewrite, and add XSTREAM command.	2018-10-16 13:10:36 +02:00
antirez	c9d9ae7baa	Fix propagation of consumer groups last ID. Issue #5433.	2018-10-10 12:51:02 +02:00
antirez	69c30965eb	Introduce protectClient() + some refactoring. The idea is to have an API for the cases like -BUSY state and DEBUG RELOAD where we have to manually deinstall the read handler. See #4804.	2018-10-09 13:15:41 +02:00
zhaozhao.zz	ec511fa709	Streams: add a new command XTREAM XSTREAM CREATE <key> <id or *> -- Create a new empty stream. XSTREAM SETID <key> <id or $> -- Set the current stream ID.	2018-10-09 13:11:04 +08:00
antirez	744fe7f348	Module cluster flags: initial vars / defines added.	2018-09-19 11:20:52 +02:00
antirez	43385c4375	LOLWUT: wrap it into a proper command.	2018-09-12 11:34:10 +02:00
antirez	ef2c7a5bbb	Slave removal: SLAVEOF -> REPLICAOF. SLAVEOF is now an alias.	2018-09-11 15:32:28 +02:00
antirez	6c001bfc0d	Unblocked clients API refactoring. See #4418 .	2018-09-03 18:39:18 +02:00
antirez	3e7349fdaf	Make pending buffer processing safe for CLIENT_MASTER client. Related to #5305.	2018-09-03 18:17:31 +02:00
antirez	067647a783	Introduce repl_slave_ignore_maxmemory flag internally. Note: this breaks backward compatibility with Redis 4, since now slaves by default are exact copies of masters and do not try to evict keys independently.	2018-08-27 12:20:27 +02:00
Salvatore Sanfilippo	19880ab851	Merge pull request #5248 from soloestoy/rewrite-brpoplpush rewrite BRPOPLPUSH as RPOPLPUSH to propagate	2018-08-26 16:31:24 +02:00
zhaozhao.zz	8a1219d93b	block: rewrite BRPOPLPUSH as RPOPLPUSH to propagate	2018-08-14 20:58:58 +08:00
zhaozhao.zz	14c4ddb5a6	pipeline: do not sdsrange querybuf unless all commands processed This is an optimization for processing pipeline, we discussed a problem in issue #5229: clients may be paused if we apply `CLIENT PAUSE` command, and then querybuf may grow too large, the cost of memmove in sdsrange after parsing a completed command will be horrible. The optimization is that parsing all commands in querybuf, after that we can just call sdsrange only once.	2018-08-14 00:43:42 +08:00
Salvatore Sanfilippo	92b39a0abf	Merge pull request #5189 from soloestoy/refactor-dbOverwrite refactor dbOverwrite to make lazyfree work	2018-07-31 16:40:35 +02:00
antirez	5401fe7fb9	Introduce writeCommandsDeniedByDiskError().	2018-07-31 13:09:38 +02:00
zhaozhao.zz	fddeeae724	refactor dbOverwrite to make lazyfree work	2018-07-31 12:07:57 +08:00
antirez	c426d85c4c	Control dynamic HZ via server configuration.	2018-07-30 13:37:30 +02:00
antirez	4e9c30a6ca	Merge branch 'dynamic-hz' into unstable	2018-07-30 13:31:23 +02:00
Salvatore Sanfilippo	445a2a2b1b	Merge pull request #4883 from itamarhaber/lua_scripts-in-info-memory Adds memory information about the scripts' cache to INFO	2018-07-23 18:43:05 +02:00
antirez	b65ddfb16a	Dynamic HZ: adapt cron frequency to number of clients.	2018-07-23 14:21:04 +02:00
antirez	e6ea603ad3	Dynamic HZ: separate hz from the configured hz. This way we can remember what the user configured HZ is, but change the actual HZ dynamically if needed in the dynamic HZ feature implementation.	2018-07-23 14:13:58 +02:00
Itamar Haber	993716c351	Adds Lua overheads to MEMORY STATS, smartens the MEMORY DOCTOR	2018-07-22 21:16:00 +03:00
antirez	f9c84d6d39	Hopefully improve commenting of #5126 . Reading the PR gave me the opportunity to better specify what the code was doing in places where I was not immediately sure about what was going on. Moreover I documented the structure in server.h so that people reading the header file will immediately understand what the structure is useful for.	2018-07-16 17:56:54 +02:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a38737). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node from, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
dejun.xdj	61f12973f7	Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option. Save NOACK option into client.blockingState structure.	2018-07-09 13:40:29 +02:00
antirez	81778d91bf	Cache timezone and daylight active flag for safer logging. With such information will be able to use a private localtime() implementation serverLog(), which does not use any locking and is both thread and fork() safe.	2018-07-04 16:45:00 +02:00
Jack Drogon	93238575f7	Fix typo	2018-07-03 18:19:46 +02:00
antirez	94b3ee6142	Clarify the pending_querybuf field of clients.	2018-07-03 13:25:41 +02:00
chendianqiang	cbb2ac0799	Merge branch 'unstable' into pending-querybuf	2018-07-03 10:07:26 +08:00
antirez	2edcafb35d	addReplySubSyntaxError() renamed to addReplySubcommandSyntaxError().	2018-07-02 18:49:34 +02:00
Salvatore Sanfilippo	bc6a004588	Merge pull request #4998 from itamarhaber/module_command_help Module command help	2018-07-02 18:46:56 +02:00
chendianqiang	7de1ada070	limit the size of pending-querybuf in masterclient	2018-07-01 14:43:53 +08:00
zhaozhao.zz	b9cbd04b57	clients: add type option for client list	2018-06-28 17:43:05 +08:00
antirez	fb39bfd7af	Take clients in a ID -> Client handle dictionary.	2018-06-27 14:08:42 +02:00
Guy Benoish	b5197f1fc9	Enhance RESTORE with RDBv9 new features RESTORE now supports: 1. Setting LRU/LFU 2. Absolute-time TTL Other related changes: 1. RDB loading will not override LRU bits when RDB file does not contain the LRU opcode. 2. RDB loading will not set LRU/LFU bits if the server's maxmemory-policy does not match.	2018-06-20 15:11:08 +07:00
Oran Agra	482785ac62	add malloc_usable_size for libc malloc this reduces the extra 8 bytes we save before each pointer. but more importantly maybe, it makes the valgrind runs to be more similar to our normal runs. note: the change in malloc_stats struct in server.h is to eliminate a name conflict. structs that are not typedefed are resolved from a separate name space.	2018-06-19 18:18:23 +03:00
antirez	bd92389c2d	Refactor createObjectFromLongLong() to be suitable for value objects.	2018-06-18 16:55:16 +02:00
Salvatore Sanfilippo	94658303e9	Merge pull request #4758 from soloestoy/rdb-save-incremental-fsync Rdb save incremental fsync	2018-06-16 10:59:37 +02:00
Itamar Haber	e654b68d1f	Merge branch 'unstable' into module_command_help	2018-06-09 21:10:53 +03:00
Salvatore Sanfilippo	be899b824e	Merge pull request #4519 from soloestoy/zset-int-problem Zset int problem	2018-06-08 12:45:11 +02:00
Itamar Haber	76ad23d012	Adds MODULE HELP and implements addReplySubSyntaxError	2018-06-07 18:34:58 +03:00
antirez	19a438e2c0	Streams: use non static macro node limits. Also add the concept of size/items limit, instead of just having as limit the number of bytes.	2018-06-07 14:24:49 +02:00
antirez	56bbab238a	ZPOP: change sync ZPOP to have a count argument instead of N keys. Usually blocking operations make a lot of sense with multiple keys so that we can listen to multiple queues (or whatever the app models) with a single connection. However in the synchronous case it is more useful to be able to ask for N elements. This is a change that I also wanted to perform soon or later in the blocking list variant, but here it is more natural since there is no reply type difference.	2018-05-11 18:00:32 +02:00
antirez	6efb6c1e06	ZPOP: renaming to have explicit MIN/MAX score idea. This commit also adds a top comment about a subtle behavior of mixing blocking operations of different types in the same key.	2018-05-11 17:31:53 +02:00
Itamar Haber	49890c8ee9	Adds memory information about the script's cache to INFO Implementation notes: as INFO is "already broken", I didn't want to break it further. Instead of computing the server.lua_script dict size on every call, I'm keeping a running sum of the body's length and dict overheads. This implementation is naive as it does not take into consideration dict rehashing, but that inaccuracy pays off in speed ;) Demo time: ```bash $ redis-cli info memory | grep "script" used_memory_scripts:96 used_memory_scripts_human:96B number_of_cached_scripts:0 $ redis-cli eval "" 0 ; redis-cli info memory | grep "script" (nil) used_memory_scripts:120 used_memory_scripts_human:120B number_of_cached_scripts:1 $ redis-cli script flush ; redis-cli info memory | grep "script" OK used_memory_scripts:96 used_memory_scripts_human:96B number_of_cached_scripts:0 $ redis-cli eval "return('Hello, Script Cache :)')" 0 ; redis-cli info memory | grep "script" "Hello, Script Cache :)" used_memory_scripts:152 used_memory_scripts_human:152B number_of_cached_scripts:1 $ redis-cli eval "return redis.sha1hex(\"return('Hello, Script Cache :)')\")" 0 ; redis-cli info memory | grep "script" "1be72729d43da5114929c1260a749073732dc822" used_memory_scripts:232 used_memory_scripts_human:232B number_of_cached_scripts:2 ✔ 19:03:54 redis [lua_scripts-in-info-memory L ✚…⚑] $ redis-cli evalsha 1be72729d43da5114929c1260a749073732dc822 0 "Hello, Script Cache :)" ```	2018-04-30 19:33:01 +03:00
Itamar Haber	438125b47c	Implements [B]Z[REV]POP and the respective unit tests An implementation of the [Ze POP Redis Module](https://github.com/itamarhaber/zpop) as core Redis commands. Fixes #1861.	2018-04-30 02:10:42 +03:00
antirez	e6b0e8d9ec	Streams: XTRIM command added.	2018-04-19 16:25:29 +02:00
antirez	aba76320d5	Streams: XDEL command.	2018-04-18 13:12:09 +02:00
antirez	de7de53e64	getMaxmemoryState() fixed and improved.	2018-04-11 12:48:26 +02:00
antirez	f97efe0cac	Modules: context flags now include OOM flag. Plus freeMemoryIfNeeded() refactoring to improve legibility. Please review this commit for sanity.	2018-04-09 17:44:30 +02:00
antirez	b2868c7b9c	Modules API: RM_GetRandomBytes() / GetRandomHexChars().	2018-04-05 13:24:22 +02:00
antirez	a97df1a6e1	Modules Cluster API: make node IDs pointers constant.	2018-03-30 13:16:07 +02:00
antirez	0701cad3de	Modules Cluster API: message bus implementation.	2018-03-29 15:13:31 +02:00
antirez	28d28ef3cf	AOF: enable RDB-preamble rewriting by default. There are too many advantages in doing this, RDB is faster to persist, more compact, much faster to load back. The main issues here are that the code is less tested because this was not the old default (so we are enabling it for the new 5.0 release), and that the AOF is no longer a trivially parsable format from now on. However the non-preamble mode will be supported in the future as well, if new data types will be added.	2018-03-25 11:43:30 +02:00
Salvatore Sanfilippo	da621783f0	Merge pull request #4691 from oranagra/active_defrag_v2 Active defrag v2	2018-03-22 09:16:32 +01:00
antirez	0b58ad301e	CG: Replication WIP 1: XREADGROUP and XCLAIM propagated as XCLAIM.	2018-03-19 18:02:19 +01:00
zhaozhao.zz	54cae05ea7	rdb: incremental fsync when redis saves rdb	2018-03-16 00:44:50 +08:00
antirez	0cf6b1e3ae	CG: XINFO CONSUMERS implemented.	2018-03-15 12:54:10 +01:00
antirez	b26f03bd69	CG: XCLAIM now updates the idle time of the message.	2018-03-15 12:54:10 +01:00
antirez	1bc31666da	CG: XPENDING without start/stop variant implemented.	2018-03-15 12:54:10 +01:00
antirez	388c69fe4e	CG: XACK implementation.	2018-03-15 12:54:10 +01:00
antirez	ccdae09046	CG: add & populate group+consumer in the blocking state.	2018-03-15 12:54:10 +01:00
antirez	58f0c000a5	CG: data structures design + XGROUP CREATE implementation.	2018-03-15 12:54:10 +01:00
antirez	432bf4770e	Cluster: ability to prevent slaves from failing over their masters. This commit, in some parts derived from PR #3041 which is no longer possible to merge (because the user deleted the original branch), implements the ability of slaves to have a special configuration preventing that they try to start a failover when the master is failing. There are multiple reasons for wanting this, and the feature was requested in issue #3021 some time ago. The differences between this patch and the original PR are the following: 1. The flag is saved/loaded on the nodes configuration. 2. The 'myself' node is now flag-aware, the flag is updated as needed when the configuration is changed via CONFIG SET. 3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent with existing NOADDR. 4. The redis.conf documentation was rewritten. Thanks to @deep011 for the original patch.	2018-03-14 14:01:38 +01:00
Oran Agra	806736cdf9	Adding real allocator fragmentation to INFO and MEMORY command + active defrag test other fixes / improvements: - LUA script memory isn't taken from zmalloc (taken from libc malloc) so it can cause high fragmentation ratio to be displayed (which is false) - there was a problem with "fragmentation" info being calculated from RSS and used_memory sampled at different times (now sampling them together) other details: - adding a few more allocator info fields to INFO and MEMORY commands - improve defrag test to measure defrag latency of big keys - increasing the accuracy of the defrag test (by looking at real frag info) this way we can use an even lower threshold and still avoid false positives - keep the old (total) "fragmentation" field unchanged, but add new ones for specific things - add these to the MEMORY DOCTOR command - deduct LUA memory from the rss in case of non jemalloc allocator (one for which we don't "allocator active/used") - reduce sampling rate of the rss and allocator info	2018-03-12 15:08:52 +02:00
Oran Agra	be1b4aa9aa	active defrag v2 - big keys are not defragged in one go from within the dict scan instead they are scanned in parts after the main dict hash bucket is done. - add latency monitor sample for defrag - change default active-defrag-cycle-min to induce lower latency - make active defrag start a new scan right away if needed, so it's easier (for the test suite) to detect when it's done - make active defrag quick the current cycle after each db / big key - defrag some non key long term global allocations - some refactoring for smaller functions and more reusable code - during dict rehashing, one scan iteration of the dict, can end up scanning one bucket in the smaller dict and many many buckets in the larger dict. so waiting for 16 scan iterations before checking the time, may be much too long.	2018-03-12 15:07:43 +02:00
伯成	dfb12f0628	Boost up performance for redis PUB-SUB patterns matching If lots of clients PSUBSCRIBE to same patterns, multiple patterns matching will take place. This commit changes it into just one single pattern matching by using a `dict *` to store the unique pattern and which clients subscribe to it.	2018-03-01 11:46:56 +08:00
antirez	ffde73c57d	Track number of logically expired keys still in memory. This commit adds two new fields in the INFO output, stats section: expired_stale_perc:0.34 expired_time_cap_reached_count:58 The first field is an estimate of the number of keys that are yet in memory but are already logically expired. The reason why those keys are yet not reclaimed is because the active expire cycle can't spend more time on the process of reclaiming the keys, and at the same time nobody is accessing such keys. However as the active expire cycle runs, while it will eventually have to return to the caller, because of time limit or because there are less than 25% of keys logically expired in each given database, it collects the stats in order to populate this INFO field. Note that expired_stale_perc is a running average, where the current sample accounts for 5% and the history for 95%, so you'll see it changing smoothly over time. The other field, expired_time_cap_reached_count, counts the number of times the expire cycle had to stop, even if still it was finding a sizeable number of keys yet to expire, because of the time limit. This allows people handling operations to understand if the Redis server, during mass-expiration events, is able to collect keys fast enough usually. It is normal for this field to increment during mass expires, but normally it should very rarely increment. When instead it constantly increments, it means that the current workload is using a very important percentage of CPU time to expire keys. This feature was created thanks to the hints of Rashmi Ramesh and Bart Robinson from Twitter. In private email exchanges, they noted how it was important to improve the observability of this parameter in the Redis server. Actually in big deployments, the amount of keys that are yet to expire in each server, even if they are logically expired, may account for a very big amount of wasted memory.	2018-02-19 11:12:49 +01:00
Dvir Volk	3aab12414f	Remove the NOTIFY_MODULE flag and simplify the module notification flow if there aren't subscribers	2018-02-14 21:40:10 +02:00
Dvir Volk	2136035e47	finished implementation of notifications. Tests unfinished	2018-02-14 21:38:58 +02:00
antirez	8075572207	New config options about protocol prefixed with "proto". Related to #4568.	2018-01-11 11:27:41 +01:00
Oran Agra	b509a14c3e	Add config options for max-bulk-len and max-querybuf-len mainly to support RESTORE of large keys	2017-12-29 12:43:48 +02:00
zhaozhao.zz	109ee497be	zset: change the span of zskiplistNode to unsigned long	2017-12-08 16:09:27 +08:00
zhaozhao.zz	e8901b2fe4	zset: fix the int problem	2017-12-08 15:37:08 +08:00
Itamar Haber	8b51121998	Merge remote-tracking branch 'upstream/unstable' into help_subcommands	2017-12-05 18:14:59 +02:00
antirez	62a4b817c6	add linkClient(): adds the client and caches the list node. We have this operation in two places: when caching the master and when linking a new client after the client creation. By having an API for this we avoid incurring in errors when modifying one of the two places forgetting the other. The function is also a good place where to document why we cache the linked list node. Related to #4497 and #4210.	2017-12-05 16:02:03 +01:00
Salvatore Sanfilippo	03cfc8bf3a	Merge pull request #4497 from soloestoy/optimize-unlink-client networking: optimize unlinkClient() in freeClient()	2017-12-05 15:51:15 +01:00
antirez	60d26acfc8	Refactoring: improve luaCreateFunction() API. The function in its initial form, and after the fixes for the PSYNC2 bugs, required code duplication in multiple spots. This commit modifies it in order to always compute the script name independently, and to return the SDS of the SHA of the body: this way it can be used in all the places, including for SCRIPT LOAD, without duplicating the code to create the Lua function name. Note that this requires to re-compute the body SHA1 in the case of EVAL seeing a script for the first time, but this should not change scripting performance in any way because new scripts definition is a rare event happening the first time a script is seen, and the SHA1 computation is anyway not a very slow process against the typical Redis script and compared to the actual Lua byte compiling of the body. Note that the function used to assert() if a duplicated script was loaded, however actually now two times over three, we want the function to handle duplicated scripts just fine: this happens in SCRIPT LOAD and in RDB AUX "lua" loading. Moreover the assert was not defending against some obvious failure mode, so now the function always tests against already defined functions at start.	2017-12-04 11:25:20 +01:00
antirez	65a9740fa8	Fix loading of RDB files lua AUX fields when the script is defined. In the case of slaves loading the RDB from master, or in other similar cases, the script is already defined, and the function registering the script should not fail in the assert() call.	2017-12-01 16:01:10 +01:00
antirez	9bb18e5438	Streams: XRANGE REV option -> XREVRANGE command.	2017-12-01 10:24:25 +01:00
antirez	01ea018c40	Streams: export iteration API.	2017-12-01 10:24:24 +01:00
antirez	19b06935d5	Streams: fix XADD API and keyspace notifications. XADD was suboptimal in the first incarnation of the command, not being able to accept an ID (very useful for replication), nor options for having capped streams. The keyspace notification for streams was not implemented.	2017-12-01 10:24:24 +01:00
antirez	6468cb2e82	Streams: fix XREAD ready-key signaling. With lists we need to signal only on key creation, but streams can provide data to clients listening at every new item added. To make this slightly more efficient we now track different classes of blocked clients to avoid signaling keys when there is nobody listening. A typical case is when the stream is used as a time series DB and accessed only by range with XRANGE.	2017-12-01 10:24:24 +01:00
antirez	2cacdcd6f8	Streams: XREAD related code to serve blocked clients.	2017-12-01 10:24:24 +01:00
antirez	110041825c	Streams: XREAD get-keys method.	2017-12-01 10:24:24 +01:00
antirez	4086dff477	Streams: augment client.bpop with XREAD specific fields.	2017-12-01 10:24:24 +01:00
antirez	f80dfbf464	Streams: more internal preparation for blocking XREAD.	2017-12-01 10:24:24 +01:00
antirez	4a377cecd8	Streams: initial work to use blocking lists logic for streams XREAD.	2017-12-01 10:24:24 +01:00
antirez	439120c620	Streams: implement stream object release.	2017-12-01 10:24:24 +01:00
antirez	ec9bbe96bf	Streams: XLEN command.	2017-12-01 10:24:24 +01:00
antirez	100d43c1ac	Streams: assign value of 6 to OBJ_STREAM + some refactoring.	2017-12-01 10:24:24 +01:00
antirez	79866a6361	Streams: 12 commits squashed into the initial Streams implementation.	2017-12-01 10:24:24 +01:00
antirez	f11a7585a8	PSYNC2: Save Lua scripts state into RDB file. This is currently needed in order to fix #4483, but this can be useful in other contexts, so maybe later we may want to remove the conditionals and always save/load scripts. Note that we are using the "lua" AUX field here, in order to guarantee backward compatibility of the RDB file. The unknown AUX fields must be discarded by past versions of Redis.	2017-11-30 18:37:52 +01:00
zhaozhao.zz	43be967690	networking: optimize unlinkClient() in freeClient()	2017-11-30 18:11:05 +08:00
Itamar Haber	59d52f7fab	Standardizes the 'help' subcommand This adds a new `addReplyHelp` helper that's used by commands when returning a help text. The following commands have been touched: DEBUG, OBJECT, COMMAND, PUBSUB, SCRIPT and SLOWLOG. WIP Fix entry command table entry for OBJECT for HELP option. After #4472 the command may have just 2 arguments. Improve OBJECT HELP descriptions. See #4472. WIP 2 WIP 3	2017-11-28 21:15:45 +02:00
zhaozhao.zz	583c314725	LFU: do some changes about LFU to find hotkeys. Firstly, use the access time to replace the decrease time of LFU. The function LFUDecrAndReturn should only try to get the decremented counter, not update the LFU fields; we will update them in an explicit way. And we will halve the counter a number of times according to how many multiples of server.lfu_decay_time have elapsed. Every time a key is accessed, we should update the LFU, including updating the access time, and increment the counter after calling the function LFUDecrAndReturn. If a key is overwritten, the LFU should also be updated. Then we can use the `OBJECT freq` command to get a key's frequency, and LFUDecrAndReturn should be called in the `OBJECT freq` command in case the key has not been accessed for a long time, because we update the access time only when the key is read or overwritten.	2017-11-27 18:39:22 +01:00
zhaozhao.zz	53cea97204	LFU: change lfu* parameters to int	2017-11-27 18:38:55 +01:00
antirez	e74f0aa6d1	Fix replication of SLAVEOF inside transaction. In Redis 4.0 replication, with the introduction of PSYNC2, masters and slaves replicate commands to cascading slaves and to the replication backlog itself in a different way compared to the past. Masters actually replicate the effects of client commands. Slaves just propagate what they receive from masters. This mechanism can cause problems when the configuration of an instance is changed from master to slave inside a transaction. For instance we could send to a master instance the following sequence: MULTI SLAVEOF 127.0.0.1 0 EXEC SLAVEOF NO ONE Before the fixes in this commit, the MULTI command used to be propagated into the replication backlog, however after the SLAVEOF command the instance is a slave, so the EXEC implementation failed to also propagate the EXEC command. When the slaves of the above instance reconnected, they were incrementally synchronized just sending a "MULTI". This put the master client (in the slaves) into MULTI state, breaking the replication. Notably even Redis Sentinel uses the above approach in order to guarantee that configuration changes are always performed together with rewrites of the configuration and with clients disconnection. Sentinel does: MULTI SLAVEOF ... CONFIG REWRITE CLIENT KILL TYPE normal EXEC So this was a really problematic issue. However even with the fix in this commit, that will add the final EXEC to the replication stream in case the instance was switched from master to slave during the transaction, the result would be to increment the slave replication offset, so a successive reconnection with the new master, will not permit a successful partial resynchronization: no way the new master can provide us with the backlog needed, we incremented our offset to a value that the new master cannot have.
However the EXEC implementation waits to emit the MULTI, so that if the commands inside the transaction actually do not need to be replicated, no commands propagation happens at all. From multi.c: if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) { execCommandPropagateMulti(c); must_propagate = 1; } The above code is already modified by this commit you are reading. Now also ADMIN commands do not trigger the emission of MULTI. It is actually not clear why we do not just check for CMD_WRITE... Probably I wrote it this way in order to make the code more reliable: better to over-emit MULTI than not emitting it in time. So this commit should indeed fix issue #3836 (verified), however it looks like some reconsideration of this code path is needed in the long term. BONUS POINT: The reverse bug. Even in a read only slave "B", in a replication setup like: A -> B -> C There are commands without the READONLY nor the ADMIN flag, that are also not flagged as WRITE commands. An example is just the PING command. So if we send B the following sequence: MULTI PING SLAVEOF NO ONE EXEC The result will be the reverse bug, where only EXEC is emitted, but not the previous MULTI. However this apparently does not create problems in practice but it is yet another acknowledgment of the fact some work is needed here in order to make this code path less surprising. Note that there are many different approaches we could follow. For instance MULTI/EXEC blocks containing administrative commands may be allowed ONLY if all the commands are administrative ones, otherwise they could be denied. When allowed, the commands could simply never be replicated at all.	2017-07-12 11:07:28 +02:00
antirez	fc7ecd8d35	AOF check utility: ability to check files with RDB preamble.	2017-07-10 13:38:23 +02:00
antirez	51ffd062d3	Modules: DEBUG DIGEST interface.	2017-07-06 11:04:46 +02:00
antirez	f8547e53f0	Added GEORADIUS(BYMEMBER)_RO variants for read-only operations. Issue #4084 shows how, due to a design error, GEORADIUS is a write command because of the STORE option. Because of this it does not work on readonly slaves, gets redirected to masters in Redis Cluster even when the connection is in READONLY mode and so forth. To break backward compatibility at this stage, with Redis 4.0 to be in advanced RC state, is problematic for the user base. The API can be fixed into the unstable branch soon if we'll decide to do so in order to be more consistent, and release Redis 5.0 with this incompatibility in the future. This is still unclear. However, the ability to scale GEO queries in slaves easily is too important so this commit adds two read-only variants to the GEORADIUS and GEORADIUSBYMEMBER command: GEORADIUS_RO and GEORADIUSBYMEMBER_RO. The commands are exactly as the original commands, but they do not accept the STORE and STOREDIST options.	2017-06-30 10:03:37 +02:00
antirez	365dd037dc	RDB modules values serialization format version 2. The original RDB serialization format was not parsable without the module loaded, because the structure was managed only by the module itself. Moreover RDB is a streaming protocol in the sense that it is both produced in an append-only fashion, and is also sometimes directly sent to the socket (in the case of diskless replication). The fact that modules values cannot be parsed without the relevant module loaded is a problem in many ways: RDB checking tools must have loaded modules even for doing things not involving the value at all, like splitting an RDB into N RDBs by key or alike, or just checking the RDB for sanity. In theory module values could be just a blob of data with a prefixed length in order for us to be able to skip it. However prefixing the values with a length would mean one of the following: 1. To be able to write some data at a previous offset. This breaks streaming. 2. To buffer values before outputting them. This breaks performance. 3. To have some chunked RDB output format. This breaks simplicity. Moreover, the above solution, still makes module values a totally opaque matter, with the following problems: 1. The RDB check tool can just skip the value without being able to at least check the general structure. For datasets composed mostly of modules values this means to just check the outer level of the RDB not actually doing any check on most of the data itself. 2. It is not possible to do any recovering or processing of data for which a module no longer exists in the future, or is unknown. So this commit implements a different solution. The modules RDB serialization API is composed of well defined calls to store integers, floats, doubles or strings. After this commit, the parts generated by the module API have a one-byte prefix for each of the above emitted parts, and there is a final EOF byte as well.
So even if we don't know exactly how to interpret a module value, we can always parse it at a high level, check the overall structure, understand the types used to store the information, and easily skip the whole value. The change is backward compatible: older RDB files can be still loaded since the new encoding has a new RDB type: MODULE_2 (of value 7). The commit also implements the ability to check RDB files for sanity taking advantage of the new feature.	2017-06-27 13:19:16 +02:00
xuzhou	530fcf8687	Fix set with ex/px option when propagated to aof	2017-06-16 17:51:38 +08:00
Qu Chen	4740424049	Implement getKeys procedure for georadius and georadiusbymember commands.	2017-06-14 18:15:48 +02:00
antirez	1f598fc2bb	Modules TSC: use atomic var for server.unixtime. This avoids Helgrind complaining, but we are actually not using atomicGet() to get the unixtime value for now: too many places where it is used and given that time_t is word-sized it should be safe in all the archs we support as it is. On the other hand, Helgrind, when Redis is compiled with "make helgrind" in order to force the __sync macros, will detect the write in updateCachedTime() as a read (because atomic functions are used) and will not complain about races. This commit also includes minor refactoring of mutex initializations and a "helgrind" target in the Makefile.	2017-05-10 10:04:16 +02:00
antirez	9390c384b8	Modules TSC: Add mutex for server.lruclock. Only useful for when no atomic builtins are available.	2017-05-09 16:32:49 +02:00
antirez	ece658713b	Modules TSC: Improve inter-thread synchronization. More work to do with server.unixtime and similar. Need to write Helgrind suppression file in order to suppress the false positives.	2017-05-09 11:57:09 +02:00
antirez	3fcf959e60	Modules TSC: Release the GIL for all the time we are blocked. Instead of giving the module background operations just a small time to run in the beforeSleep() function, we can have the lock released for all the time we are blocked in the multiplexing syscall.	2017-05-03 11:26:21 +02:00
antirez	59b06b14c9	Modules TSC: GIL and cooperative multi tasking setup.	2017-04-28 18:41:10 +02:00
antirez	22be435efe	Fix PSYNC2 incomplete command bug as described in #3899 . This bug was discovered by @kevinmcgehee and constituted a major hidden bug in the PSYNC2 implementation, caused by the propagation from the master of incomplete commands to slaves. The bug had several results: 1. Borrowing from Kevin text in the issue: "Given that slaves blindly copy over their master's input into their own replication backlog over successive read syscalls, it's possible that with large commands or small TCP buffers, partial commands are present in this buffer. If the master were to fail before successfully propagating the entire command to a slave, the slaves will never execute the partial command (since the client is invalidated) but will copy it to replication backlog which may relay those invalid bytes to its slaves on PSYNC2, corrupting the backlog and possibly other valid commands that follow the failover. Simple command boundaries aren't sufficient to capture this, either, because in the case of a MULTI/EXEC block, if the master successfully propagates a subset of the commands but not the EXEC, then the transaction in the backlog becomes corrupt and could corrupt other slaves that consume this data." 2. As identified by @yangsiran later, there is another effect of the bug. For the same mechanism of the first problem, a slave having another slave, could receive a full resynchronization request with an already half-applied command in the backlog. Once the RDB is ready, it will be sent to the slave, and the replication will continue sending to the sub-slave the other half of the command, which is not valid. 
The fix, designed by @yangsiran and @antirez, and implemented by @antirez, uses a secondary buffer in order to feed the sub-masters and update the replication backlog and offsets, only when a given part of the query buffer is actually applied to the state of the instance, that is, when the command gets processed and the command is not pending in the Redis transaction buffer because of CLIENT_MULTI state. Given that now the backlog and offsets representation are in agreement with the actual processed commands, both issue 1 and 2 should no longer be possible. Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in identifying and designing a fix for this problem.	2017-04-19 10:25:45 +02:00
antirez	ffefc9f92d	Fix modules blocking commands awake delay. If a thread unblocks a client blocked in a module command, by using the RedisModule_UnblockClient() API, the event loop may not be awakened until the next timeout of the multiplexing API or the next unrelated I/O operation on other clients. We actually want the client to be served ASAP, so a mechanism is needed in order for the unblocking API to inform Redis that there is a client to serve ASAP. This commit fixes the issue using the old trick of the pipe: when a client needs to be unblocked, a byte is written in a pipe. When we run the list of clients blocked in modules, we consume all the bytes written in the pipe. Writes and reads are performed inside the context of the mutex, so no race is possible in which we consume the bytes that are actually related to an awake request for a client that should still be put into the list of clients to unblock. It was verified that after the fix the server handles the blocked clients with the expected short delay. Thanks to @dvirsky for understanding there was such a problem and reporting it.	2017-04-10 09:33:21 +02:00
antirez	1409c545da	Cluster: hash slots tracking using a radix tree.	2017-03-27 16:37:22 +02:00
antirez	adeed29a99	Use SipHash hash function to mitigate HashDoS attempts. This change attempts to switch to a hash function which mitigates the effects of the HashDoS attack (denial of service attack trying to force data structures to worst case behavior) while at the same time providing Redis with a hash function that does not expect the input data to be word aligned, a condition no longer true now that sds.c strings have a variable length header. Note that it is possible sometimes that even using a hash function for which collisions cannot be generated without knowing the seed, special implementation details or the exposure of the seed in an indirect way (for example the ability to add elements to a Set and check the return in which Redis returns them with SMEMBERS) may make the attacker's life simpler in the process of trying to guess the correct seed, however the next step would be to switch to a log(N) data structure when too many items in a single bucket are detected: this seems like an overkill in the case of Redis. SPEED REGRESSION TESTS: In order to verify that switching from MurmurHash to SipHash had no impact on speed, a set of benchmarks involving fast insertion of 5 million keys was performed. The result shows Redis with SipHash in high pipelining conditions to be about 4% slower compared to using the previous hash function. However this could partially be related to the fact that the current implementation does not attempt to hash whole words at a time but reads single bytes, in order to have an output which is endian-neutral and at the same time working on systems where unaligned memory accesses are a problem. Further X86 specific optimizations should be tested, the function may easily get at the same level of MurmurHash2 if a few optimizations are performed.	2017-02-20 17:29:17 +01:00
antirez	53b8bf2c89	serverPanic(): allow printf() alike formatting. This is of great interest because it allows us to print information that could be useful when debugging, like in the following example: serverPanic("Unexpected encoding for object %d, %d", obj->type, obj->encoding);	2017-01-18 17:05:10 +01:00
antirez	636c693f44	Use const in modules types mem_usage method. As suggested by @itamarhaber.	2017-01-12 12:47:46 +01:00
antirez	6ad34a4b78	Defrag: not enabled by default. Error on CONFIG SET if not available.	2017-01-11 15:43:08 +01:00
oranagra	7aa9e6d2ae	active memory defragmentation	2016-12-30 03:37:52 +02:00
antirez	06bfeb482d	Only show Redis logo if logging to stdout / TTY. You can still force the logo in the normal logs. For motivations, check issue #3112. For me the reason is that actually the logo is nice to have in interactive sessions, but inside the logs kinda loses its usefulness, but for the ability of users to recognize restarts easily: for this reason the new startup sequence shows a one liner ASCII "wave" so that there is still a bit of visual clue. Startup logging was modified in order to log events in more obvious ways, and to log more events. Also certain important information is now easier to parse/grep since it is printed in field=value style. The option --always-show-logo in redis.conf was added, defaulting to no.	2016-12-19 16:41:47 +01:00
antirez	87538cb7fe	Switch PFCOUNT to LogLog-Beta algorithm. The new algorithm provides the same speed with a smaller error for cardinalities in the range 0-100k. Before switching, the new and old algorithm behavior was studied in details in the context of issue #3677. You can find a few graphs and motivations there.	2016-12-16 11:07:30 +01:00
Harish Murthy	c55e3fbae5	LogLog-Beta Algorithm support within HLL Config option to use LogLog-Beta Algorithm for Cardinality	2016-12-16 11:07:30 +01:00
antirez	ac61f90625	DEBUG: new "ziplist" subcommand added. Dumps a ziplist on stdout. The commit improves ziplistRepr() and adds a new debugging subcommand so that we can trigger the dump directly from the Redis API. This command capability was used while investigating issue #3684.	2016-12-16 09:02:50 +01:00
antirez	b6f871cf42	Writable slaves expires: fix leak in key tracking. We need to use a dictionary type that frees the key, since we copy the keys in the dictionary we use to track expires created in the slave side.	2016-12-13 16:27:13 +01:00
antirez	d1adc85aa6	INFO: show num of slave-expires keys tracked.	2016-12-13 16:02:29 +01:00
antirez	04542cff92	Replication: fix the infamous key leakage of writable slaves + EXPIRE. BACKGROUND AND USE CASE Redis slaves are normally read only, however they support a "writable" mode which is very handy when scaling reads on slaves, that actually need write operations in order to access data. For instance imagine having slaves replicating certain Sets keys from the master. When accessing the data on the slave, we want to perform intersections between such Sets values. However we don't want to intersect each time: to cache the intersection for some time often is a good idea. To do so, it is possible to set up a slave as a writable slave, and perform the intersection on the slave side, perhaps setting a TTL on the resulting key so that it will expire after some time. THE BUG Problem: in order to have a consistent replication, expiring of keys in Redis replication is up to the master, that synthesizes DEL operations to send in the replication stream. However slaves logically expire keys by hiding them from read attempts from clients so that if the master did not promptly send a DEL, the clients still see logically expired keys as non existing. Because slaves don't actively expire keys by actually evicting them but just masking from the POV of read operations, if a key is created in a writable slave, and an expire is set, the key will be leaked forever: 1. No DEL will be received from the master, which does not know about such a key at all. 2. No eviction will be performed by the slave, since it needs to disable eviction because it's up to masters, otherwise consistency of data is lost. THE FIX In order to fix the problem, the slave should be able to tag keys that were created in the slave side and have an expire set in some way. My solution involved using a unique additional dictionary created by the writable slave only if needed.
The dictionary is obviously keyed by the key name that we need to track: all the keys that are set with an expire directly by a client writing to the slave are tracked. The value in the dictionary is a bitmap of all the DBs where such a key name needs to be tracked, so that we can use a single dictionary to track keys in all the DBs used by the slave (actually this limits the solution to the first 64 DBs, but the default with Redis is to use 16 DBs). This solution allows to pay both a small complexity and CPU penalty, which is zero when the feature is not used, actually. The slave-side eviction is encapsulated in code which is not coupled with the rest of the Redis core, if not for the hook to track the keys. TODO I'm doing the first smoke tests to see if the feature works as expected: so far so good. Unit tests should be added before merging into the 4.0 branch.	2016-12-13 10:59:54 +01:00
antirez	71e8d15e49	Modules: change type registration API to use a struct of methods.	2016-11-30 11:14:01 +01:00
antirez	28c96d73b2	PSYNC2: Save replication ID/offset on RDB file. This means that stopping a slave and restarting it will still make it able to PSYNC with the master. Moreover the master itself will retain its ID/offset, in case it gets turned into a slave, or if a slave will try to PSYNC with it with an exactly updated offset (otherwise there is no backlog). This change was possible thanks to PSYNC v2 that makes saving the current replication state much simpler.	2016-11-10 12:35:29 +01:00
antirez	2669fb8364	PSYNC2: different improvements to Redis replication. The gist of the changes is that now, partial resynchronizations between slaves and masters (without the need of a full resync with RDB transfer and so forth), work in a number of cases when it was impossible in the past. For instance: 1. When a slave is promoted to master, the slaves of the old master can partially resynchronize with the new master. 2. Chained slaves (slaves of slaves) can be moved to replicate to other slaves or the master itself, without requiring a full resync. 3. The master itself, after being turned into a slave, is able to partially resynchronize with the new master, when it joins replication again. In order to obtain this, the following main changes were operated: * Slaves also take a replication backlog, not just masters. * Same stream replication for all the slaves and sub slaves. The replication stream is identical from the top level master to its slaves and is also the same from the slaves to their sub-slaves and so forth. This means that if a slave is later promoted to master, it has the same replication backlog, and can partially resynchronize with its slaves (that were previously slaves of the old master). * A given replication history is no longer identified by the `runid` of a Redis node. There is instead a `replication ID` which changes every time the instance has a new history no longer coherent with the past one. So, for example, slaves publish the same replication history of their master, however when they are turned into masters, they publish a new replication ID, but still remember the old ID, so that they are able to partially resynchronize with slaves of the old master (up to a given offset). * The replication protocol was slightly modified so that a new extended +CONTINUE reply from the master is able to inform the slave of a replication ID change.
* REPLCONF CAPA is used in order to notify masters that a slave is able to understand the new +CONTINUE reply. * The RDB file was extended with an auxiliary field that is able to select a given DB after loading in the slave, so that the slave can continue receiving the replication stream from the point it was disconnected without requiring the master to insert "SELECT" statements. This is useful in order to guarantee the "same stream" property, because the slave must be able to accumulate an identical backlog. * Slave pings to sub-slaves are now sent in a special form, when the top-level master is disconnected, in order not to interfere with the replication stream. We just use out of band "\n" bytes as in other parts of the Redis protocol. An old design document is available here: https://gist.github.com/antirez/ae068f95c0d084891305 However the implementation is not identical to the description because during the work to implement it, different changes were needed in order to make things work well.	2016-11-09 15:37:15 +01:00
antirez	c7a4e694ad	SWAPDB command. This new command swaps two Redis databases, so that immediately all the clients connected to a given DB will see the data of the other DB, and the other way around. Example: SWAPDB 0 1 This will swap DB 0 with DB 1. All the clients connected with DB 0 will immediately see the new data, exactly like all the clients connected with DB 1 will see the data that was formerly of DB 0. MOTIVATION AND HISTORY --- The command was recently demanded by Pedro Melo, but was suggested in the past multiple times, and always refused by me. The reason why it was asked: Imagine you have clients operating in DB 0. At the same time, you create a new version of the dataset in DB 1. When the new version of the dataset is available, you immediately want to swap the two views, so that the clients will transparently use the new version of the data. At the same time you'll likely destroy the DB 1 dataset (that contains the old data) and start to build a new version, to repeat the process. This is an interesting pattern, but the reason why I always opposed to implement this, was that FLUSHDB was a blocking command in Redis before Redis 4.0 improvements. Now we have FLUSHDB ASYNC that releases the old data in O(1) from the point of view of the client, to reclaim memory incrementally in a different thread. At this point, the pattern can really be supported without latency spikes, so I'm providing this implementation for the users to comment. In case a very compelling argument will be made against this new command it may be removed. BEHAVIOR WITH BLOCKING OPERATIONS --- If a client is blocking for a list in a given DB, after the swap it will still be blocked in the same DB ID, since this is the most logical thing to do: if I was blocked for a list push to list "foo", even after the swap I want still a LPUSH to reach the key "foo" in the same DB in order to unblock. 
However an interesting thing happens when a client is, for instance, blocked waiting for new elements in list "foo" of DB 0. Then the DB 0 and 1 are swapped with SWAPDB. However the DB 1 happened to have a list called "foo" containing elements. When this happens, this implementation can correctly unblock the client. It is possible that there are subtle corner cases that are not covered in the implementation, but since the command is self-contained from the POV of the implementation and the Redis core, it cannot cause anything bad if not used. Tests and documentation are yet to be provided.	2016-10-14 15:28:04 +02:00
antirez	8fadfe52a2	Module: API to block clients with threading support. Just a draft to align the main ideas, never executed code. Compiles.	2016-10-07 11:55:35 +02:00
antirez	799208de85	Fix name of misspelled function.	2016-10-06 17:10:47 +02:00
antirez	152c1b6802	Module: Ability to get context from IO context. It was noted by @dvirsky that it is not possible to use string functions when writing the AOF file. This is sometimes critical, since the command rewriting may need to be built in the context of the AOF callback; without access to the context, and given the limited types that the AOF production functions accept, this can be an issue. Moreover, there are other needs that we can't anticipate regarding the ability to use Redis Modules APIs using the context in order to build representations to emit AOF / RDB. Because of this, a new API was added that allows the user to get a temporary context from the IO context. The context, if obtained, is automatically released when the RDB / AOF callback returns. Calling the function multiple times always returns the same context, since it is invalid to have more than a single context.	2016-10-06 17:09:26 +02:00
antirez	e565632e59	Child -> Parent pipe for COW info transferring.	2016-09-19 13:45:20 +02:00
antirez	44e714a59c	MEMORY DOCTOR initial implementation.	2016-09-16 16:36:53 +02:00
antirez	d9325ac6c8	Provide percentage of memory peak used info.	2016-09-16 10:43:19 +02:00
oranagra	309c2bcd1b	add zmalloc used mem to DEBUG SDSLEN	2016-09-16 10:29:27 +02:00
antirez	e9629e148b	MEMORY command: HELP + dataset percentage (like in INFO).	2016-09-15 17:33:16 +02:00
antirez	bf2624ea99	C struct memoh renamed redisMemOverhead. API prototypes added.	2016-09-15 09:44:07 +02:00
antirez	8c84c962cf	MEMORY OVERHEAD implemented (using Oran Agra's initial implementation). This code was extracted from @oranagra PR #3223 and modified in order to provide only certain amounts of information compared to the original code. It was also moved from DEBUG to the newly introduced MEMORY command. Thanks to Oran for the implementation and the PR. It implements detailed memory usage stats that can be useful in both provisioning and troubleshooting memory usage in Redis.	2016-09-13 17:39:25 +02:00
antirez	89dec6921d	objectComputeSize(): estimate collections sampling N elements. For most tasks, we need the memory estimation to be O(1) by default. This commit also implements an initial MEMORY command. Note that objectComputeSize() takes the number of samples to check as argument, so MEMORY should be able to get the sample size as option to make precision VS CPU tradeoff tunable. Related to: PR #3223.	2016-09-13 10:28:23 +02:00
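The sampling idea in the objectComputeSize() message can be sketched as follows. This is a minimal illustration, not the Redis code: `estimate_size`, `size_of`, and `samples` are hypothetical names, and taking the first N elements is an assumption of this sketch (the commit message does not say how samples are chosen).

```python
def estimate_size(collection, size_of, samples=5):
    """Estimate the total size of a collection by measuring at most
    `samples` elements and extrapolating their average to all elements."""
    n = len(collection)
    if n == 0:
        return 0
    measured = 0
    total = 0
    for item in collection:
        if measured >= samples:
            break
        total += size_of(item)
        measured += 1
    # Average size of the sampled elements, scaled to the full length.
    return total / measured * n


# Cost is bounded by `samples`, regardless of how large the collection is.
uniform = ["ab"] * 1000
estimate = estimate_size(uniform, size_of=len, samples=5)
```

A larger `samples` value gives a more precise estimate on collections with uneven element sizes, at the cost of more CPU, which is the tradeoff the message says MEMORY should make tunable.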
antirez	feda52381d	RDB AOF preamble: WIP 2.	2016-08-09 16:41:40 +02:00
antirez	4426cb11e2	RDB AOF preamble: WIP 1.	2016-08-09 11:07:32 +02:00
antirez	a81a92ca2c	Security: Cross Protocol Scripting protection. This is an attempt at mitigating problems due to cross protocol scripting, an attack targeting services that use line-oriented protocols, like Redis, and that can accept HTTP requests as valid protocol by discarding the invalid parts and accepting the payloads sent, for example, via a POST request. For this to be effective, when we detect POST and Host: and terminate the connection asynchronously, the networking code was modified in order to never process further input. It was later verified that in a pipelined request containing a POST command, the subsequent commands are not executed.	2016-08-03 11:12:32 +02:00
antirez	55385f99de	Ability of slave to announce arbitrary ip/port to master. This feature is useful especially in deployments that use Sentinel to set up Redis HA, where the slave is executed with NAT or port forwarding, so that the auto-detected port/ip addresses, as listed in the "INFO replication" output of the master, or as provided by the "ROLE" command, don't match the real addresses at which the slave is reachable for connections.	2016-07-27 17:32:15 +02:00
antirez	0a628e5102	Avoid simultaneous RDB and AOF child process. This patch, written in collaboration with Oran Agra (@oranagra), is a companion to `780a8b1`. Together the two patches should prevent the AOF and RDB saving processes from being spawned at the same time. Previously, the conditions that could lead to two saving processes at the same time were: 1. When AOF is enabled via CONFIG SET and an RDB saving process is already active. 2. When the SYNC command decides to start an RDB saving process ASAP in order to serve a new slave that cannot partially resynchronize (but only if we have a disk target for replication; for diskless replication there is no such problem). Condition "1" is not very severe, but "2" can happen often and is definitely good at degrading Redis performance in an unexpected way. The two commits have the effect of always spawning RDB saves for replication in replicationCron() instead of attempting to start an RDB save synchronously. Moreover, when a BGSAVE or AOF rewrite must be performed, it is instead just postponed using flags, so that the operation is attempted ASAP. Finally, the BGSAVE command was modified to accept a SCHEDULE option so that, if an AOF rewrite is in progress and this option is given, the command no longer returns an error, but instead schedules an RDB rewrite operation for when it will be possible to start it.	2016-07-21 18:35:01 +02:00
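The postponing behavior described in the message above can be modeled with a flag that a periodic cron step checks. This is a toy sketch under stated assumptions, not Redis code: `ToySaver`, its methods, and the short reply strings ("started", "scheduled", "error") are hypothetical and do not reproduce the real server replies.

```python
class ToySaver:
    """Toy model (hypothetical): at most one saving child at a time."""

    def __init__(self):
        self.child = None            # "rdb", "aof", or None
        self.rdb_scheduled = False   # flag checked later by cron()

    def bgsave(self, schedule=False):
        if self.child == "rdb":
            return "error"           # a save is already running
        if self.child == "aof":
            if schedule:
                # Do not spawn a second child; remember the request instead.
                self.rdb_scheduled = True
                return "scheduled"
            return "error"
        self.child = "rdb"
        return "started"

    def child_finished(self):
        self.child = None

    def cron(self):
        # Periodic step: start the postponed save once no child is active.
        if self.child is None and self.rdb_scheduled:
            self.rdb_scheduled = False
            self.child = "rdb"


saver = ToySaver()
saver.child = "aof"                       # an AOF rewrite is in progress
plain = saver.bgsave()                    # refused without the option
scheduled = saver.bgsave(schedule=True)   # accepted, but postponed
saver.cron()                              # still busy: nothing starts
busy_child = saver.child
saver.child_finished()
saver.cron()                              # now the postponed save starts
```

The point of the design is that the decision to spawn a child is made in one place (the cron step), so two saving processes never overlap.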
antirez	2d5eb1f1a0	Volatile-ttl eviction policy implemented in terms of the pool. Precision of the eviction improved noticeably. Also this allows us to have a single code path for most eviction types.	2016-07-20 19:54:12 +02:00
antirez	6854c7b9ee	LFU: make counter log factor and decay time configurable.	2016-07-20 15:00:35 +02:00
antirez	5d07984c5d	LFU: Redis object level implementation. Implementation of LFU maxmemory policy for anything related to Redis objects. Still no actual eviction implemented.	2016-07-15 12:12:58 +02:00
antirez	e423f76e75	LRU: Make cross-database choices for eviction. The LRU eviction code used to make local choices: for each DB visited it selected the best key to evict. This was repeated for each DB. However, this means that there could be DBs with very frequently accessed keys that are targeted by the LRU algorithm while there are other DBs with many better candidates to evict. This commit attempts to fix this problem for the LRU policy. However, the TTL policy is still not fixed by this commit. The TTL policy will be fixed in a subsequent commit. This is an initial (partial because of TTL policy) fix for issue #2647.	2016-07-13 13:12:30 +02:00
antirez	965905c9f2	Move the struct evictionPoolEntry() into the only file using it. Local scope is always better when possible.	2016-07-12 12:22:38 +02:00


535 Commits