1. We no longer use a fake client but just rewriting.
2. We group all the inserts into a single ZADD dispatch (big speed win).
3. As a side effect of the correct implementation, replication works.
4. The return value of the command is now correct.
This commit simplifies the implementation in a few ways:
1. zsetScore implementation improved a bit and moved into t_zset.c where
is now also used to implement the ZSCORE command.
2. Range extraction from the sorted set remains a separated
implementation from the one in t_zset.c, but was hyper-specialized in
order to avoid accumulating results into a list and remove the ones
outside the radius.
3. A new type is introduced: geoArray, which can accumulate geoPoint
structures in a vector with power of two expansion policy. This is
useful since we have to call qsort() against it before returning the
result to the user.
4. As a result of 1, 2, 3, the two files zset.c and zset.h are now
removed, including the function to merge two lists (now handled with
functions that can add elements to existing geoArray arrays) and
the machinery used in order to pass zset results.
5. geoPoint structure simplified because of the general code structure
simplification, so we no longer need to take references to objects.
6. Not counting the JSON removal the refactoring removes 200 lines of
code for the same functionalities, with a simpler to read
implementation.
7. GEORADIUS is now 2.5 times faster testing with 10k elements and a
radius resulting in 124 elements returned. However this is mostly a
side effect of the refactoring and simplification. More speed gains
can be achieved by trying to optimize the code.
Current todo:
- replace functions in zset.{c,h} with a new unified Redis
zset access API.
Once we get the zset interface fixed, we can squash
relevant commits in this branch and have one nice commit
to merge into unstable.
This commit adds:
- Geo commands
- Tests; runnable with: ./runtest --single unit/geo
- Geo helpers in deps/geohash-int/
- src/geo.{c,h} and src/geojson.{c,h} implementing geo commands
- Updated build configurations to get everything working
- TEMPORARY: src/zset.{c,h} implementing zset score and zset
range reading without writing to client output buffers.
- Modified linkage of one t_zset.c function for use in zset.c
Conflicts:
src/Makefile
src/redis.c
Bug as old as Redis and blocking operations. It's hard to trigger since
only happens on instance role switch, but the results are quite bad
since an inconsistency between master and slave is created.
How to trigger the bug is a good description of the bug itself.
1. Client does "BLPOP mylist 0" in master.
2. Master is turned into slave, that replicates from New-Master.
3. Client does "LPUSH mylist foo" in New-Master.
4. New-Master propagates write to slave.
5. Slave receives the LPUSH, the blocked client get served.
Now Master "mylist" key has "foo", Slave "mylist" key is empty.
Highlights:
* At step "2" above, the client remains attached, basically escaping any
check performed during command dispatch: read only slave, in that case.
* At step "5" the slave (that was the master), serves the blocked client
consuming a list element, which is not consumed on the master side.
This scenario is technically likely to happen during failovers, however
since Redis Sentinel already disconnects clients using the CLIENT
command when changing the role of the instance, the bug is avoided in
Sentinel deployments.
Closes#2473.
1. HVSTRLEN -> HSTRLEN. It's unlikely one needs the length of the key,
not clear how the API would work (by value does not make sense) and
there will be better names anyway.
2. Default is to return 0 when field is missing.
3. Default is to return 0 when key is missing.
4. The implementation was slower than needed, and produced unnecessary COW.
Related issue #2415.
Severan problems are addressed but still a few missing.
Since replication of this command was more complex than others since it
needs to replicate multiple SREM commands, an old API able to do this
was reused (it was taken inside the implementation since it was pretty
obvious soon or later that would be useful). The API was improved a bit
so that now a command may opt-out for the standard command replication
when the server.dirty counter is incremented, in order to "manually"
replicate what it wants.
This also makes it backward compatible in the usage, but for the command
name. However the old command name was less obvious so it is worth to
break it probably.
With the new setup the program main can perform argument parsing and
everything else useful for an RDB check regardless of the Redis server
itself.
redis-check-dump is now named redis-check-rdb and it runs
as a mode of redis-server instead of an independent binary.
You can now use 'redis-server redis.conf --check-rdb' to check
the RDB defined in redis.conf. Using argument --check-rdb
checks the RDB and exits. We could potentially also allow
the server to continue starting if the RDB check succeeds.
This change also enables us to use RDB checking programatically
from inside Redis for certain failure conditions.
read() and write() return ssize_t (signed long), not int.
For other offsets, we can use the unsigned size_t type instead
of a signed offset (since our replication offsets and buffer
positions are never negative).
Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.
Note that the structure layout is the same after the change, we are just
making the padding explicit with an additional not used 16 bits field.
So this commit is still able to talk with the previous versions of
cluster nodes.
Adds configuration option 'supervised [no | upstart | systemd | auto]'
Also removed 'bzero' from the previous implementation because it's 2015.
(We could actually statically initialize those structs, but clang
throws an invalid warning when we try, so it looks bad even though it
isn't bad.)
Fixes#2264
This removes:
- list-max-ziplist-entries
- list-max-ziplist-value
This adds:
- list-max-ziplist-size
- list-compress-depth
Also updates config file with new sections and updates
tests to use quicklist settings instead of old list settings.
This replaces individual ziplist vs. linkedlist representations
for Redis list operations.
Big thanks for all the reviews and feedback from everybody in
https://github.com/antirez/redis/pull/2143
Previously, many files had individual main() functions for testing,
but each required being compiled with their own testing flags.
That gets difficult when you have 8 different flags you need
to set just to run all tests (plus, some test files required
other files to be compiled aaginst them, and it seems some didn't
build at all without including the rest of Redis).
Now all individual test main() funcions are renamed to a test
function for the file itself and one global REDIS_TEST define enables
testing across the entire codebase.
Tests can now be run with:
- `./redis-server test <test>`
e.g. ./redis-server test ziplist
If REDIS_TEST is not defined, then no tests get included and no
tests are included in the final redis-server binary.
setTypeRandomElements() now returns unsigned long, and also uses unsigned long for anything related to count of members.
spopWithCountCommand() now uses unsigned long elements_returned instead of int, for values returned from setTypeRandomElements()
spopCommand() now runs spopWithCountCommand() in case the <count> param is found.
Added intsetRandomMembers() to Intset: Copies N random members from the set into inputted 'values' array. Uses either the Knuth or Floyd sample algos depending on ratio count/size.
Added setTypeRandomElements() to SET type: Returns a number of random elements from a non empty set. This is a version of setTypeRandomElement() that is modified in order to return multiple entries, using dictGetRandomKeys() and intsetRandomMembers().
Added tests for SPOP with <count>: unit/type/set, unit/scripting, integration/aof
--
Cleaned up code a bit to match with required Redis coding style
There is no standard cross-platform way of obtaining
system memory info, but I found a useful function
convering all common platforms. I removed support
for uncommon Redis platforms (windows, AIX) and left
others intact.
For more info, see:
http://nadeausoftware.com/articles/2012/09/c_c_tip_how_get_physical_memory_size_system
The system memory info is cached on startup, but some systems
may be able to change the amount of memory visible to Redis
at runtime if Redis is deployed in a VM or container.
Also see #1820
Track bandwidth used by clients and replication (but diskless
replication is not tracked since the actual transfer happens in the
child process).
This includes a refactoring that makes tracking new instantaneous
metrics simpler.
RDB EOF detection was relying on the final part of the RDB transfer to
be a magic 40 bytes EOF marker. However as the slave is put online
immediately, and because of sockets timeouts, the replication stream is
actually contiguous with the RDB file.
This means that to detect the EOF correctly we should either:
1) Scan all the stream searching for the mark. Sucks CPU-wise.
2) Start to send the replication stream only after an acknowledge.
3) Implement a proper chunked encoding.
For now solution "2" was picked, so the master does not start to send
ASAP the stream of commands in the case of diskless replication. We wait
for the first REPLCONF ACK command from the slave, that certifies us
that the slave correctly loaded the RDB file and is ready to get more
data.
Both upstart and systemd provide a way for daemons to
be supervised, as well as a mechanism for them to
signal their readyness status.
This patch provides compatibility with this functionality while
not interfering with other methods.
With this, it will be possible to use `expect stop` with upstart
and `Type=notify` with systemd.
A more detailed explanation of the mechanism can be found here:
http://spootnik.org/entries/2014/11/09_pid-tracking-in-modern-init-systems.html
We need to remember what is the saving strategy of the current RDB child
process, since the configuration may be modified at runtime via CONFIG
SET and still we'll need to understand, when the child exists, what to
do and for what goal the process was initiated: to create an RDB file
on disk or to write stuff directly to slave's sockets.