This is an attempt to use the refcount feature of the sds.c fork
provided in the Pull Request #2509. A new type, SDS_TYPE_5 is introduced
having a one byte header with just the string length, without
information about the available additional length at the end of the
string (this means that sdsMakeRoomFor() will be required each time
we want to append something, since the string will always report to have
0 bytes available).
More work needed in order to avoid common SDS functions will pay the
cost of this type. For example both sdscatprintf() and sdscatfmt()
should try to upgrade to SDS_TYPE_8 ASAP when appending chars.
The command reports information about the hash table internal state
representing the specified database ID.
This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
The new return value is the number of keys existing, among the ones
specified in the command line, counting the same key multiple times if
given multiple times (and if it exists).
See PR #2667.
Rationale:
1. The commands look like internals exposed without a real strong use
case.
2. Whatever there is an use case, the client would implement the
commands client side instead of paying RTT just to use a simple to
reimplement library.
3. They add complexity to an otherwise quite straightforward API.
So for now KILLED ;-)
Instead of successive divisions in iteration the new code uses bitwise
magic to interleave / deinterleave two 32bit values into a 64bit one.
All tests still passing and is measurably faster, so worth it.
Stack traces produced by Redis on crash are the most useful tool we
have to fix non easily reproducible crashes, or even easily reproducible
ones where the user just posts a bug report and does not collaborate
furhter.
By declaring functions "static" they no longer show up in the stack
trace.
If GEOENCODE must be our door to enter the Geocoding implementation
details and do fancy things client side, than return the scores as well
so that we can query the sorted sets directly if we wish to do the same
search multiple times, or want to compute the boxes in the client side
to refine our search needs.
The GIS standard and all the major DBs implementing GIS related
functions take coordinates as x,y that is longitude,latitude.
It was a bad start for Redis to do things differently, so even if this
means that existing users of the Geo module will be required to change
their code, Redis now conforms to the standard.
Usually Redis is very backward compatible, but this is not an exception
to this rule, since this is the first Geo implementation entering the
official Redis source code. It is not wise to try to be backward
compatible with code forks... :-)
Close#2637.
The returned step was in some case not enough towards normal
coordinates (for example when our search position was was already near the
margin of the central area, and we had to match, using the east or west
neighbor, a very far point). Example:
geoadd points 67.575457940146066 -62.001317572780565 far
geoadd points 66.685439060295664 -58.925040587282297 center
georadius points 66.685439060295664 -58.925040587282297 200 km
In the above case the code failed to find a match (happens at smaller
latitudes too) even if far and center are at less than 200km.
Another fix introduced by this commit is a progressively larger area
towards the poles, since meridians are a lot less far away, so we need
to compensate for this.
The current implementation works comparably to the Tcl brute-force
stress tester implemented in the fuzzy test in the geo.tcl unit for
latitudes between -70 and 70, and is pretty accurate over +/-80 too,
with sporadic false negatives.
A more mathematically clean implementation is possible by computing the
meridian distance at the specified latitude and computing the step
according to it.
We set random points in the world, pick a random position, and check if
the returned points by Redis match the ones computed by Tcl by brute
forcing all the points using the distance between two points formula.
This approach is sounding since immediately resulted in finding a bug in
the original implementation.
1. We no longer use a fake client but just rewriting.
2. We group all the inserts into a single ZADD dispatch (big speed win).
3. As a side effect of the correct implementation, replication works.
4. The return value of the command is now correct.
This commit simplifies the implementation in a few ways:
1. zsetScore implementation improved a bit and moved into t_zset.c where
is now also used to implement the ZSCORE command.
2. Range extraction from the sorted set remains a separated
implementation from the one in t_zset.c, but was hyper-specialized in
order to avoid accumulating results into a list and remove the ones
outside the radius.
3. A new type is introduced: geoArray, which can accumulate geoPoint
structures in a vector with power of two expansion policy. This is
useful since we have to call qsort() against it before returning the
result to the user.
4. As a result of 1, 2, 3, the two files zset.c and zset.h are now
removed, including the function to merge two lists (now handled with
functions that can add elements to existing geoArray arrays) and
the machinery used in order to pass zset results.
5. geoPoint structure simplified because of the general code structure
simplification, so we no longer need to take references to objects.
6. Not counting the JSON removal the refactoring removes 200 lines of
code for the same functionalities, with a simpler to read
implementation.
7. GEORADIUS is now 2.5 times faster testing with 10k elements and a
radius resulting in 124 elements returned. However this is mostly a
side effect of the refactoring and simplification. More speed gains
can be achieved by trying to optimize the code.
For some reason the Geo PR included disabling the fact that Redis is
compiled with optimizations. Apparently it was just @mattsta attempt to
speedup the modify-compile-test iteration and there are no other
reasons.
This feature apparently is not going to be very useful, to send a
GEOADD+PUBLISH combo is exactly the same. One that would make a ton of
difference is the ability to subscribe to a position and a radius, and
get the updates in terms of objects entering/exiting the area.