redict/tests/unit
Meir Shpilraien (Spielrein) aa856b39f2
Sort out the mess around Lua error messages and error stats (#10329)
This PR fix 2 issues on Lua scripting:
* Server error reply statistics (some errors were counted twice).
* Error code and error strings returning from scripts (error code was missing / misplaced).

## Statistics
a Lua script user is considered part of the user application, a sophisticated transaction,
so we want to count an error even if handled silently by the script, but when it is
propagated outwards from the script we don't wanna count it twice. on the other hand,
if the script decides to throw an error on its own (using `redis.error_reply`), we wanna
count that too.
Besides, we do count the `calls` in command statistics for the commands the script calls,
we we should certainly also count `failed_calls`.
So when a simple `eval "return redis.call('set','x','y')" 0` fails, it should count the failed call
to both SET and EVAL, but the `errorstats` and `total_error_replies` should be counted only once.

The PR changes the error object that is raised on errors. Instead of raising a simple Lua
string, Redis will raise a Lua table in the following format:

```
{
    err='<error message (including error code)>',
    source='<User source file name>',
    line='<line where the error happned>',
    ignore_error_stats_update=true/false,
}
```

The `luaPushError` function was modified to construct the new error table as describe above.
The `luaRaiseError` was renamed to `luaError` and is now simply called `lua_error` to raise
the table on the top of the Lua stack as the error object.
The reason is that since its functionality is changed, in case some Redis branch / fork uses it,
it's better to have a compilation error than a bug.

The `source` and `line` fields are enriched by the error handler (if possible) and the
`ignore_error_stats_update` is optional and if its not present then the default value is `false`.
If `ignore_error_stats_update` is true, the error will not be counted on the error stats.

When parsing Redis call reply, each error is translated to a Lua table on the format describe
above and the `ignore_error_stats_update` field is set to `true` so we will not count errors
twice (we counted this error when we invoke the command).

The changes in this PR might have been considered as a breaking change for users that used
Lua `pcall` function. Before, the error was a string and now its a table. To keep backward
comparability the PR override the `pcall` implementation and extract the error message from
the error table and return it.

Example of the error stats update:

```
127.0.0.1:6379> lpush l 1
(integer) 2
127.0.0.1:6379> eval "return redis.call('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value. script: e471b73f1ef44774987ab00bdf51f21fd9f7974a, on @user_script:1.

127.0.0.1:6379> info Errorstats
# Errorstats
errorstat_WRONGTYPE:count=1

127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_eval:calls=1,usec=341,usec_per_call=341.00,rejected_calls=0,failed_calls=1
cmdstat_info:calls=1,usec=35,usec_per_call=35.00,rejected_calls=0,failed_calls=0
cmdstat_lpush:calls=1,usec=14,usec_per_call=14.00,rejected_calls=0,failed_calls=0
cmdstat_get:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=1
```

## error message
We can now construct the error message (sent as a reply to the user) from the error table,
so this solves issues where the error message was malformed and the error code appeared
in the middle of the error message:

```diff
127.0.0.1:6379> eval "return redis.call('set','x','y')" 0
-(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'.
+(error) OOM command not allowed when used memory > 'maxmemory' @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479)
```

```diff
127.0.0.1:6379> eval "redis.call('get', 'l')" 0
-(error) ERR Error running script (call to f_8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1): @user_script:1: WRONGTYPE Operation against a key holding the wrong kind of value
+(error) WRONGTYPE Operation against a key holding the wrong kind of value script: 8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1, on @user_script:1.
```

Notica that `redis.pcall` was not change:
```
127.0.0.1:6379> eval "return redis.pcall('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value
```


## other notes
Notice that Some commands (like GEOADD) changes the cmd variable on the client stats so we
can not count on it to update the command stats. In order to be able to update those stats correctly
we needed to promote `realcmd` variable to be located on the client struct.

Tests was added and modified to verify the changes.

Related PR's: #10279, #10218, #10278, #10309

Co-authored-by: Oran Agra <oran@redislabs.com>
2022-02-27 13:40:57 +02:00
..
moduleapi Implemented module getchannels api and renamed channel keyspec (#10299) 2022-02-22 11:00:03 +02:00
type Add stream consumer group lag tracking and reporting (#9127) 2022-02-23 22:34:58 +02:00
acl-v2.tcl Set default channel permission to resetchannels for 7.0 (#10181) 2022-01-30 12:02:55 +02:00
acl.tcl Add tests for ACL command error cases (#10183) 2022-02-06 07:58:28 +02:00
aofrw.tcl Added AOF rewrite support for functions. (#10141) 2022-01-19 21:21:42 +02:00
auth.tcl Add AUTH arity test (#10266) 2022-02-09 22:09:20 +02:00
bitfield.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
bitops.tcl Change lzf to handle values larger than UINT32_MAX (#9776) 2021-11-16 13:12:25 +02:00
client-eviction.tcl introduce dynamic client reply buffer size - save memory on idle clients (#9822) 2022-02-22 11:19:38 +02:00
cluster.tcl Function Flags support (no-writes, no-cluster, allow-state, allow-oom) (#10066) 2022-01-14 14:02:02 +02:00
dump.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
expire.tcl sub-command support for ACL CAT and COMMAND LIST. redisCommand always stores fullname (#10127) 2022-01-23 10:05:06 +02:00
functions.tcl Fix wrong version calculation on Redis Function tests. (#10217) 2022-01-31 12:49:57 +02:00
geo.tcl Fix geo search bounding box check causing missing results (#10018) 2022-02-21 08:06:58 +02:00
hyperloglog.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
info-command.tcl Make INFO command variadic (#6891) 2022-02-08 13:14:42 +02:00
info.tcl Fix error stats and failed command stats for blocked clients (#10309) 2022-02-21 11:20:41 +02:00
introspection-2.tcl Sort out the mess around Lua error messages and error stats (#10329) 2022-02-27 13:40:57 +02:00
introspection.tcl introduce dynamic client reply buffer size - save memory on idle clients (#9822) 2022-02-22 11:19:38 +02:00
keyspace.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
latency-monitor.tcl sub-command support for ACL CAT and COMMAND LIST. redisCommand always stores fullname (#10127) 2022-01-23 10:05:06 +02:00
lazyfree.tcl attempt to fix tracking test issue with external tests due to lazy free (#9722) 2021-11-02 16:42:53 +02:00
limits.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
maxmemory.tcl Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies (#9462) 2022-01-05 14:01:05 +02:00
memefficiency.tcl Fix script active defrag test (#10318) 2022-02-21 09:37:25 +02:00
multi.tcl Fix timing issue in EXEC fail on lazy expired WATCHed key test (#10332) 2022-02-23 08:47:16 +02:00
networking.tcl Protected configs and sensitive commands (#9920) 2021-12-19 10:46:16 +02:00
obuf-limits.tcl Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies (#9462) 2022-01-05 14:01:05 +02:00
oom-score-adj.tcl Don't write oom score adj to proc unless we're managing it. (#9904) 2021-12-07 16:05:51 +02:00
other.tcl Fix crash when error [sub]command name contains | (#10082) 2022-01-09 13:06:51 +02:00
pause.tcl sub-command support for ACL CAT and COMMAND LIST. redisCommand always stores fullname (#10127) 2022-01-23 10:05:06 +02:00
pendingquerybuf.tcl Introduce memory management on cluster link buffers (#9774) 2021-12-16 21:56:59 -08:00
printver.tcl Print version info before running the test 2011-05-20 11:44:54 +02:00
protocol.tcl add test suite infra to test RESP3 attributes (#10247) 2022-02-07 00:10:05 +02:00
pubsub.tcl Connection leak in external tests. (#9777) 2021-11-15 11:07:43 +02:00
pubsubshard.tcl Sharded pubsub implementation (#8621) 2022-01-02 16:54:47 -08:00
querybuf.tcl Ignore resize threshold on idle qbuf resizing (#9322) 2021-08-06 20:50:34 +03:00
quit.tcl Add tests for OK on QUIT 2010-10-15 12:54:53 +02:00
replybufsize.tcl introduce dynamic client reply buffer size - save memory on idle clients (#9822) 2022-02-22 11:19:38 +02:00
scan.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
scripting.tcl Sort out the mess around Lua error messages and error stats (#10329) 2022-02-27 13:40:57 +02:00
shutdown.tcl Wait for replicas when shutting down (#9872) 2022-01-02 09:50:15 +02:00
slowlog.tcl Redact ACL SETUSER arguments if the user has spaces (#9935) 2021-12-13 08:39:04 -08:00
sort.tcl Add SORT_RO command (#9299) 2021-08-09 09:40:29 +03:00
tls.tcl Add support for reading encrypted keyfiles. (#8644) 2021-03-22 13:27:46 +02:00
tracking.tcl Solve issues with tracking test in external mode (#9726) 2021-11-02 16:07:51 -07:00
violations.tcl Fix possible int overflow when hashing an sds. (#9916) 2021-12-13 21:16:25 +02:00
wait.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00