The network side will keep its own statistics, e.g. in rlm_radius_udp.c. Except for a few RADIUS-specific things, these are largely protocol-independent:
typedef struct fr_stats_t {
	fr_uint_t total_requests;
	fr_uint_t total_responses;
	fr_uint_t total_invalid_requests;
	fr_uint_t total_dup_requests;
	fr_uint_t total_malformed_requests;
	fr_uint_t total_bad_authenticators;
	fr_uint_t total_packets_dropped;
	fr_uint_t total_unknown_types;
} fr_stats_t;
That is, any protocol-specific statistics (e.g. Access-Accept vs Access-Challenge) live in a separate, protocol-specific data structure, also in the proto_radius code.
The API to get these statistics can just be a method in app_io. The worker can grab these stats directly, even if the memory is on the network side. Since the statistics are only ever incremented, just doing a memcpy() of the structure should be OK. Any thread-safety issues can be ignored, as on x86-64, aligned 64-bit writes are atomic.
i.e. the stats API should memcpy() the stats from the network-side instance to a thread-local variable on the stack.
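The copy described above can be sketched roughly as follows. This is only an illustration: demo_stats_t is a cut-down stand-in for fr_stats_t, fr_uint_t is assumed to be a 64-bit unsigned integer, and net_instance_t and net_stats_snapshot() are hypothetical names, not the real app_io API.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: counters are 64-bit unsigned, as fr_uint_t is taken to be. */
typedef struct {
	uint64_t total_requests;
	uint64_t total_responses;
	uint64_t total_packets_dropped;
} demo_stats_t;

/* Hypothetical network-side instance that owns the live counters. */
typedef struct {
	demo_stats_t stats;
} net_instance_t;

/*
 * Hypothetical app_io-style method: copy the live counters into storage
 * owned by the caller (e.g. a variable on the worker's stack).
 * No lock is taken. The counters only ever increase, so a copy that
 * races with an update is at worst slightly stale.
 */
static void net_stats_snapshot(net_instance_t const *inst, demo_stats_t *out)
{
	memcpy(out, &inst->stats, sizeof(*out));
}
```

The worker would declare a demo_stats_t on its stack, call net_stats_snapshot(), and then read the copy at leisure while the network side keeps incrementing the original.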
Ideally, we allow the admin to track statistics by any metric: client IP, listener IP/port, home server IP/port, etc.
For home servers, this may mean adding stats capability to the RADIUS client module. My original idea was to have send Access-Request and recv Access-Accept sections, in which we could do things like list the stats module.
The current design has the RADIUS module just send packets directly, and add all of the reply attributes to request->reply. Along with that, the reply packet code is unavailable, too. This says to me that we should really make the radius module have its own sub-sections that it runs, which gives the admin greater control over the proxied requests and responses. That would also then allow home server stats to be trivially added: list stats in the appropriate send or recv section.
Ideally, we wouldn't need one stats module per statistic we want to keep. Instead, it should be possible to keep multiple statistics, e.g. by client IP, listener IP, etc.
The stats module needs to update multiple statistics in one pass. Ideally:
TBH, the simplest thing may be for the module to have an rbtree to track the stats. It looks up the thing (source / dest / virtual server), finds the stats, and increments them. The stats can then be queried by "SRC + key", or "DST + key", or "SERVER + key", or "global".
Since all of the src/dst data is (or should be) available in the request, this shouldn't be too hard to do. We don't need xlats or string parsing. We can just look at request->packet->src_ipaddr etc. directly.
For destination, we need to look at dst IP + dst port.
For dynamic clients, we need to look at NAS-IP-Address or NAS-Identifier, instead of src IP. That's likely an optional additional key for the stats module.
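The "tree keyed by SRC / DST / SERVER + key" idea could look something like the sketch below. It uses the POSIX tsearch() / tfind() binary tree from <search.h> purely as a stand-in for the server's own rbtree, and every type and function name here (stats_kind_t, stats_entry_t, stats_find_or_create()) is invented for illustration.

```c
#include <search.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Which dimension an entry tracks. Hypothetical names. */
typedef enum { STATS_SRC, STATS_DST, STATS_SERVER, STATS_GLOBAL } stats_kind_t;

typedef struct {
	stats_kind_t kind;
	uint8_t      addr[16];        /* address bytes; all zero when unused */
	uint16_t     port;            /* 0 when the key has no port */
	uint64_t     total_requests;
	uint64_t     total_responses;
} stats_entry_t;

/* Order entries by kind, then address, then port. Counters are not part of the key. */
static int stats_cmp(void const *a, void const *b)
{
	stats_entry_t const *x = a, *y = b;
	int rc;

	if (x->kind != y->kind) return (x->kind < y->kind) ? -1 : 1;
	rc = memcmp(x->addr, y->addr, sizeof(x->addr));
	if (rc != 0) return rc;
	return (x->port > y->port) - (x->port < y->port);
}

/* Find the entry for a key, creating a zeroed one on first use. NULL on allocation failure. */
static stats_entry_t *stats_find_or_create(void **root, stats_kind_t kind,
					   uint8_t const addr[16], uint16_t port)
{
	stats_entry_t key = { .kind = kind, .port = port };
	stats_entry_t *entry;
	void *node;

	memcpy(key.addr, addr, sizeof(key.addr));

	node = tfind(&key, root, stats_cmp);
	if (node) return *(stats_entry_t **)node;

	entry = malloc(sizeof(*entry));
	if (!entry) return NULL;
	*entry = key;                 /* counters start at zero */

	node = tsearch(entry, root, stats_cmp);
	if (!node) {
		free(entry);
		return NULL;
	}
	return entry;
}
```

For each packet, the module would call stats_find_or_create() once per dimension it tracks (source, destination, virtual server) and increment the counters on each returned entry, which gives the "multiple statistics in one pass" behaviour without any string parsing.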
It's probably best to have protocol-specific stats modules, as different statistics will need different things.
Via magic. virtual_servers.c should have a way to call a method across all threads of a module:
coalesce(module_global_data, module_thread_instance)
and then to pass the results back to a particular thread:
finish(module_global_data, module_thread_instance)
That way a module can request that all of the stats be merged from all threads.
We will ignore thread-safety again here, as locks are too expensive: the stats will be updated millions of times per second, but read only once per second. The reader doesn't care if the stats are off by 1, as there will always be "in flight" packets. So this approach is likely correct.
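A rough sketch of the merge step, under the same "no locks" assumption. The names demo_stats_t, demo_coalesce() and demo_coalesce_all() are hypothetical; in the real server the per-thread walk would be driven by whatever mechanism virtual_servers.c ends up providing, and the finish() step that hands the result back to the requesting thread is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
	uint64_t total_requests;
	uint64_t total_responses;
	uint64_t total_packets_dropped;
} demo_stats_t;

/*
 * Add one thread's counters into the merged result.
 * The thread's struct is copied first, without a lock, so the sums
 * are computed from a single (possibly slightly stale) snapshot.
 */
static void demo_coalesce(demo_stats_t *global, demo_stats_t const *thread)
{
	demo_stats_t snap;

	memcpy(&snap, thread, sizeof(snap));
	global->total_requests        += snap.total_requests;
	global->total_responses       += snap.total_responses;
	global->total_packets_dropped += snap.total_packets_dropped;
}

/* Zero the merged result, then fold in every thread's counters. */
static void demo_coalesce_all(demo_stats_t *global, demo_stats_t const *threads, size_t num_threads)
{
	size_t i;

	memset(global, 0, sizeof(*global));
	for (i = 0; i < num_threads; i++) demo_coalesce(global, &threads[i]);
}
```

Because each thread only ever writes its own struct and the reader only ever sums copies, the hot path stays lock-free; the cost is that a merged total can lag the true value by a few in-flight packets.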
We would likely need protocol-specific modules, so that the stats module could be listed in a recv Status-Server section. It would then look for queries, coalesce the stats, and update the reply as appropriate. This would involve largely taking the v3 stats.c code, and moving it into a module.
The server should only track "live" data. If someone wants historical data, they should query the server periodically.
It also means that the module should not clean up stats for dynamic clients / home servers. Due to data abstraction, the module cannot know if a client has gone away. All it knows is that the client is no longer sending packets.
This also means that there should be a way to delete stats for a particular thing (client, listener, etc.). So the action of a client expiring can (if the admin desires) result in the stats being deleted. Or, possibly better, the stats are cleared when a dynamic client is created.
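Both options (explicit delete, or reset on creation) are small operations on a keyed tree. The sketch below again uses POSIX tfind() / tdelete() as a stand-in for the server's rbtree, with a deliberately simplified key (a single IPv4 address); client_stats_t, client_stats_delete() and client_stats_reset() are hypothetical names.

```c
#include <search.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
	uint32_t src_ip;              /* key: client IPv4 address, host byte order */
	uint64_t total_requests;
	uint64_t total_responses;
} client_stats_t;

static int client_cmp(void const *a, void const *b)
{
	client_stats_t const *x = a, *y = b;

	return (x->src_ip > y->src_ip) - (x->src_ip < y->src_ip);
}

/* Remove and free the stats for one client. Returns false if there were none. */
static bool client_stats_delete(void **root, uint32_t src_ip)
{
	client_stats_t key = { .src_ip = src_ip };
	client_stats_t *entry;
	void *node;

	node = tfind(&key, root, client_cmp);
	if (!node) return false;

	entry = *(client_stats_t **)node;   /* remember the entry before unlinking it */
	tdelete(&key, root, client_cmp);
	free(entry);
	return true;
}

/* Alternative: keep the entry, but zero its counters when the dynamic client is (re)created. */
static void client_stats_reset(client_stats_t *entry)
{
	entry->total_requests = 0;
	entry->total_responses = 0;
}
```

Either call would be triggered from outside the stats module (by the admin, or by the code that creates and expires dynamic clients), since the module itself cannot tell that a client has gone away.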
SNMP requires an ID to index logical tables of statistics. This ID has to be contiguous and stable. I'm not sure how to do this, especially in the context of dynamic clients, where a client can come and go.
In v3, it just assigns a random number, and tells the caller to check the table contents (e.g. IP address, client ID) for client identity. This is probably the thing to do here.
We should produce stats in one form, and then rely on "unlang" to get them to different attributes as necessary, e.g. old-style stats, or SNMP stats.
Last edited by Arran Cudbard-Bell, 2018-06-13 22:51:48