Hello, I'm forwarding high-frequency UDP traffic (800,000 packets per minute) to 10 other destinations using TC_fanout. I have made all of these optimizations to the server, yet latency is still not where I want it to be. Are there any other settings, similar to disabling GRO and LRO, max cpu, rx tx off, and rx tx usecs 0, that I'm missing? The kernel is 5.15.0-177-generic.

The code works by intercepting incoming UDP packets on 2 specific ports and running them through a header rewrite engine that manually updates the Ethernet, IP, and UDP fields. It performs a ones' complement checksum update. To achieve the 1-to-10 fanout, the program uses bpf_clone_redirect, which creates packet copies and pushes them out through a bonded interface (bond0). For the other port, the code also uses bpf_skb_change_head to manually manage the packet's headroom before re-inserting the Ethernet layer, and finally drops the original packet with TC_ACT_SHOT once all ten clones have been dispatched.…
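For readers following along, here is a minimal userspace sketch of the kind of ones' complement checksum update described above. It is not my actual TC program. It assumes the incremental form from RFC 1624, HC' = ~(~HC + ~m + m'), applied to one 16-bit word at a time, and the helper names (`csum_fold`, `csum_full`, `csum_update16`) are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator down to 16 bits, adding carries back in
 * (the "end-around carry" of ones' complement arithmetic). */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Reference implementation: full checksum over len bytes, reading
 * 16-bit words in network (big-endian) order. The checksum field
 * inside the buffer must be zero when this is called. */
static uint16_t csum_full(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                       /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[i] << 8;
    return (uint16_t)~csum_fold(sum);
}

/* Incremental update (RFC 1624, eqn. 3): given the old checksum and one
 * 16-bit word changing from old_word to new_word, return the new
 * checksum without re-reading the rest of the header. */
static uint16_t csum_update16(uint16_t old_csum, uint16_t old_word,
                              uint16_t new_word)
{
    uint32_t sum = (uint16_t)~old_csum;

    sum += (uint16_t)~old_word;
    sum += new_word;
    return (uint16_t)~csum_fold(sum);
}
```

Rewriting a 32-bit IPv4 address means calling `csum_update16` twice, once per 16-bit half. For UDP over IPv4, a computed checksum of 0 is transmitted as 0xFFFF, and a stored value of 0 means the sender did not compute a checksum at all, so that case has to be handled separately. Inside a TC program, the kernel helpers `bpf_l3_csum_replace` and `bpf_l4_csum_replace` do the same job directly on the skb, and may be a simpler alternative to a hand-rolled update.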