Thanks Stuart, this is helpful.

I'm measuring the time just before the first write() (of a potential burst of writes, to reach a target burst size) on each socket fd's select() writable event, with TCP_NOTSENT_LOWAT set to a small value. On that same event I also sample the RTT and CWND, and produce histograms for all three. I'm not sure about the correctness of the RTT and CWND at this sample point.

This is a controlled test over 802.11ax with OFDMA, where the TCP ACKs from the WiFi clients are being scheduled by the AP using 802.11ax trigger frames, so the AP is affecting the end/end BDP by scheduling both the transmits and the ACKs. The AP can grow or shrink the BDP based on these scheduling decisions. From there we're trying to maximize network power (throughput/delay) for elephant flows, and just latency for mouse flows. (We also plan some RF frequency work via OFDMA.)

Anyway, AP-based scheduling along with aggregation and OFDMA makes WiFi scheduling optimums non-obvious - at least to me - and I'm trying to provide insight into how an AP affects end/end performance.
The more direct approach for e2e TCP latency and network power has been to measure from the first write() to the final read() and compute the e2e delay. This requires clock sync between the endpoints. (We're using ptp4l with GPS-disciplined OCXO/atomic references for that, but this is typically only available in some labs.)
Bob