[Bloat] DETNET

Ken Birman kpb3 at cornell.edu
Sun Nov 19 15:24:59 EST 2017


The Microsoft SIGCOMM paper on deploying RDMA ran into issues with PFC (Priority Flow Control) packets, which the RDMA technology uses to prevent buffer overruns: on Ethernet (RoCE), PFC relies on Enterprise VLAN functionality, and Microsoft disables that feature on their Azure routers.  They work around this by using a different feature of the router: modern datacenter routers support DiffServ, but few data centers use it.  So they enabled DiffServ, split TCP/IP off as one traffic class, and treated RDMA as a second traffic class.  They use DCQCN, which has its own RDMA flow control management, but as a fallback they still allow the hardware to generate PFC packets, using the DiffServ header field to encode the needed PFC "pause sending" information.  To do this they were forced to change some firmware; I have no idea whether this idea of theirs will become widely adopted.  Anyhow, with this scheme, the PFC packets protect against cases where DCQCN isn't fast enough to react.  This was all with 100Gbps RDMA.
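
To make the two-class split concrete, here is a rough toy model in Python of classifying packets by their DiffServ code point and pausing one class without touching the other.  It is only a sketch of the general idea, not what the paper or any router firmware actually does: the DSCP value chosen for RDMA, the class names, and the Port class are all made up for illustration, and real hardware tracks more priorities and uses pause timers rather than a simple flag.

```python
# Toy model: classify traffic by DSCP and pause one class independently.
# RDMA_DSCP is a hypothetical value; the post does not say which code
# points were actually used.
RDMA_DSCP = 26


def dscp_from_tos(tos: int) -> int:
    """Return the DSCP, i.e. the upper six bits of the IPv4 ToS byte."""
    return (tos >> 2) & 0x3F


def traffic_class(tos: int) -> str:
    """Map a ToS byte to one of two illustrative traffic classes."""
    return "rdma" if dscp_from_tos(tos) == RDMA_DSCP else "tcp"


class Port:
    """Hypothetical sender port with a per-class 'pause sending' flag."""

    def __init__(self) -> None:
        self.paused = {"rdma": False, "tcp": False}

    def on_pause(self, cls: str, pause: bool) -> None:
        # A pause request for one class leaves the other class alone.
        self.paused[cls] = pause

    def may_send(self, tos: int) -> bool:
        return not self.paused[traffic_class(tos)]


port = Port()
rdma_tos = RDMA_DSCP << 2          # DSCP sits in the top six bits
port.on_pause("rdma", True)        # receiver asks us to stop RDMA traffic
print(port.may_send(rdma_tos))     # RDMA class is held back
print(port.may_send(0))            # best-effort TCP/IP keeps flowing
```

The point of the per-class flag is that a pause aimed at RDMA does not stall ordinary TCP/IP traffic sharing the same link.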

If I understood your prioritized queuing use case better I might be able to point to something squarely on topic.  If this thread is about a specific scenario, maybe someone could point me to the OP where the scenario was first described?

