Yes, I agree the assumptions are key here. One important aspect of this paper is that it focuses on the steady-state behavior of bulk flows.
Once you allow for short flows (like web pages, RPCs, etc.) to dynamically enter and leave a bottleneck, the considerations become different. As is well known, Reno/CUBIC will starve themselves if new flows enter and cause loss too frequently. For CUBIC, for a somewhat typical 30ms broadband path with a flow fair share of 25 Mbit/sec, if new flows enter and cause loss more frequently than roughly every 2 seconds then CUBIC will not be able to utilize its fair share. For a high-speed WAN path, with 100ms RTT and fair share of 10 Gbit/sec, if new flows enter and cause loss more frequently than roughly every 40 seconds then CUBIC will not be able to utilize its fair share. Basically, loss-based CC can starve itself in some very typical kinds of dynamic scenarios that happen in the real world.
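As a rough sanity check on loss-interval figures like these, here is a minimal Python sketch based on the average-window model in RFC 8312, Section 5.2. It is not necessarily the calculation behind the numbers quoted above. The constants (C = 0.4, beta = 0.7), the 1500-byte packet size, and the function name are assumptions chosen for illustration, and the model ignores CUBIC's Reno-friendly region, which can matter at small windows.

```python
# Sketch: minimum time between losses for CUBIC to sustain a target rate,
# using the average-window model from RFC 8312, Section 5.2.
# Assumptions (not from the email): C = 0.4, beta = 0.7, 1500-byte packets.

C = 0.4
BETA = 0.7
PACKET_BYTES = 1500


def min_loss_interval_sec(rate_bps, rtt_sec):
    """Seconds between loss events at which the modeled average CUBIC
    window just equals the window needed for rate_bps at rtt_sec."""
    # Window (in packets) needed to carry the target rate: one BDP.
    window = rate_bps * rtt_sec / (8 * PACKET_BYTES)
    # RFC 8312 model: W = k * (RTT / p)**0.75, with k as below.
    k = (C * (3 + BETA) / (4 * (1 - BETA))) ** 0.25
    # Solve for the per-packet loss probability p that yields that window.
    p = rtt_sec * (k / window) ** (4 / 3)
    # 1/p packets between losses, sent at window/rtt packets per second.
    return (1 / p) / (window / rtt_sec)


if __name__ == "__main__":
    for name, rate, rtt in [("broadband", 25e6, 0.030), ("WAN", 10e9, 0.100)]:
        print(f"{name}: {min_loss_interval_sec(rate, rtt):.1f} s")
```

Under these assumptions the model gives about 40 seconds for the 10 Gbit/sec, 100ms case and a few seconds for the 25 Mbit/sec, 30ms case. In this model the interval grows with the cube root of the required window, which is why the high-speed WAN path is so much less tolerant of loss.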
BBR is not trying to maintain a higher throughput than CUBIC in these kinds of scenarios with steady-state bulk flows. BBR is trying to be robust to the kinds of random packet loss that happen in the real world when there are flows dynamically entering/leaving a bottleneck.
cheers,
neal