Every node in the path has to implement this to be effective.
It would certainly be optimal if every node implemented it. But any node can detect the endpoint round-trip time, and how it degrades, and thus adjust how fast it feeds packets into the network. And a midpoint can detect when a host is feeding packets too fast for downstream nodes, and send explicit congestion notification, and failing that, drop some packets from that source.