[Cake] [PATCH net-next 1/5] sch_cake: fix IP protocol handling in the presence of VLAN tags

Toke Høiland-Jørgensen toke at redhat.com
Fri Jun 26 08:52:36 EDT 2020

Davide Caratti <dcaratti at redhat.com> writes:

> hello,
> my 2 cents:
> On Thu, 2020-06-25 at 21:53 +0200, Toke Høiland-Jørgensen wrote:
>> I think it depends a little on the use case; some callers actually care
>> about the VLAN tags themselves and handle that specially (e.g.,
>> act_csum).
> I remember taht something similar was discussed about 1 year ago [1].

Ah, thank you for the pointer!

>> Whereas others (e.g., sch_dsmark) probably will have the same
>> issue.
> I'd say that the issue "propagates" to all qdiscs that mangle the ECN-CE
> bit (i.e., calling INET_ECN_set_ce() [2]), most notably all the RED
> variants and "codel/fq_codel".

Yeah, I think we should fix INET_ECN_set_ce() instead of re-implementing
it in CAKE. See below, though.

>>  I guess I can trying going through them all and figuring out if
>> there's a more generic solution.
> For sch_cake, I think that the qdisc shouldn't look at the IP header when
> it schedules packets having a VLAN tag.
> Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
> should look at the VLAN priority (PCP) bits (and that's something that
> cake doesn't do currently - but I have a small patch in my stash that
> implements this: please let me know if you are interested in seeing it :)
> ).
> Then, to ensure that the IP precedence is respected, even with different
> VLAN tags, users should explicitly configure TC filters that "map" the
> DSCP value to a PCP value. This would ensure that configured priority is
> respected by the scheduler, and would also be flexible enough to allow
> different "mappings".

I think you have this the wrong way around :)

I.e., classifying based on VLAN priority is even more esoteric than
using diffserv markings, so that should not be the default. Making it
the default would also make the behaviour change for the same traffic if
there's a VLAN tag present, which is bound to confuse people. I suppose
it could be an option, but not really sure that's needed, since as you
say you could just implement it with your own TC filters...

> Sure, my proposal does not cover the problem of mangling the CE bit
> inside VLAN-tagged packets, i.e. if we should understand if qdiscs
> should allow it or not.

Hmm, yeah, that's the rub, isn't it? I think this is related to this
commit, which first introduced tc_skb_protocol():

d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")

That commit at least made the behaviour consistent across
accelerated/non-accelerated VLANs. However, the patch description
asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
Looking at the various callers, I'm not actually sure that's true, in
the sense that most of the callers don't handle VLAN ethertypes at all,
but expects to find an IP header. This is the case for at least:

- act_ctinfo
- act_skbedit
- cls_flow
- em_ipset
- em_ipt
- sch_cake
- sch_dsmark

In fact the only caller that explicitly handles a VLAN ethertype seems
to be act_csum; and that handles it in a way that also just skips the
VLAN headers, albeit by skb_pull()'ing the header.

cls_api, em_meta and sch_teql don't explicitly handle it; but returning
the VLAN ethertypes to those does seem to make sense, since they just
pass the value somewhere else.

So my suggestion would be to add a new helper that skips the VLAN tags
and finds the L3 ethertype (i.e., basically cake_skb_proto() as
introduced in this patch), then switch all the callers listed above, as
well as the INET_ECN_set_ce() over to using that. Maybe something like
'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
can also find it?

Any objections to this? It's not actually clear to me how the discussion
you quoted above landed; but this will at least make things consistent
across all the different actions/etc.

Adding in Jiri and Jamal as well since they were involved in the patch I
quoted above.


More information about the Cake mailing list