* [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
       [not found] ` <CAA93jw6Aj3Rcsm=Q=KZVrW_TGThVwu6pRAN3nNQ4tvSODY_zUg@mail.gmail.com>
@ 2016-05-06  4:35 ` Dave Taht
  2016-05-06  4:44   ` Jonathan Morton
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Taht @ 2016-05-06  4:35 UTC (permalink / raw)
  To: cake

this would be a pretty nifty feature for cake to have in this hostile universe.

---------- Forwarded message ----------
From: Dave Taht <dave.taht@gmail.com>
Date: Thu, May 5, 2016 at 11:33 AM
Subject: Re: [Codel] fq_codel_drop vs a udp flood
To: Jonathan Morton <chromatix99@gmail.com>
Cc: Roman Yeryomin <leroi.lists@gmail.com>, Eric Dumazet <eric.dumazet@gmail.com>, make-wifi-fast@lists.bufferbloat.net, "codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>, ath10k <ath10k@lists.infradead.org>

On Thu, May 5, 2016 at 9:59 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>> Having same (low) speeds.
>> So it didn't help at all :(
>
> Although the new “emergency drop” code is now dropping batches of consecutive packets, Codel is also still dropping individual packets in between these batches, probably at a high rate. Since all fragments of an original packet are required to reassemble it, but Codel doesn’t link related fragments when deciding to drop, each fragment lost in this way reduces throughput efficiency. Only a fraction of the original packets can be reassembled correctly, but the surviving (yet useless) fragments still occupy link capacity.

I could see an AQM dropper testing to see if it is dropping a frag, and then dropping any further fragments, also. We're looking at the IP headers anyway in that section of the code, and the decision to drop is (usually) rare, and fragments a PITA.

> This phenomenon is not Codel specific; I would also expect to see it on most other AQMs, and definitely on RED variants, including PIE. Fortunately for real traffic, it normally arises only on artificial traffic such as iperf runs with large UDP packets. Unfortunately for AQM advocates, iperf uses large UDP packets by default, and it is very easy to misinterpret the results unfavourably for AQM (as opposed to unfavourably for iperf).
>
> If you re-run the test with iperf set to a packet size compatible with the path MTU, you should see much better throughput numbers due to the elimination of fragmented packets. A UDP payload size of 1280 bytes is a safe, conservative figure for a normal MTU in the vicinity of 1500.
>
>> Limit of 1024 packets and 1024 flows is not wise I think.
>>
>> (If all buckets are in use, each bucket has a virtual queue of 1 packet,
>> which is almost the same than having no queue at all)
>
> This, while theoretically important in extreme cases with very large numbers of flows, is not relevant to the specific test in question.
>
>  - Jonathan Morton
>

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

^ permalink raw reply [flat|nested] 15+ messages in thread
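To put rough numbers on the point about oversized UDP payloads in the message above, here is a minimal sketch. It assumes a 20-byte IPv4 header with no options, an 8-byte UDP header, and (unrealistically) independent per-packet drops with a fixed probability; Codel's drops are not independent, so treat the second function as an illustration of the trend only. The function names are made up for this example.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def fragment_count(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram."""
    l4_bytes = udp_payload + UDP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return max(1, math.ceil(l4_bytes / per_fragment))


def delivery_probability(udp_payload: int, drop_prob: float, mtu: int = 1500) -> float:
    """Chance that every fragment of one datagram survives.

    Assumes each packet is dropped independently with probability drop_prob,
    which is a simplification made for this sketch.
    """
    return (1.0 - drop_prob) ** fragment_count(udp_payload, mtu)
```

With a 1500-byte MTU, a 1280-byte payload fits in a single packet, while an 8192-byte payload (an arbitrary size picked for illustration) needs six. At a 10% drop rate under the independence assumption, that takes the chance of a complete datagram from 0.9 down to roughly 0.53, even though the surviving fragments still consume link capacity.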
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06  4:35 ` [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood Dave Taht
@ 2016-05-06  4:44   ` Jonathan Morton
  2016-05-06  4:57     ` Dave Taht
  2016-05-06  8:49     ` moeller0
  0 siblings, 2 replies; 15+ messages in thread
From: Jonathan Morton @ 2016-05-06  4:44 UTC (permalink / raw)
  To: Dave Taht; +Cc: cake

> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote:
>
> this would be a pretty nifty feature for cake to have in this hostile universe.

Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly.

 - Jonathan Morton

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06  4:44   ` Jonathan Morton
@ 2016-05-06  4:57     ` Dave Taht
  2016-05-06  8:49     ` moeller0
  1 sibling, 0 replies; 15+ messages in thread
From: Dave Taht @ 2016-05-06  4:57 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake

On Thu, May 5, 2016 at 9:44 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote:
>>
>> this would be a pretty nifty feature for cake to have in this hostile universe.
>
> Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly.

No. At least in the iperf3 case you end up with 3 trailing fragments in their own queue for every first fragment in another queue. Nuking everything once in drop mode with "more fragments" set or a non-zero fragment offset field will do some good.

https://en.wikipedia.org/wiki/IPv4#Fragmentation_and_reassembly

In the netperf case (which does 64k fragments), even better. And against your typical fragmentation attack, dunno, but all in all it strikes me as a measurable win.

>
>  - Jonathan Morton
>

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

^ permalink raw reply [flat|nested] 15+ messages in thread
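The test Dave describes ("more fragments" set or a non-zero fragment offset) only needs the 16-bit flags/offset field of the IPv4 header. A minimal sketch follows, working on raw header bytes. `should_drop` and its `in_drop_mode` argument are hypothetical names for this illustration and do not correspond to anything in cake or fq_codel.

```python
import struct

IP_MF = 0x2000      # "more fragments" flag
IP_OFFSET = 0x1FFF  # 13-bit fragment offset, in units of 8 bytes


def is_fragment(ipv4_header: bytes) -> bool:
    """True if the packet is any piece of a fragmented datagram."""
    # The flags/offset field is the fourth 16-bit word, at byte offset 6.
    (frag_field,) = struct.unpack_from("!H", ipv4_header, 6)
    return bool(frag_field & (IP_MF | IP_OFFSET))


def should_drop(ipv4_header: bytes, in_drop_mode: bool) -> bool:
    """Hypothetical policy hook: once dropping, drop every fragment seen."""
    return in_drop_mode and is_fragment(ipv4_header)
```

A packet with only the Don't Fragment bit set (field value 0x4000) is not a fragment under this test. The check is stateless, which is its appeal: it never has to match a trailing fragment back to its leading one.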
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 4:44 ` Jonathan Morton 2016-05-06 4:57 ` Dave Taht @ 2016-05-06 8:49 ` moeller0 2016-05-06 9:00 ` David Lang 1 sibling, 1 reply; 15+ messages in thread From: moeller0 @ 2016-05-06 8:49 UTC (permalink / raw) To: Jonathan Morton; +Cc: Dave Täht, cake Hi Jonathan, > On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> wrote: > > >> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >> >> this would be a pretty nifty feature for cake to have in this hostile universe. > > Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly. But the receiver needs to be able to re-segment the fragments so all required information needs to be there; what about looking at src and dst address and the MF flag in the header as well as the fragment offset and scrape proto/port from the leading fragment and “virtually” apply it to all following fragments, that way cake will do the right thing. All of this might be too costly in implementation and computation to be feasible… Best Regards Sebastian > > - Jonathan Morton > > _______________________________________________ > Cake mailing list > Cake@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cake ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 8:49 ` moeller0 @ 2016-05-06 9:00 ` David Lang 2016-05-06 9:36 ` moeller0 2016-05-06 15:31 ` Stephen Hemminger 0 siblings, 2 replies; 15+ messages in thread From: David Lang @ 2016-05-06 9:00 UTC (permalink / raw) To: moeller0; +Cc: Jonathan Morton, cake [-- Attachment #1: Type: TEXT/PLAIN, Size: 1259 bytes --] On Fri, 6 May 2016, moeller0 wrote: > Hi Jonathan, > >> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> wrote: >> >> >>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >>> >>> this would be a pretty nifty feature for cake to have in this hostile universe. >> >> Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly. > > But the receiver needs to be able to re-segment the fragments so all required information needs to be there; what about looking at src and dst address and the MF flag in the header as well as the fragment offset and scrape proto/port from the leading fragment and “virtually” apply it to all following fragments, that way cake will do the right thing. All of this might be too costly in implementation and computation to be feasible… wait a minute here. If the fragments are going to go over the network as separate packets, each fragment must include source/dest ip and source/dest port, otherwise the recipient isn't going to be able to figure out what to do with it. David Lang ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 9:00 ` David Lang @ 2016-05-06 9:36 ` moeller0 2016-05-06 15:31 ` Stephen Hemminger 1 sibling, 0 replies; 15+ messages in thread From: moeller0 @ 2016-05-06 9:36 UTC (permalink / raw) To: David Lang; +Cc: Jonathan Morton, cake > On May 6, 2016, at 11:00 , David Lang <david@lang.hm> wrote: > > On Fri, 6 May 2016, moeller0 wrote: > >> Hi Jonathan, >> >>> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> wrote: >>>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >>>> this would be a pretty nifty feature for cake to have in this hostile universe. >>> Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly. >> >> But the receiver needs to be able to re-segment the fragments so all required information needs to be there; what about looking at src and dst address and the MF flag in the header as well as the fragment offset and scrape proto/port from the leading fragment and “virtually” apply it to all following fragments, that way cake will do the right thing. All of this might be too costly in implementation and computation to be feasible… > > wait a minute here. If the fragments are going to go over the network as separate packets, each fragment must include source/dest ip and source/dest port, otherwise the recipient isn’t going to be able to figure out what to do with it. That is what I thought as well, but as I understand now fragmentation happens on the IP level independent of the “payload” so fragmentation is all the same for UDP/TCP/ICMP. 
According to https://en.wikipedia.org/wiki/IPv4#Fragmentation_and_reassembly all packets in a fragment group should have the same IP identification value, so matching fragmented packets should be even easier, just use the SRCIP, DSTIP PROTOCOL IDENTIFICATION quadruple (all values that live in the IP header, or use these values to find the matching port from the first fragments protocol header… For sanity checking one might even require for all but the last packet to have the MF flag set and the fragment offsets to be monotonically increasing. But this will require to at least look at the MF flag to notice fragments at all… But I guess https://tools.ietf.org/html/rfc6864 says all of this more distinctively… Best Regards Sebastian > > David Lang ^ permalink raw reply [flat|nested] 15+ messages in thread
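Sebastian's idea of scraping the ports from the leading fragment and "virtually" applying them to the rest can be sketched as a small lookup table keyed on the (source, destination, protocol, identification) tuple. This is an illustration under several assumptions: the class and function names are invented here, the protocol is TCP or UDP (ports in the first four bytes after the IP header), fragments arrive in order, and entries never expire. A real implementation would need timeouts and would have to cope with reordering.

```python
import struct
from typing import Dict, NamedTuple, Optional, Tuple


class FragmentKey(NamedTuple):
    src: bytes
    dst: bytes
    protocol: int
    ident: int


def fragment_key(packet: bytes) -> FragmentKey:
    """Tuple that groups all fragments of one original datagram."""
    (ident,) = struct.unpack_from("!H", packet, 4)
    return FragmentKey(packet[12:16], packet[16:20], packet[9], ident)


class FragmentPortTable:
    """Remembers the ports seen in a leading fragment (no expiry: a sketch)."""

    def __init__(self) -> None:
        self._ports: Dict[FragmentKey, Tuple[int, int]] = {}

    def lookup(self, packet: bytes) -> Optional[Tuple[int, int]]:
        """Return (src_port, dst_port), or None for an unmatched fragment."""
        ihl = (packet[0] & 0x0F) * 4
        (frag_field,) = struct.unpack_from("!H", packet, 6)
        more, offset = frag_field & 0x2000, frag_field & 0x1FFF
        key = fragment_key(packet)
        if offset == 0:
            # Unfragmented packet or leading fragment: ports are present.
            ports = struct.unpack_from("!HH", packet, ihl)
            if more:
                self._ports[key] = ports
            return ports
        ports = self._ports.get(key)
        if ports is not None and not more:
            del self._ports[key]  # final fragment: forget the datagram
        return ports
```

The table is the cost Jonathan points to earlier in the thread: it is most of the bookkeeping that real reassembly needs, minus the payload buffering.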
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06  9:00       ` David Lang
  2016-05-06  9:36         ` moeller0
@ 2016-05-06 15:31         ` Stephen Hemminger
  2016-05-06 18:50           ` David Lang
  1 sibling, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2016-05-06 15:31 UTC (permalink / raw)
  To: David Lang; +Cc: moeller0, cake

On Fri, 6 May 2016 02:00:02 -0700 (PDT)
David Lang <david@lang.hm> wrote:

> On Fri, 6 May 2016, moeller0 wrote:
>
> > Hi Jonathan,
> >
> >> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> wrote:
> >>
> >>
> >>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote:
> >>>
> >>> this would be a pretty nifty feature for cake to have in this hostile universe.
> >>
> >> Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly.
> >
> > But the receiver needs to be able to re-segment the fragments so all required information needs to be there; what about looking at src and dst address and the MF flag in the header as well as the fragment offset and scrape proto/port from the leading fragment and “virtually” apply it to all following fragments, that way cake will do the right thing. All of this might be too costly in implementation and computation to be feasible…
>
> wait a minute here. If the fragments are going to go over the network as
> separate packets, each fragment must include source/dest ip and source/dest
> port, otherwise the recipient isn't going to be able to figure out what to do
> with it.
>
> David Lang

Fragments are reassembled by IP id, not src/dest port.
Only the first fragment has the L4 header with src/dest port,
all the rest are just data.

That is why most firewalls reassemble all packets (and then refragment as needed)
to allow matching on port values.

For several cases where flow information is necessary most code does:

    flowid = is_fragmented(ip) ? ip->id : hash(ip + tcp)

^ permalink raw reply [flat|nested] 15+ messages in thread
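Stephen's one-liner can be expanded into a runnable sketch. Two things here are my own assumptions rather than anything stated in the thread: the fragment branch mixes the addresses and protocol in with the IP ID (the pseudocode uses the ID alone), and `zlib.crc32` stands in for whatever flow hash a real qdisc would use. `flow_id` and `num_queues` are illustrative names.

```python
import struct
import zlib


def flow_id(packet: bytes, num_queues: int = 1024) -> int:
    """Pick a queue index for an IPv4 packet given as raw bytes."""
    ihl = (packet[0] & 0x0F) * 4
    ident, frag_field = struct.unpack_from("!HH", packet, 4)
    # Source address, destination address and protocol number.
    addrs_proto = packet[12:20] + packet[9:10]
    if frag_field & 0x3FFF:
        # MF set or non-zero offset: key on the IP ID, since only the
        # leading fragment carries the L4 header.
        key = addrs_proto + struct.pack("!H", ident)
    else:
        # Unfragmented: include the first four L4 bytes (the port pair).
        key = addrs_proto + packet[ihl:ihl + 4]
    return zlib.crc32(key) % num_queues
```

Because the leading fragment also has MF set, it takes the same branch as its trailing fragments, so every piece of one datagram lands in the same queue. The trade-off is that each fragmented datagram looks like its own flow, separate from the unfragmented traffic of the same connection.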
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 15:31 ` Stephen Hemminger @ 2016-05-06 18:50 ` David Lang 2016-05-06 18:53 ` Jonathan Morton 2016-05-06 23:14 ` Benjamin Cronce 0 siblings, 2 replies; 15+ messages in thread From: David Lang @ 2016-05-06 18:50 UTC (permalink / raw) To: Stephen Hemminger; +Cc: moeller0, cake [-- Attachment #1: Type: TEXT/PLAIN, Size: 2035 bytes --] On Fri, 6 May 2016, Stephen Hemminger wrote: > On Fri, 6 May 2016 02:00:02 -0700 (PDT) > David Lang <david@lang.hm> wrote: > >> On Fri, 6 May 2016, moeller0 wrote: >> >>> Hi Jonathan, >>> >>>> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> wrote: >>>> >>>> >>>>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >>>>> >>>>> this would be a pretty nifty feature for cake to have in this hostile universe. >>>> >>>> Yes, but difficult to implement since the trailing fragments lose the proto/port information, and thus get sorted into a different queue than the leading fragment. We would essentially need to implement the same tracking mechanisms as for actual reassembly. >>> >>> But the receiver needs to be able to re-segment the fragments so all required information needs to be there; what about looking at src and dst address and the MF flag in the header as well as the fragment offset and scrape proto/port from the leading fragment and “virtually” apply it to all following fragments, that way cake will do the right thing. All of this might be too costly in implementation and computation to be feasible… >> >> wait a minute here. If the fragments are going to go over the network as >> separate packets, each fragment must include source/dest ip and source/dest >> port, otherwise the recipient isn't going to be able to figure out what to do >> with it. >> >> David Lang > > Fragments are reassembled by IP id, not src/dest port. > Only the first fragment has the L4 header with src/dest port, > all the rest are just data. 
> > That is why most firewalls reassemble all packets (and then refragment as needed) > to allow matching on port values. actually, many firewalls do not reassemble packets, they pass packets through without reassembly. what IP id are you referring to? I don't remember any such field in the packet header. David Lang > For several cases where flow information is necessary most code does: > flowid = is_fragementd(ip) ? ip->id : hash(ip + tcp) > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06 18:50           ` David Lang
@ 2016-05-06 18:53             ` Jonathan Morton
  2016-05-06 19:14               ` David Lang
  2016-05-06 23:14             ` Benjamin Cronce
  1 sibling, 1 reply; 15+ messages in thread
From: Jonathan Morton @ 2016-05-06 18:53 UTC (permalink / raw)
  To: David Lang; +Cc: Stephen Hemminger, cake

> On 6 May, 2016, at 21:50, David Lang <david@lang.hm> wrote:
>
> what IP id are you referring to? I don't remember any such field in the packet header.

It’s the third halfword.

 - Jonathan Morton

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06 18:53             ` Jonathan Morton
@ 2016-05-06 19:14               ` David Lang
  2016-05-06 19:33                 ` Jonathan Morton
  0 siblings, 1 reply; 15+ messages in thread
From: David Lang @ 2016-05-06 19:14 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Stephen Hemminger, cake

On Fri, 6 May 2016, Jonathan Morton wrote:

>> On 6 May, 2016, at 21:50, David Lang <david@lang.hm> wrote:
>>
>> what IP id are you referring to? I don't remember any such field in the packet header.
>
> It’s the third halfword.

half a word is hardly enough to be unique across the Internet, anything that small would lead to lots of attacks that inserted garbage data into threads.

looking at the IP header, the minimum size header includes fragment offset, source and destination IP addresses, and I'd bet a lot of money that every fragment of TCP/UDP includes the port numbers as well because there is just not enough info in the 20 byte header to identify what it matches with.

and I don't see this field you are talking about.

http://www.erg.abdn.ac.uk/users/gorry/course/inet-pages/ip-packet.html

David Lang

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood
  2016-05-06 19:14               ` David Lang
@ 2016-05-06 19:33                 ` Jonathan Morton
  2016-05-06 19:54                   ` David Lang
  0 siblings, 1 reply; 15+ messages in thread
From: Jonathan Morton @ 2016-05-06 19:33 UTC (permalink / raw)
  To: David Lang; +Cc: Stephen Hemminger, cake

> On 6 May, 2016, at 22:14, David Lang <david@lang.hm> wrote:
>
> On Fri, 6 May 2016, Jonathan Morton wrote:
>
>>> On 6 May, 2016, at 21:50, David Lang <david@lang.hm> wrote:
>>>
>>> what IP id are you referring to? I don't remember any such field in the packet header.
>>
>> It’s the third halfword.
>
> half a word is hardly enough to be unique across the Internet, anything that small would lead to lots of attacks that inserted garbage data into threads.

It doesn’t need to be globally unique. It merely identifies, in conjunction with src/dst address pair (so 80 bits in total), a particular sequence of fragments to be reassembled into the original packet. If the fourth halfword is zero (or has only the Don’t Fragment bit set), the IP ID field has no meaning. Hence the entire second word can be considered fragmentation related.

I agree that it’s not a very robust mechanism; it breaks under extensive packet reordering at high packet rates (circumstances which are probably showing up in iperf tests against flow-isolating AQMs). It would be better not to have fragmentation at the IP layer at all. But it’s not as bad as you say; it does work for low packet rates, which is all it was intended for.

Here’s my preferred reference diagram: https://nmap.org/book/tcpip-ref.html

 - Jonathan Morton

^ permalink raw reply [flat|nested] 15+ messages in thread
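For readers without the header diagram to hand, the "second word" Jonathan refers to (the third and fourth 16-bit halfwords, bytes 4 to 7 of the IPv4 header) can be decoded as below. This is a minimal sketch; `FragInfo`, `parse_second_word` and `id_is_meaningful` are names chosen for this example only.

```python
import struct
from typing import NamedTuple

# src address (32) + dst address (32) + IP ID (16), as counted in the message
REASSEMBLY_KEY_BITS = 32 + 32 + 16


class FragInfo(NamedTuple):
    ident: int            # third halfword: IP identification
    dont_fragment: bool   # DF flag in the fourth halfword
    more_fragments: bool  # MF flag in the fourth halfword
    offset_bytes: int     # fragment offset, converted from 8-byte units


def parse_second_word(ipv4_header: bytes) -> FragInfo:
    """Decode bytes 4..7 of an IPv4 header."""
    ident, flags_offset = struct.unpack_from("!HH", ipv4_header, 4)
    return FragInfo(
        ident=ident,
        dont_fragment=bool(flags_offset & 0x4000),
        more_fragments=bool(flags_offset & 0x2000),
        offset_bytes=(flags_offset & 0x1FFF) * 8,
    )


def id_is_meaningful(info: FragInfo) -> bool:
    """The ID only matters when the packet is actually a fragment."""
    return info.more_fragments or info.offset_bytes != 0
```

A fragment offset field of 185 decodes to 1480 bytes, which is where the second fragment of a datagram starts on a 1500-byte MTU path with a 20-byte IP header.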
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 19:33 ` Jonathan Morton @ 2016-05-06 19:54 ` David Lang 2016-05-06 19:58 ` David Lang 0 siblings, 1 reply; 15+ messages in thread From: David Lang @ 2016-05-06 19:54 UTC (permalink / raw) To: Jonathan Morton; +Cc: Stephen Hemminger, cake [-- Attachment #1: Type: TEXT/PLAIN, Size: 2035 bytes --] On Fri, 6 May 2016, Jonathan Morton wrote: >> On 6 May, 2016, at 22:14, David Lang <david@lang.hm> wrote: >> >> On Fri, 6 May 2016, Jonathan Morton wrote: >> >>>> On 6 May, 2016, at 21:50, David Lang <david@lang.hm> wrote: >>>> >>>> what IP id are you referring to? I don't remember any such field in the packet header. >>> >>> It’s the third halfword. >> >> half a word is hardly enough to be unique across the Internet, anything that small would lead to lots of attackes that inserted garbage data into threads. > > It doesn’t need to be globally unique. It merely identifies, in conjunction with src/dst address pair (so 80 bits in total), a particular sequence of fragments to be reassembled into the original packet. If the fourth halfword is zero (or has only the Don’t Fragment bit set), the IP ID field has no meaning. Hence the entire second word can be considered fragmentation related. > > I agree that it’s not a very robust mechanism; it breaks under extensive packet reordering at high packet rates (circumstances which are probably showing up in iperf tests against flow-isolating AQMs). It would be better not to have fragmentation at the IP layer at all. But it’s not as bad as you say; it does work for low packet rates, which is all it was intended for. 
>
> Here’s my preferred reference diagram: https://nmap.org/book/tcpip-ref.html

rfc-6864 shows that this field is not used the way you think it is in practice (if it was, nobody would have been able to exceed 6.4Mbps)

Given all the things that can cause fragmentation on virtually every packet (tunnels/vpns), and the fact that having this be unique would restrict all traffic between a given source and destination to 6.4Mbps, I am extremely doubtful that it is used the way that rfc-6864 suggests (after all it's a recent RFC, 2013)

I know that I've looked at packet dumps that have shown fragmented data and seen the port numbers in the fragment headers.

I'd bet that in practice firewalls/etc ignore the IP ID field.

David Lang

^ permalink raw reply [flat|nested] 15+ messages in thread
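The 6.4Mbps figure quoted above can be approximated with back-of-envelope arithmetic: if every datagram between one source and destination needs a distinct 16-bit ID for as long as it might still be in the network, the ID space caps the packet rate. The 1500-byte datagram size and 2-minute maximum datagram lifetime used below are my assumptions about where the number comes from; check RFC 6864 itself for the exact derivation.

```python
def max_unique_id_rate_bps(id_bits: int = 16,
                           datagram_bytes: int = 1500,
                           lifetime_s: float = 120.0) -> float:
    """Highest bit rate at which IDs never repeat within one lifetime.

    All three defaults are assumptions made for this sketch.
    """
    distinct_ids = 2 ** id_bits
    return distinct_ids * datagram_bytes * 8 / lifetime_s
```

With the defaults this gives 6,553,600 bit/s, which is in the same range as the quoted figure; the exact value depends on how a megabit is counted.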
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 19:54 ` David Lang @ 2016-05-06 19:58 ` David Lang 0 siblings, 0 replies; 15+ messages in thread From: David Lang @ 2016-05-06 19:58 UTC (permalink / raw) To: Jonathan Morton; +Cc: cake [-- Attachment #1: Type: TEXT/PLAIN, Size: 3055 bytes --] On Fri, 6 May 2016, David Lang wrote: > On Fri, 6 May 2016, Jonathan Morton wrote: > >>> On 6 May, 2016, at 22:14, David Lang <david@lang.hm> wrote: >>> >>> On Fri, 6 May 2016, Jonathan Morton wrote: >>> >>>>> On 6 May, 2016, at 21:50, David Lang <david@lang.hm> wrote: >>>>> >>>>> what IP id are you referring to? I don't remember any such field in the >>>>> packet header. >>>> >>>> It’s the third halfword. >>> >>> half a word is hardly enough to be unique across the Internet, anything >>> that small would lead to lots of attackes that inserted garbage data into >>> threads. >> >> It doesn’t need to be globally unique. It merely identifies, in >> conjunction with src/dst address pair (so 80 bits in total), a particular >> sequence of fragments to be reassembled into the original packet. If the >> fourth halfword is zero (or has only the Don’t Fragment bit set), the IP ID >> field has no meaning. Hence the entire second word can be considered >> fragmentation related. >> >> I agree that it’s not a very robust mechanism; it breaks under extensive >> packet reordering at high packet rates (circumstances which are probably >> showing up in iperf tests against flow-isolating AQMs). It would be better >> not to have fragmentation at the IP layer at all. But it’s not as bad as >> you say; it does work for low packet rates, which is all it was intended >> for. 
>>
>> Here’s my preferred reference diagram:
>> https://nmap.org/book/tcpip-ref.html
>
> rfc-6864 shows that this field is not used the way you think it is in
> practice (if it was, nobody would have been able to exceed 6.4Mbps)
>
> Given all the things that can cause fragmentation on virtually every packet
> (tunnels/vpns), and the fact that having this be unique would restrict all
> traffic between a given source and destination to 6.4Mbps, I am extremely
> doubtful that it is used the way that rfc-6864 suggests (after all it's a
> recent RFC, 2013)
>
> I know that I've looked at packet dumps that have shown fragmented data and
> seen the port numbers in the fragment headers.
>
> I'd bet that in practice firewalls/etc ignore the IP ID field.

from rfc-6864

   Many current devices support fragmentation that ignores the IPv4 Don't Fragment (DF) bit. Such devices already transit traffic from sources that reuse the ID. If fragments of different datagrams reusing the same ID (within the source address/destination address/protocol tuple) arrive at the destination interleaved, fragmentation would fail and traffic would be dropped. Either such interleaving is uncommon or traffic from such devices is not widely traversing these DF-ignoring devices, because significant occurrence of reassembly errors has not been reported. DF-ignoring devices do not comply with existing standards, and it is not feasible to update the standards to allow them as compliant.

They ignore the possibility that the OS reassembly is doing something different than they are thinking.

David Lang

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 18:50 ` David Lang 2016-05-06 18:53 ` Jonathan Morton @ 2016-05-06 23:14 ` Benjamin Cronce 2016-05-07 2:09 ` David Lang 1 sibling, 1 reply; 15+ messages in thread From: Benjamin Cronce @ 2016-05-06 23:14 UTC (permalink / raw) To: David Lang; +Cc: Stephen Hemminger, cake [-- Attachment #1: Type: text/plain, Size: 2923 bytes --] The good ones do. You need to reassemble the packets if you want to enforce proper stateful TCP. I wonder how those new network stacks that use MSS to send packets directly to a specific core will handle fragments, since they need all packets for a flow to get assigned to the same core, which means L3/L4 must hash to the same value, and no L4 for later fragments. Unless all fragmented packets get handled on a specific core, like ICMP. On Fri, May 6, 2016 at 1:50 PM, David Lang <david@lang.hm> wrote: > On Fri, 6 May 2016, Stephen Hemminger wrote: > > On Fri, 6 May 2016 02:00:02 -0700 (PDT) >> David Lang <david@lang.hm> wrote: >> >> On Fri, 6 May 2016, moeller0 wrote: >>> >>> Hi Jonathan, >>>> >>>> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> >>>>> wrote: >>>>> >>>>> >>>>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >>>>>> >>>>>> this would be a pretty nifty feature for cake to have in this hostile >>>>>> universe. >>>>>> >>>>> >>>>> Yes, but difficult to implement since the trailing fragments lose the >>>>> proto/port information, and thus get sorted into a different queue than the >>>>> leading fragment. We would essentially need to implement the same tracking >>>>> mechanisms as for actual reassembly. 
>>>>> >>>> >>>> But the receiver needs to be able to re-segment the fragments >>>> so all required information needs to be there; what about looking at src >>>> and dst address and the MF flag in the header as well as the fragment >>>> offset and scrape proto/port from the leading fragment and “virtually” >>>> apply it to all following fragments, that way cake will do the right thing. >>>> All of this might be too costly in implementation and computation to be >>>> feasible… >>>> >>> >>> wait a minute here. If the fragments are going to go over the network as >>> separate packets, each fragment must include source/dest ip and >>> source/dest >>> port, otherwise the recipient isn't going to be able to figure out what >>> to do >>> with it. >>> >>> David Lang >>> >> >> Fragments are reassembled by IP id, not src/dest port. >> Only the first fragment has the L4 header with src/dest port, >> all the rest are just data. >> >> That is why most firewalls reassemble all packets (and then refragment as >> needed) >> to allow matching on port values. >> > > actually, many firewalls do not reassemble packets, they pass packets > through without reassembly. > > what IP id are you referring to? I don't remember any such field in the > packet header. > > David Lang > > > For several cases where flow information is necessary most code does: >> flowid = is_fragementd(ip) ? ip->id : hash(ip + tcp) >> >> > _______________________________________________ > Cake mailing list > Cake@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cake > > [-- Attachment #2: Type: text/html, Size: 4297 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Cake] Fwd: [Codel] fq_codel_drop vs a udp flood 2016-05-06 23:14 ` Benjamin Cronce @ 2016-05-07 2:09 ` David Lang 0 siblings, 0 replies; 15+ messages in thread From: David Lang @ 2016-05-07 2:09 UTC (permalink / raw) To: Benjamin Cronce; +Cc: Stephen Hemminger, cake [-- Attachment #1: Type: TEXT/PLAIN, Size: 3138 bytes --] On Fri, 6 May 2016, Benjamin Cronce wrote: > The good ones do. You need to reassemble the packets if you want to enforce > proper stateful TCP. I wonder how those new network stacks that use MSS to > send packets directly to a specific core will handle fragments, since they > need all packets for a flow to get assigned to the same core, which means > L3/L4 must hash to the same value, and no L4 for later fragments. Unless > all fragmented packets get handled on a specific core, like ICMP. I remember a big fuss 10 or so years ago with a bunch of firewall vulnerabilities where people could get creative with fragments and bypass the firewall rules. > On Fri, May 6, 2016 at 1:50 PM, David Lang <david@lang.hm> wrote: > >> On Fri, 6 May 2016, Stephen Hemminger wrote: >> >> On Fri, 6 May 2016 02:00:02 -0700 (PDT) >>> David Lang <david@lang.hm> wrote: >>> >>> On Fri, 6 May 2016, moeller0 wrote: >>>> >>>> Hi Jonathan, >>>>> >>>>> On May 6, 2016, at 06:44 , Jonathan Morton <chromatix99@gmail.com> >>>>>> wrote: >>>>>> >>>>>> >>>>>> On 6 May, 2016, at 07:35, Dave Taht <dave.taht@gmail.com> wrote: >>>>>>> >>>>>>> this would be a pretty nifty feature for cake to have in this hostile >>>>>>> universe. >>>>>>> >>>>>> >>>>>> Yes, but difficult to implement since the trailing fragments lose the >>>>>> proto/port information, and thus get sorted into a different queue than the >>>>>> leading fragment. We would essentially need to implement the same tracking >>>>>> mechanisms as for actual reassembly. 
>>>>>> >>>>> >>>>> But the receiver needs to be able to re-segment the fragments >>>>> so all required information needs to be there; what about looking at src >>>>> and dst address and the MF flag in the header as well as the fragment >>>>> offset and scrape proto/port from the leading fragment and “virtually” >>>>> apply it to all following fragments, that way cake will do the right thing. >>>>> All of this might be too costly in implementation and computation to be >>>>> feasible… >>>>> >>>> >>>> wait a minute here. If the fragments are going to go over the network as >>>> separate packets, each fragment must include source/dest ip and >>>> source/dest >>>> port, otherwise the recipient isn't going to be able to figure out what >>>> to do >>>> with it. >>>> >>>> David Lang >>>> >>> >>> Fragments are reassembled by IP id, not src/dest port. >>> Only the first fragment has the L4 header with src/dest port, >>> all the rest are just data. >>> >>> That is why most firewalls reassemble all packets (and then refragment as >>> needed) >>> to allow matching on port values. >>> >> >> actually, many firewalls do not reassemble packets, they pass packets >> through without reassembly. >> >> what IP id are you referring to? I don't remember any such field in the >> packet header. >> >> David Lang >> >> >> For several cases where flow information is necessary most code does: >>> flowid = is_fragementd(ip) ? ip->id : hash(ip + tcp) >>> >>> >> _______________________________________________ >> Cake mailing list >> Cake@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/cake >> >> > ^ permalink raw reply [flat|nested] 15+ messages in thread