Historic archive of defunct list bloat-devel@lists.bufferbloat.net
* Hacking on the rtl8366S
@ 2011-06-03 12:28 Dave Taht
  2011-06-03 12:42 ` Juliusz Chroboczek
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Taht @ 2011-06-03 12:28 UTC (permalink / raw)
  To: Gabor Juhos, bloat-devel


[-- Attachment #1.1: Type: text/plain, Size: 2077 bytes --]

I sat down to play with the rtl8366S
switch code (which is the one in the wndr3700)
late last night.  Attached is as far as I got, merely

A) some new register definitions

B) support for getting the ip/mac addr of the switch

(I didn't remember the functions for printing them out of the kernel last
night)

C) changing the default buffer size in the switch to >9k.

I was pleased that I didn't blow anything up, and that it compiled and ran
the first time. For all I know, it did some good. It is already in the
cerowrt build at:

http://huchra.bufferbloat.net/~cerowrt/cerowrt-wndr3700/

which, of course, includes SFB and nearly every other debloating trick
we've come up with so far.


So I thought I would sit down and explain what I'm trying to accomplish:

0) By *increasing* the per-packet buffer size, I hope to significantly
reduce the enormous latencies (>100ms) I have been seeing in the switch.

In other words, I'm starving the switch of RAM it would otherwise use for
buffering.

(threads on this on the bloat lists, documentation on the bismark-testbed
wiki).

Gaining jumbo packet support out of it is just a bonus.

1) The switch has a mac address, which so far as I know, is unused. (the
switch is bridged to the wireless interfaces, normally)

I'd like to be ROUTING packets between wired and wireless, rather than
switching them - at least for now - because it's really hard to make sense
of some of the packet traces I've been seeing without separating the media
they've been running on.

2) *Really* want port mirroring to work - similar reasons to point 1, above.


3)  Interestingly, there is QoS on the switch, which supports diffserv. (as
well as vlan or port based prioritization)

4) Various forms of flow control should be enable-able

5) With better per-port link state info, it might be possible to throttle or
balance flows across those ports better.

6) There's probably other stuff worth doing
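
To put rough numbers on point 0: if the switch carves a fixed pool of packet
memory into slots sized for the maximum frame, then raising the maximum frame
size leaves fewer slots, and so a shorter worst-case queue. The sketch below
works through that arithmetic. The pool size, the one-packet-per-slot model,
and the function names are all assumptions for illustration; the real
RTL8366S buffer layout may well differ.

```c
#include <stdint.h>

/* Hypothetical model: a fixed pool of packet memory split into equal slots,
 * one queued packet per slot. Not taken from the RTL8366S datasheet. */
static uint32_t slots_in_pool(uint32_t pool_bytes, uint32_t slot_bytes)
{
	return slot_bytes ? pool_bytes / slot_bytes : 0;
}

/* Worst-case time, in microseconds, to drain a full queue of `slots` packets
 * of `pkt_bytes` each over a link running at `link_mbps` megabits/second.
 * Bits divided by Mbit/s gives microseconds directly. */
static uint64_t drain_time_us(uint32_t slots, uint32_t pkt_bytes,
			      uint32_t link_mbps)
{
	if (link_mbps == 0)
		return 0;
	return (uint64_t)slots * pkt_bytes * 8 / link_mbps;
}
```

With an assumed 128 KiB pool, 1536-byte slots give 85 slots while 16000-byte
slots give 8. At 100 Mbit/s, a full queue of 1500-byte packets would then
drain in about 10.2 ms in the first case and under 1 ms in the second.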

Open questions:

How to expose a QoS API to userspace?
Similarly, port mirroring...
How is the snmp mib interface supposed to work?
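
As a starting point for the port mirroring question, here is a minimal sketch
of how the PMCR value could be composed from the field definitions in the
attached patch. Reading the 0 and 4 in the patch as shift amounts for the two
4-bit port fields is an interpretation, and pmcr_value() is a hypothetical
helper, not something the driver provides.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

/* Field positions as defined in the attached patch. */
#define RTL8366S_MIRROR_SOURCE_PORT	0	/* assumed shift of SOURCE_PORT[3:0] */
#define RTL8366S_MIRROR_MONITOR_PORT	4	/* assumed shift of MONITOR_PORT[3:0] */
#define RTL8366S_MIRROR_RX		BIT(8)
#define RTL8366S_MIRROR_TX		BIT(9)
#define RTL8366S_MIRROR_SPC		BIT(10)
#define RTL8366S_MIRROR_ISO		BIT(11)

/* Hypothetical helper: build the 16-bit value to write to PMCR (0x0007). */
static uint16_t pmcr_value(unsigned int source, unsigned int monitor,
			   bool rx, bool tx, bool isolate)
{
	unsigned int v = 0;

	v |= (source & 0xf) << RTL8366S_MIRROR_SOURCE_PORT;
	v |= (monitor & 0xf) << RTL8366S_MIRROR_MONITOR_PORT;
	if (rx)
		v |= RTL8366S_MIRROR_RX;
	if (tx)
		v |= RTL8366S_MIRROR_TX;
	if (isolate)
		v |= RTL8366S_MIRROR_ISO;
	return (uint16_t)v;
}
```

Mirroring both directions of port 1 onto port 5 without isolation would then
be pmcr_value(1, 5, true, true, false), which works out to 0x0351. Whatever
userspace API ends up exposing this would only need to pass those five
parameters down.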


[-- Attachment #2: rtl8366_mac.patch --]
[-- Type: text/x-diff, Size: 4494 bytes --]

diff --git a/target/linux/generic/files/drivers/net/phy/rtl8366s.c b/target/linux/generic/files/drivers/net/phy/rtl8366s.c
index 3f3d6f6..972e9a2 100644
--- a/target/linux/generic/files/drivers/net/phy/rtl8366s.c
+++ b/target/linux/generic/files/drivers/net/phy/rtl8366s.c
@@ -20,7 +20,7 @@
 #include "rtl8366_smi.h"
 
 #define RTL8366S_DRIVER_DESC	"Realtek RTL8366S ethernet switch driver"
-#define RTL8366S_DRIVER_VER	"0.2.2"
+#define RTL8366S_DRIVER_VER	"0.2.3"
 
 #define RTL8366S_PHY_NO_MAX	4
 #define RTL8366S_PHY_PAGE_MAX	7
@@ -46,6 +46,57 @@
 #define RTL8366S_SSCR2				0x0004
 #define RTL8366S_SSCR2_DROP_UNKNOWN_DA		BIT(0)
 
+/* Port Mirroring */
+
+/* The RTL8366/RTL8369 supports one set of port mirroring 
+   functions for all 6/9 ports. User could
+   monitor both the TX and RX packets of the source port from a mirror port. 
+   The source port to be
+   mirrored can be selected in SOURCE_PORT[3:0] in Register PMCR (0x0007). 
+   The monitor port can be
+   selected in MONITOR_PORT[3:0] in Register PMCR (0x0007).
+   MIRROR_TX and MIRROR_RX in Register PMCR (0x0007) are used to select 
+   the TX or RX packets
+   of the source port to be mirrored. 
+   If MIRROR_ISO in Register PMCR (0x0007) is enabled, the monitor
+   port only forwards the TX or RX packets of the source port. 
+   Any other packets destined for the monitor port will be dropped. 
+   When MIRROR_SPC in Register PMCR (0x0007) is enabled,
+   Pause packets received by the source port will be forwarded to the 
+   monitor port.
+
+
+*/
+
+#define RTL8366S_PMCR 				0x0007
+
+#define RTL8366S_MIRROR_SOURCE_PORT 0
+#define RTL8366S_MIRROR_MONITOR_PORT 4
+#define RTL8366S_MIRROR_RX BIT(8)
+#define RTL8366S_MIRROR_TX BIT(9)
+#define RTL8366S_MIRROR_ISO BIT(11)
+#define RTL8366S_MIRROR_SPC BIT(10)
+
+/* QoS Control Registers */
+#define RTL8366S_QCR0				0x0009
+
+/* QCR1 controls port based priority */
+#define RTL8366S_QCR1				0x000A
+
+/* QCR2-5 are there for diffserv */
+#define RTL8366S_QCR2				0x000B
+#define RTL8366S_QCR3				0x000C
+#define RTL8366S_QCR4				0x000D
+#define RTL8366S_QCR5				0x000E
+
+#define RTL8366S_EN_QOS 			BIT(0)
+#define RTL8366S_EN_PORT_PRI 			BIT(2)
+#define RTL8366S_EN_DS_PRI 			BIT(3)
+#define RTL8366S_EN_1Q_PRI			BIT(4)
+
+/* bits 7&6 control the weight */
+#define RTL8366S_QUEUE_WEIGHT 7 
+
 #define RTL8366S_RESET_CTRL_REG			0x0100
 #define RTL8366S_CHIP_CTRL_RESET_HW		1
 #define RTL8366S_CHIP_CTRL_RESET_SW		(1 << 1)
@@ -55,6 +106,15 @@
 #define RTL8366S_CHIP_ID_REG			0x0105
 #define RTL8366S_CHIP_ID_8366			0x8366
 
+/* Switch MAC And IP Addr */
+
+#define RTL8366S_SMAR0				0x0046
+#define RTL8366S_SMAR1				0x0047
+#define RTL8366S_SMAR2				0x0048
+
+#define RTL8366S_SW_IP0				0x004D
+#define RTL8366S_SW_IP1			0x004E
+
 /* PHY registers control */
 #define RTL8366S_PHY_ACCESS_CTRL_REG		0x8028
 #define RTL8366S_PHY_ACCESS_DATA_REG		0x8029
@@ -263,9 +323,9 @@ static int rtl8366s_hw_init(struct rtl8366_smi *smi)
 			       pdata->initvals[i].val);
 	}
 
-	/* set maximum packet length to 1536 bytes */
+	/* set maximum packet length to 16000 bytes */
 	REG_RMW(smi, RTL8366S_SGCR, RTL8366S_SGCR_MAX_LENGTH_MASK,
-		RTL8366S_SGCR_MAX_LENGTH_1536);
+		RTL8366S_SGCR_MAX_LENGTH_16000);
 
 	/* enable learning for all ports */
 	REG_WR(smi, RTL8366S_SSCR0, 0);
@@ -690,6 +750,21 @@ static int rtl8366s_sw_set_learning_enable(struct switch_dev *dev,
 }
 
 
+static u32 rtl_8366s_get_ip(struct rtl8366_smi *smi) {
+  u32 a = 0, b = 0;	/* stay zero if a register read fails */
+  rtl8366_smi_read_reg(smi,RTL8366S_SW_IP0,&a);
+  rtl8366_smi_read_reg(smi,RTL8366S_SW_IP1,&b);
+  return(a | b << 16);
+}
+
+static u64 rtl_8366s_get_mac(struct rtl8366_smi *smi) {
+  u32 a = 0, b = 0, c = 0;	/* rtl8366_smi_read_reg() takes a u32 * */
+  rtl8366_smi_read_reg(smi,RTL8366S_SMAR0,&a);
+  rtl8366_smi_read_reg(smi,RTL8366S_SMAR1,&b);
+  rtl8366_smi_read_reg(smi,RTL8366S_SMAR2,&c);
+  return((u64)a | (u64)b << 16 | (u64)c << 32);
+}
+
 static const char *rtl8366s_speed_str(unsigned speed)
 {
 	switch (speed) {
@@ -1001,6 +1076,9 @@ static int rtl8366s_detect(struct rtl8366_smi *smi)
 {
 	u32 chip_id = 0;
 	u32 chip_ver = 0;
+	u32 ip = 0;
+	u64 mac = 0;
+
 	int ret;
 
 	ret = rtl8366_smi_read_reg(smi, RTL8366S_CHIP_ID_REG, &chip_id);
@@ -1027,6 +1105,11 @@ static int rtl8366s_detect(struct rtl8366_smi *smi)
 	dev_info(smi->parent, "RTL%04x ver. %u chip found\n",
 		 chip_id, chip_ver & RTL8366S_CHIP_VERSION_MASK);
 
+	ip = rtl_8366s_get_ip(smi);
+	mac = rtl_8366s_get_mac(smi);
+
+	dev_info(smi->parent, "Switch Mac: %012llx, Switch IP: %08x\n", (unsigned long long)mac, ip);
+
 	return 0;
 }
 

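On printing the addresses: a minimal userspace sketch of assembling the
48-bit MAC from three 16-bit register values and formatting both addresses
readably. The low-word-first combination follows the patch; which byte ends
up first on the wire is an assumption, and the function names are
illustrative. Inside the kernel, the %pM and %pI4 printk specifiers would do
the formatting, given the bytes in an array.

```c
#include <stdint.h>
#include <stdio.h>

/* Combine three 16-bit register values into a 48-bit MAC, low word first,
 * mirroring the patch's a | b << 16 | c << 32. */
static uint64_t mac_from_regs(uint16_t smar0, uint16_t smar1, uint16_t smar2)
{
	return (uint64_t)smar0 | (uint64_t)smar1 << 16 | (uint64_t)smar2 << 32;
}

/* Format as aa:bb:cc:dd:ee:ff, most significant byte first (assumed order). */
static void format_mac(uint64_t mac, char out[18])
{
	snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
		 (unsigned int)((mac >> 40) & 0xff),
		 (unsigned int)((mac >> 32) & 0xff),
		 (unsigned int)((mac >> 24) & 0xff),
		 (unsigned int)((mac >> 16) & 0xff),
		 (unsigned int)((mac >> 8) & 0xff),
		 (unsigned int)(mac & 0xff));
}

/* Format as dotted quad, most significant byte first (assumed order). */
static void format_ipv4(uint32_t ip, char out[16])
{
	snprintf(out, 16, "%u.%u.%u.%u",
		 (unsigned int)((ip >> 24) & 0xff),
		 (unsigned int)((ip >> 16) & 0xff),
		 (unsigned int)((ip >> 8) & 0xff),
		 (unsigned int)(ip & 0xff));
}
```

If the switch turns out to store the bytes in the opposite order, only the
shift sequence in the two format functions needs to change.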

* Re: Hacking on the rtl8366S
  2011-06-03 12:28 Hacking on the rtl8366S Dave Taht
@ 2011-06-03 12:42 ` Juliusz Chroboczek
  2011-06-03 14:32   ` Dave Taht
  0 siblings, 1 reply; 5+ messages in thread
From: Juliusz Chroboczek @ 2011-06-03 12:42 UTC (permalink / raw)
  To: Dave Taht; +Cc: Gabor Juhos, bloat-devel

> (the switch is bridged to the wireless interfaces, normally)

Are you sure about that?  The usual configuration is to use a hardware
switch between the wired ports, but bridge the wired and wireless ports
in software.  Can you post the output of brctl show?

At any rate, you should be able to program the switch to put each port
on a different vlan -- that's how the separation between LAN and WAN
ports is usually implemented.
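
A minimal sketch of what "each port on a different vlan" means in terms of
member masks: every VLAN contains exactly one LAN port plus the CPU-facing
port, so traffic between wired ports has to pass through the CPU. The port
numbering and the CPU port number here are assumptions, not values taken from
the driver.

```c
#include <stdint.h>

#define NUM_LAN_PORTS	4	/* assumed: four LAN ports, numbered 0-3 */
#define CPU_PORT	5	/* assumed CPU-facing port number */

/* Member mask for a VLAN holding exactly one LAN port plus the CPU port.
 * Bit n set means port n is a member. */
static uint8_t single_port_vlan_mask(unsigned int port)
{
	return (uint8_t)((1u << port) | (1u << CPU_PORT));
}

/* Fill `masks` with one member mask per LAN port (one VLAN each). */
static void build_per_port_vlans(uint8_t masks[NUM_LAN_PORTS])
{
	unsigned int port;

	for (port = 0; port < NUM_LAN_PORTS; port++)
		masks[port] = single_port_vlan_mask(port);
}
```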

-- Juliusz




* Re: Hacking on the rtl8366S
  2011-06-03 12:42 ` Juliusz Chroboczek
@ 2011-06-03 14:32   ` Dave Taht
  2011-06-04  3:49     ` Dave Taht
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Taht @ 2011-06-03 14:32 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Gabor Juhos, bloat-devel

[-- Attachment #1: Type: text/plain, Size: 3512 bytes --]

On Fri, Jun 3, 2011 at 6:42 AM, Juliusz Chroboczek <jch@pps.jussieu.fr> wrote:

> > (the switch is bridged to the wireless interfaces, normally)
>
> Are you sure about that?


Pretty sure. The mac addr obtained for the bridge appears to be derived from
the wireless chip. When I tried to break apart the wired and wireless
devices completely in my testing last week, I was unable to get the wired
interface to work at all without disabling the wireless, due to the lack of
a distinct mac for it (or so I thought)



> The usual configuration is to use a hardware
> switch between the wired ports, but bridge the wired and wireless ports
> in software.  Can you post the output of brctl show?
>
>
>
This is from last night's cerowrt build...

root@cero1:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br-lan          8000.c43dc7a37679       no              eth0.1
                                                        wlan0
                                                        wlan3

And the mac addr for eth0 is the same as for wlan0:

root@cero1:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr C4:3D:C7:A3:76:79
          inet6 addr: fe80::c63d:c7ff:fea3:7679/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3420 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:16
          RX bytes:0 (0.0 B)  TX bytes:460327 (449.5 KiB)
          Interrupt:4

root@cero1:~# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr C4:3D:C7:A3:76:7A
          inet addr:192.168.1.110  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::c63d:c7ff:fea3:767a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:118658 errors:0 dropped:0 overruns:0 frame:0
          TX packets:62344 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:5
          RX bytes:153610689 (146.4 MiB)  TX bytes:5861647 (5.5 MiB)
          Interrupt:5

root@cero1:~# ifconfig wlan0
wlan0     Link encap:Ethernet  HWaddr C4:3D:C7:A3:76:79
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3413 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:4
          RX bytes:0 (0.0 B)  TX bytes:506686 (494.8 KiB)



> At any rate, you should be able to program the switch to put each port
> on a different vlan -- that's how the separation between LAN and WAN
> ports is usually implemented.
>

Although that's an interesting idea, I wasn't planning to route each
individual wired port at this point - just to break apart the wired and
wireless interfaces enough to look at and optimize their behavior better.

The external interface (to the internet) runs through the switch (on a
dedicated port) and has its own phy, so far as I can tell.

The internal (to-the-switch) interface is just borrowing the wireless mac,
so far as I can tell, at present. That's basically all the wifi setup script
does.

There's a wiring diagram that more or less explains these oddities on pages
16 and 17 of:

rtl8366_8369_datasheet_1-1.pdf

which appears to be the most comprehensive document on this chipset series.
There is a mildly better diagram on the 1.4 data sheet specific to the
8366S.



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com


* Re: Hacking on the rtl8366S
  2011-06-03 14:32   ` Dave Taht
@ 2011-06-04  3:49     ` Dave Taht
  2011-06-04  3:54       ` Dave Taht
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Taht @ 2011-06-04  3:49 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Gabor Juhos, bloat-devel

[-- Attachment #1: Type: text/plain, Size: 3351 bytes --]

So I built a fresh build of openwrt from git head, patched in my preliminary
patch to the switch, compiled it with oprofile support, and turned on -O2
rather than -Os, in my usual approach to all-up testing that makes it hard
to track down what's really going on without backtracking.

The net result with iperf was better than before,
with iptables and qos turned entirely off, no nat, etc.
Pings stayed very flat and well below 8ms, down from 100ms.

elara: [  3]  0.0-60.0 sec   435 MBytes  60.7 Mbits/sec
io: [  3]  0.0-60.0 sec   495 MBytes  69.2 Mbits/sec
leda: [  3]  0.0-60.9 sec   226 MBytes  31.1 Mbits/sec
thebe: [  3]  0.0-60.0 sec   105 MBytes  14.7 Mbits/sec

(how all this stuff is routed is beyond the scope of this, but only
partially explains the discrepancies above)

Oprofiling a later run, we're spending a lot of time in an unaligned trap,
and I have no idea why iptables and conntrack even register.

I am not sure at the moment how to track down the source of the alignment
fault. I assume I can look at the performance counter somehow, and backtrack
from that to a given function or functions. It's late...
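
One common source of alignment traps in network code on MIPS (not confirmed
as the cause here) is the 14-byte Ethernet header pushing the IP header onto
a 2-byte boundary, so that 32-bit loads of its fields fault and get emulated
in do_ade. A minimal sketch of that arithmetic, plus a memcpy-based load that
avoids the unaligned word access; the function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

#define ETH_HLEN 14	/* Ethernet header length in bytes */

/* Distance of the IP header from a 4-byte boundary when the frame starts
 * `pad` bytes into a 4-byte-aligned buffer. 0 means the header is aligned. */
static unsigned int ip_header_misalignment(unsigned int pad)
{
	return (pad + ETH_HLEN) % 4;
}

/* Read a 32-bit value from any address without an unaligned word load. */
static uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

With no padding the IP header is off by 2 bytes; with 2 bytes of padding it
lands on a 4-byte boundary, which is the purpose of the kernel's NET_IP_ALIGN
reserve. Whether the ethernet driver here applies that padding on receive
would be one thing to check.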

samples  cum. samples  %        cum. %     app name           symbol name
14992    14992         22.7158  22.7158    vmlinux            do_ade
10054    25046         15.2338  37.9496    ip_tables          /ip_tables
7257     32303         10.9958  48.9454    nf_conntrack       /nf_conntrack
3398     35701          5.1486  54.0941    vmlinux            handle_adel_int
2613     38314          3.9592  58.0533    vmlinux            ip_rcv
2356     40670          3.5698  61.6231    vmlinux            nf_iterate
1869     42539          2.8319  64.4550    iptable_nat        /iptable_nat
1626     44165          2.4637  66.9187    vmlinux            r4k_dma_cache_inv
1544     45709          2.3395  69.2582    nf_conntrack_ipv4  /nf_conntrack_ipv4
1331     47040          2.0167  71.2749    vmlinux            ag71xx_poll
1011     48051          1.5319  72.8068    vmlinux            ip_route_input_common
988      49039          1.4970  74.3038    vmlinux            ret_from_exception



asmlinkage void do_ade(struct pt_regs *regs)
{
        unsigned int __user *pc;
        mm_segment_t seg;

        perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS,
                        1, 0, regs, regs->cp0_badvaddr);
        /*
         * Did we catch a fault trying to load an instruction?
         * Or are we running in MIPS16 mode?
         */
        if ((regs->cp0_badvaddr == regs->cp0_epc) || (regs->cp0_epc & 0x1))
                goto sigbus;

        pc = (unsigned int __user *) exception_epc(regs);
        if (user_mode(regs) && !test_thread_flag(TIF_FIXADE))
                goto sigbus;
        if (unaligned_action == UNALIGNED_ACTION_SIGNAL)
                goto sigbus;
        else if (unaligned_action == UNALIGNED_ACTION_SHOW)
                show_registers(regs);

        /*
         * Do branch emulation only if we didn't forward the exception.
         * This is all so but ugly ...
         */
        seg = get_fs();
        if (!user_mode(regs))
                set_fs(KERNEL_DS);
        emulate_load_store_insn(regs, (void __user *)regs->cp0_badvaddr, pc);
        set_fs(seg);

        return;




* Re: Hacking on the rtl8366S
  2011-06-04  3:49     ` Dave Taht
@ 2011-06-04  3:54       ` Dave Taht
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Taht @ 2011-06-04  3:54 UTC (permalink / raw)
  To: Juliusz Chroboczek; +Cc: Gabor Juhos, bloat-devel

[-- Attachment #1: Type: text/plain, Size: 3883 bytes --]

If anyone cares, a copy of this build is at:

http://huchra.bufferbloat.net/~cerowrt/cerowrt-wndr3700-dbg/

and the kernel is at:

http://huchra.bufferbloat.net/~cerowrt/vmlinux






