[Make-wifi-fast] a bit of profiling on the archer

Dave Taht dave.taht at gmail.com
Sat Nov 19 12:33:51 EST 2016


On Thu, Nov 17, 2016 at 11:55 PM, Jesper Dangaard Brouer
<brouer at redhat.com> wrote:
>
> On Thu, 17 Nov 2016 20:14:49 -0800 Dave Taht <dave.taht at gmail.com> wrote:
>
>> I have not been profiling much on lower end platforms (it's hard, you
>> can crash a box pretty easily with the wrong options or sample rates).
>
> I'm happy to hear that perf does work on this lower end HW, although
> the disclaimer of sample rates.
>
> Does anyone know if hardware based PMU (Performance Monitor Units)
> exists for these kind of devices?
>
>> While watching the ath10k peak at 150-200mbits, at 99% of cpu in
>> softirq, I spent a bit of time profiling various counters and
>> statistics.
>>
>> for this one (while downloading 12 flows at the same time via flent)
>>
>> perf record -F 99 -e cpu-clock -ag -- sleep 10
>> perf report
>
> Below perf report is not well suite for email, could you instead
> provide output from command below:
>
> perf report --no-children --stdio --call-graph none

Thanks! That is way more readable output. Appended at the end of this post.

>
>>
>>   [[31m  67.81%[[m     0.00%  ksoftirqd/0  [kernel.kallsyms]       [k]
>> run_ksoftirqd
>>             |
>>             ---run_ksoftirqd
>>                |
>>                |[[31m--67.61%-- [[m __do_softirq
>>                |          |
>>                |          |[[31m--66.80%-- [[m net_rx_action
>>                |          |          |
>>                |          |          |[[31m--41.07%-- [[m ag71xx_poll
>>
>> ...
>
> (Looks like you managed to copy-paste the terminal escape codes for
> colors)

Not sure what terminal emulation perf is expecting. TERM is set to XTERM.

>>
>> It appears we're spending 66% of the time in the *ethernet* portion of
>> the path.
>
> Be careful you don't fool yourself. In your output you have the
> "children" mode on, so everything being called "under" net_rx_action is
> summed up.  It could be it goes all the way through to the wifi TX
> parts and that is part of the sum.  Even the memory allocations gets
> summed into this 66% number.
>
>
>> I'm going to stop worrying so much about the performance of the new
>> wifi algorithms.

for at least 24 hours

# To display the perf.data header info, please use
--header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 990  of event 'cpu-clock'
# Event count (approx.): 9999999900
#
# Overhead  Command      Shared Object           Symbol
# ........  ...........  ......................
.........................................
#
     4.34%  ksoftirqd/0  [ebtables]              [k] ebt_do_table
     2.93%  ksoftirqd/0  [kernel.kallsyms]       [k]
__netif_receive_skb_core
     2.83%  ksoftirqd/0  [ip_tables]             [k] ipt_do_table
     2.53%  ksoftirqd/0  [kernel.kallsyms]       [k] nf_iterate
     2.42%  ksoftirqd/0  [mac80211]              [k]
ieee80211_tx_dequeue
     2.02%  ksoftirqd/0  [kernel.kallsyms]       [k]
__local_bh_enable_ip
     2.02%  ksoftirqd/0  [mac80211]              [k]
__ieee80211_subif_start_xmit
     1.82%  ksoftirqd/0  [kernel.kallsyms]       [k] __dev_queue_xmit
     1.72%  ksoftirqd/0  [nf_conntrack]          [k] tcp_packet
     1.52%  ksoftirqd/0  [mac80211]              [k]
ieee80211_prepare_and_rx_handle
     1.52%  ksoftirqd/0  [mac80211]              [k]
ieee80211_rx_handlers
     1.41%  ksoftirqd/0  [kernel.kallsyms]       [k] br_handle_frame
     1.41%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_finish_output2
     1.41%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_forward
     1.41%  ksoftirqd/0  [nf_conntrack]          [k]
__nf_conntrack_find_get
     1.31%  ksoftirqd/0  [kernel.kallsyms]       [k] format_decode
     1.31%  ksoftirqd/0  [mac80211]              [k]
ieee80211_xmit_fast_finish
     1.21%  ksoftirqd/0  [mac80211]              [k]
ieee80211_tx_h_select_key
     1.11%  ksoftirqd/0  [kernel.kallsyms]       [k]
__copy_user_common
     1.11%  ksoftirqd/0  [mac80211]              [k]
ieee80211_queue_skb
     1.01%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_txrx_compl_task
     1.01%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_tx.isra.35
     1.01%  ksoftirqd/0  [kernel.kallsyms]       [k]
dev_hard_start_xmit
     1.01%  ksoftirqd/0  [kernel.kallsyms]       [k] eth_type_trans
     1.01%  ksoftirqd/0  [kernel.kallsyms]       [k]
netif_skb_features
     1.01%  ksoftirqd/0  [nf_conntrack_rtcache]  [k]
nf_rtcache_in.part.2
     0.91%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_pci_hif_tx_sg
     0.91%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_rcv
     0.91%  ksoftirqd/0  [kernel.kallsyms]       [k] memcmp
     0.91%  ksoftirqd/0  [mac80211]              [k] ieee80211_rx_napi
     0.81%  ksoftirqd/0  [ath10k_core]           [k] ath10k_dbg
     0.81%  ksoftirqd/0  [ath10k_core]           [k] ath10k_htt_tx
     0.81%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_bus_pci_write32
     0.81%  ksoftirqd/0  [br_netfilter]          [k] br_nf_pre_routing
     0.81%  ksoftirqd/0  [ebtable_broute]        [k] ebt_broute
     0.81%  ksoftirqd/0  [kernel.kallsyms]       [k] ag71xx_poll
     0.81%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_output
     0.81%  ksoftirqd/0  [kernel.kallsyms]       [k]
skb_get_hash_perturb
     0.81%  ksoftirqd/0  [mac80211]              [k]
sta_info_hash_lookup
     0.81%  ksoftirqd/0  [nf_conntrack_ipv4]     [k] ipv4_helper
     0.81%  kworker/0:1  [kernel.kallsyms]       [k] __delay
     0.71%  ksoftirqd/0  [kernel.kallsyms]       [k] __bzero
     0.71%  ksoftirqd/0  [kernel.kallsyms]       [k] __rmemcpy
     0.71%  ksoftirqd/0  [kernel.kallsyms]       [k]
_find_next_bit.part.0
     0.71%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_rcv_finish
     0.71%  ksoftirqd/0  [mac80211]              [k]
ieee80211_lookup_ra_sta
     0.71%  ksoftirqd/0  [nf_conntrack]          [k]
hash_conntrack_raw
     0.71%  ksoftirqd/0  [nf_conntrack]          [k] nf_conntrack_in
     0.71%  ksoftirqd/0  [nf_conntrack]          [k] tcp_error
     0.61%  ksoftirqd/0  [br_netfilter]          [k]
br_nf_post_routing
     0.61%  ksoftirqd/0  [cfg80211]              [k]
cfg80211_classify8021d
     0.61%  ksoftirqd/0  [iptable_raw]           [k] iptable_raw_hook
     0.61%  ksoftirqd/0  [kernel.kallsyms]       [k]
br_handle_frame_finish
     0.61%  ksoftirqd/0  [kernel.kallsyms]       [k] netdev_pick_tx
     0.61%  ksoftirqd/0  [kernel.kallsyms]       [k]
r4k_dma_cache_wback_inv
     0.61%  ksoftirqd/0  [kernel.kallsyms]       [k]
skb_network_protocol
     0.61%  ksoftirqd/0  [nf_conntrack_rtcache]  [k]
nf_ct_rtcache_find_usable
     0.61%  ksoftirqd/0  [nf_nat_ipv4]           [k] nf_nat_ipv4_fn
     0.51%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_ce_send_nolock
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k]
__skb_flow_get_ports
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k] atomic64_add
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k]
br_dev_queue_push_xmit
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k] br_fdb_update
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k] dst_release
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k]
idr_get_empty_slot
     0.51%  ksoftirqd/0  [kernel.kallsyms]       [k] r4k_dma_cache_inv
     0.51%  ksoftirqd/0  [nf_conntrack]          [k]
nf_ct_deliver_cached_events
     0.51%  ksoftirqd/0  [nf_conntrack_ipv4]     [k] ipv4_conntrack_in
     0.51%  ksoftirqd/0  [nf_nat]                [k] nf_nat_packet
     0.40%  ksoftirqd/0  [ath10k_core]           [k] ath10k_dbg_dump
     0.40%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_op_wake_tx_queue
     0.40%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_bus_pci_read32
     0.40%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_ce_completed_send_next_nolock
     0.40%  ksoftirqd/0  [br_netfilter]          [k] ip_sabotage_in
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k] __br_fdb_get
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k] __delay
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k]
__skb_flow_dissect
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k]
__wake_up_sync_key
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k] memcpy
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k] sch_direct_xmit
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k] skb_release_data
     0.40%  ksoftirqd/0  [kernel.kallsyms]       [k]
validate_xmit_skb.isra.30.part.31
     0.40%  ksoftirqd/0  [mac80211]              [k]
fq_flow_classify.constprop.17
     0.40%  ksoftirqd/0  [mac80211]              [k]
ieee80211_tx_status
     0.40%  ksoftirqd/0  [nf_conntrack]          [k]
__nf_ct_refresh_acct
     0.40%  ksoftirqd/0  [nf_conntrack_ipv4]     [k] ipv4_get_l4proto
     0.40%  ksoftirqd/0  [nf_conntrack_rtcache]  [k]
nf_rtcache_forward
     0.40%  ksoftirqd/0  [nf_conntrack_rtcache]  [k]
nf_rtcache_forward4
     0.40%  ksoftirqd/0  [nf_nat_ipv4]           [k] nf_nat_ipv4_in
     0.40%  ksoftirqd/0  [nf_nat_ipv4]           [k] nf_nat_ipv4_out
     0.30%  ksoftirqd/0  [ath10k_core]           [k]
__ath10k_htt_rx_ring_fill_n
     0.30%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_rx_h_deliver
     0.30%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_rx_h_mpdu
     0.30%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_tx_push_txq
     0.30%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_ce_completed_send_next
     0.30%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_pci_write32
     0.30%  ksoftirqd/0  [br_netfilter]          [k] br_nf_dev_xmit
     0.30%  ksoftirqd/0  [br_netfilter]          [k]
nf_bridge_encap_header_len
     0.30%  ksoftirqd/0  [cfg80211]              [k] ieee80211_hdrlen
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] __free_page_frag
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k]
ag71xx_hard_start_xmit
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] br_dev_xmit
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] br_pass_frame_up
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] build_skb
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] dev_gro_receive
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] idr_alloc
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] inet_gro_receive
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_finish_output
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] nf_hook_slow
     0.30%  ksoftirqd/0  [kernel.kallsyms]       [k] tcp_gro_receive
     0.30%  ksoftirqd/0  [mac80211]              [k]
codel_dequeue_func
     0.30%  ksoftirqd/0  [mac80211]              [k] fq_flow_dequeue
     0.30%  ksoftirqd/0  [mac80211]              [k]
ieee80211_drop_unencrypted.part.3
     0.30%  ksoftirqd/0  [mac80211]              [k]
ieee80211_get_bssid
     0.30%  ksoftirqd/0  [nf_conntrack_ipv4]     [k] ipv4_confirm
     0.30%  ksoftirqd/0  [nf_defrag_ipv4]        [k]
ipv4_conntrack_defrag
     0.20%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_txrx_tx_unref
     0.20%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_pci_htt_rx_cb
     0.20%  ksoftirqd/0  [br_netfilter]          [k] br_nf_local_in
     0.20%  ksoftirqd/0  [ebtable_filter]        [k] ebt_out_hook
     0.20%  ksoftirqd/0  [ebtable_nat]           [k] ebt_nat_in
     0.20%  ksoftirqd/0  [ebtable_nat]           [k] ebt_nat_out
     0.20%  ksoftirqd/0  [iptable_mangle]        [k]
iptable_mangle_hook
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] __dma_sync
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] __free_pages_ok
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
__netif_receive_skb
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
__slab_alloc.isra.13.constprop.17
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
__slab_free.isra.14
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] br_deliver
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
finish_task_switch
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] ipv4_dst_check
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
is_skb_forwardable
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] kmem_cache_free
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] ktime_get
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] net_rx_action
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k]
skb_release_head_state
     0.20%  ksoftirqd/0  [kernel.kallsyms]       [k] vsnprintf
     0.20%  ksoftirqd/0  [mac80211]              [k]
ieee80211_deliver_skb
     0.20%  ksoftirqd/0  [mac80211]              [k]
ieee80211_select_queue
     0.20%  ksoftirqd/0  [mac80211]              [k]
ieee80211_skb_resize
     0.20%  ksoftirqd/0  [nf_conntrack_rtcache]  [k] nf_rtcache_in4
     0.10%  hostapd      [kernel.kallsyms]       [k]
__copy_user_common
     0.10%  hostapd      [kernel.kallsyms]       [k] core_sys_select
     0.10%  hostapd      [kernel.kallsyms]       [k] datagram_poll
     0.10%  hostapd      [kernel.kallsyms]       [k] do_select
     0.10%  hostapd      [kernel.kallsyms]       [k]
finish_task_switch
     0.10%  hostapd      [kernel.kallsyms]       [k] handle_sys
     0.10%  hostapd      [kernel.kallsyms]       [k]
hrtimer_try_to_cancel
     0.10%  hostapd      [kernel.kallsyms]       [k] netlink_recvmsg
     0.10%  hostapd      [kernel.kallsyms]       [k] poll_initwait
     0.10%  hostapd      [kernel.kallsyms]       [k]
schedule_hrtimeout_range_clock
     0.10%  hostapd      [kernel.kallsyms]       [k] sock_poll
     0.10%  hostapd      [kernel.kallsyms]       [k] timespec_add_safe
     0.10%  hostapd      libc.so                 [.] 0x0000ef74
     0.10%  hostapd      libc.so                 [.] 0x000288e4
     0.10%  hostapd      libc.so                 [.] 0x000722fc
     0.10%  hostapd      wpad                    [.] 0x0001ab97
     0.10%  hostapd      wpad                    [.] 0x0001b1a1
     0.10%  hostapd      wpad                    [.] 0x0001cbd9
     0.10%  hostapd      wpad                    [.] 0x0001cd0d
     0.10%  hostapd      wpad                    [.] 0x0004c589
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htc_rx_completion_handler
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_rx_amsdu_allowed.isra.4
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_tx_inc_pending
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_htt_tx_txq_update
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_tx_frm_has_freq
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_tx_h_get_txmode.isra.4
     0.10%  ksoftirqd/0  [ath10k_core]           [k]
ath10k_mac_tx_push_pending
     0.10%  ksoftirqd/0  [ath10k_core]           [k] ath10k_process_rx
     0.10%  ksoftirqd/0  [ath10k_pci]            [k]
ath10k_ce_completed_recv_next_nolock
     0.10%  ksoftirqd/0  [ath9k]                 [k] ath_rx_tasklet
     0.10%  ksoftirqd/0  [ath9k_common]          [k]
ath9k_cmn_process_rssi
     0.10%  ksoftirqd/0  [ath9k_hw]              [k]
ath9k_hw_numtxpending
     0.10%  ksoftirqd/0  [cfg80211]              [k]
__ieee80211_data_to_8023
     0.10%  ksoftirqd/0  [cfg80211]              [k] cfg80211_rx_mgmt
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] __build_skb
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] __do_softirq
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] ag71xx_tx_packets
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] c0_hpt_read
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] consume_skb
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] idr_mark_full
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] idr_remove
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] ip_forward_finish
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] kmem_cache_alloc
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] memmove
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] mod_timer
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] number.isra.12
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k]
passthru_features_check
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] skb_copy_bits
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] skb_pull
     0.10%  ksoftirqd/0  [kernel.kallsyms]       [k] skb_push
     0.10%  ksoftirqd/0  [mac80211]              [k]
__ieee80211_beacon_add_tim
     0.10%  ksoftirqd/0  [mac80211]              [k]
codel_should_drop.isra.4.constprop.20
     0.10%  ksoftirqd/0  [mac80211]              [k]
ieee80211_report_used_skb
     0.10%  ksoftirqd/0  [mac80211]              [k]
ieee80211_subif_start_xmit
     0.10%  ksoftirqd/0  [mac80211]              [k]
remove_monitor_info
     0.10%  ksoftirqd/0  [nf_conntrack]          [k] nf_ct_get_tuple
     0.10%  ksoftirqd/0  [nf_conntrack]          [k] tcp_get_timeouts
     0.10%  ksoftirqd/0  [nf_conntrack_ipv4]     [k] ipv4_pkt_to_tuple
     0.10%  ksoftirqd/0  [nf_nat]                [k] nf_ct_nat_ext_add


#
# (For a higher level overview, try: perf report --sort comm,dso)
#
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org


More information about the Make-wifi-fast mailing list