From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-x244.google.com (mail-qk0-x244.google.com [IPv6:2607:f8b0:400d:c09::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 9E2B23B260 for ; Sat, 19 Nov 2016 12:33:52 -0500 (EST) Received: by mail-qk0-x244.google.com with SMTP id y205so5435525qkb.1 for ; Sat, 19 Nov 2016 09:33:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=jBMmVt/IhBvhA9eofzuKB/UeISzLWYb14G2gHVvIKkE=; b=ZG3r0NXSYKVXlWkT/9L4byGOu96T3KjsbaGeKAKRwFDAanOhndJAlTEJcKz+0uY0Aa yN/z7ZVytiJPsT6UmMFUgD61NVlJmdLWUFwgxOOZkDWwXuv0iuSST1vyGygP1T2xea2H KUd7xSNcjMWg/5RYjnrtkvobTyigRvpIqbvKLMAr5v6CYVxY6og4k3G2eqYxP2o8/twe t5UqUem/ive9piklF7n0VYkxo57vEkzUk1WYG9nV5Rpgl3xAIanQymyoklvzQyotg49g mPMyLQbOld+gBlb8RVcUkdT0cI1PWPxyw8k05VTj0n88LvL5mGanqc1RBncKjzcw/srG yxXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=jBMmVt/IhBvhA9eofzuKB/UeISzLWYb14G2gHVvIKkE=; b=f2o/5sRzwAholMU209QddLKQDGafLlHDK7V+KsR9vtBvPdiNP/dnS4ci7gkpECfrfp edlsyhBkuMN3fqC/WsUmttjDBqwnJ8L6K6yLsYb0KGP5R7eHEFR+DJbWuGmBXHIOVBqG uOD5hDmZWXepHOt3QxZ4c7vO+f1O9Kaa1P3F1gRjB+k0SkOvW51FPab46uoZnrWYH8IZ ELonk4pnSiVdryfpCXMTc9BzZsFTA1zElZTfn56vEKonjfGEuIp2P1/t2eD0I16Sekpf vm9UI/RYTcCWOjqNUEkHlpY4uXxSk39lgnEAGEte8f8R/PalhL6QvBkleGwotPyxD+46 o7aA== X-Gm-Message-State: AKaTC03i0NbEGP7fSoCdIgu2BKJlfJJNmzrqqz543SeKPWLe+TWtzkArY73SvMaOpMDPu/3vYNe4+4MMOc3j7g== X-Received: by 10.55.201.154 with SMTP id m26mr7371535qkl.59.1479576831926; Sat, 19 Nov 2016 09:33:51 -0800 (PST) MIME-Version: 1.0 Received: by 10.12.137.198 with HTTP; Sat, 19 Nov 2016 09:33:51 -0800 (PST) In-Reply-To: <20161118085524.62d6cfdd@redhat.com> References: <20161118085524.62d6cfdd@redhat.com> From: Dave Taht Date: Sat, 19 Nov 2016 09:33:51 -0800 Message-ID: To: Jesper Dangaard Brouer Cc: make-wifi-fast@lists.bufferbloat.net Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Make-wifi-fast] a bit of profiling on the archer X-BeenThere: make-wifi-fast@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Nov 2016 17:33:52 -0000 On Thu, Nov 17, 2016 at 11:55 PM, Jesper Dangaard Brouer wrote: > > On Thu, 17 Nov 2016 20:14:49 -0800 Dave Taht wrote: > >> I have not been profiling much on lower end platforms (it's hard, you >> can crash a box pretty easily with the wrong options or sample rates). > > I'm happy to hear that perf does work on this lower end HW, although > the disclaimer of sample rates. > > Does anyone know if hardware based PMU (Performance Monitor Units) > exists for these kind of devices? > >> While watching the ath10k peak at 150-200mbits, at 99% of cpu in >> softirq, I spent a bit of time profiling various counters and >> statistics. >> >> for this one (while downloading 12 flows at the same time via flent) >> >> perf record -F 99 -e cpu-clock -ag -- sleep 10 >> perf report > > Below perf report is not well suite for email, could you instead > provide output from command below: > > perf report --no-children --stdio --call-graph none Thanks! That is way more readable output. Appended at the end of this post. > >> >> [[31m 67.81%[[m 0.00% ksoftirqd/0 [kernel.kallsyms] [k] >> run_ksoftirqd >> | >> ---run_ksoftirqd >> | >> |[[31m--67.61%-- [[m __do_softirq >> | | >> | |[[31m--66.80%-- [[m net_rx_action >> | | | >> | | |[[31m--41.07%-- [[m ag71xx_poll >> >> ... > > (Looks like you managed to copy-paste the terminal escape codes for > colors) Not sure what terminal emulation perf is expecting. TERM is set to XTERM. >> >> It appears we're spending 66% of the time in the *ethernet* portion of >> the path. > > Be careful you don't fool yourself. In your output you have the > "children" mode on, so everything being called "under" net_rx_action is > summed up. It could be it goes all the way through to the wifi TX > parts and that is part of the sum. Even the memory allocations gets > summed into this 66% number. > > >> I'm going to stop worrying so much about the performance of the new >> wifi algorithms. for at least 24 hours # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 990 of event 'cpu-clock' # Event count (approx.): 9999999900 # # Overhead Command Shared Object Symbol # ........ ........... ...................... ......................................... # 4.34% ksoftirqd/0 [ebtables] [k] ebt_do_table 2.93% ksoftirqd/0 [kernel.kallsyms] [k] __netif_receive_skb_core 2.83% ksoftirqd/0 [ip_tables] [k] ipt_do_table 2.53% ksoftirqd/0 [kernel.kallsyms] [k] nf_iterate 2.42% ksoftirqd/0 [mac80211] [k] ieee80211_tx_dequeue 2.02% ksoftirqd/0 [kernel.kallsyms] [k] __local_bh_enable_ip 2.02% ksoftirqd/0 [mac80211] [k] __ieee80211_subif_start_xmit 1.82% ksoftirqd/0 [kernel.kallsyms] [k] __dev_queue_xmit 1.72% ksoftirqd/0 [nf_conntrack] [k] tcp_packet 1.52% ksoftirqd/0 [mac80211] [k] ieee80211_prepare_and_rx_handle 1.52% ksoftirqd/0 [mac80211] [k] ieee80211_rx_handlers 1.41% ksoftirqd/0 [kernel.kallsyms] [k] br_handle_frame 1.41% ksoftirqd/0 [kernel.kallsyms] [k] ip_finish_output2 1.41% ksoftirqd/0 [kernel.kallsyms] [k] ip_forward 1.41% ksoftirqd/0 [nf_conntrack] [k] __nf_conntrack_find_get 1.31% ksoftirqd/0 [kernel.kallsyms] [k] format_decode 1.31% ksoftirqd/0 [mac80211] [k] ieee80211_xmit_fast_finish 1.21% ksoftirqd/0 [mac80211] [k] ieee80211_tx_h_select_key 1.11% ksoftirqd/0 [kernel.kallsyms] [k] __copy_user_common 1.11% ksoftirqd/0 [mac80211] [k] ieee80211_queue_skb 1.01% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_txrx_compl_task 1.01% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_tx.isra.35 1.01% ksoftirqd/0 [kernel.kallsyms] [k] dev_hard_start_xmit 1.01% ksoftirqd/0 [kernel.kallsyms] [k] eth_type_trans 1.01% ksoftirqd/0 [kernel.kallsyms] [k] netif_skb_features 1.01% ksoftirqd/0 [nf_conntrack_rtcache] [k] nf_rtcache_in.part.2 0.91% ksoftirqd/0 [ath10k_pci] [k] ath10k_pci_hif_tx_sg 0.91% ksoftirqd/0 [kernel.kallsyms] [k] ip_rcv 0.91% ksoftirqd/0 [kernel.kallsyms] [k] memcmp 0.91% ksoftirqd/0 [mac80211] [k] ieee80211_rx_napi 0.81% ksoftirqd/0 [ath10k_core] [k] ath10k_dbg 0.81% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_tx 0.81% ksoftirqd/0 [ath10k_pci] [k] ath10k_bus_pci_write32 0.81% ksoftirqd/0 [br_netfilter] [k] br_nf_pre_routing 0.81% ksoftirqd/0 [ebtable_broute] [k] ebt_broute 0.81% ksoftirqd/0 [kernel.kallsyms] [k] ag71xx_poll 0.81% ksoftirqd/0 [kernel.kallsyms] [k] ip_output 0.81% ksoftirqd/0 [kernel.kallsyms] [k] skb_get_hash_perturb 0.81% ksoftirqd/0 [mac80211] [k] sta_info_hash_lookup 0.81% ksoftirqd/0 [nf_conntrack_ipv4] [k] ipv4_helper 0.81% kworker/0:1 [kernel.kallsyms] [k] __delay 0.71% ksoftirqd/0 [kernel.kallsyms] [k] __bzero 0.71% ksoftirqd/0 [kernel.kallsyms] [k] __rmemcpy 0.71% ksoftirqd/0 [kernel.kallsyms] [k] _find_next_bit.part.0 0.71% ksoftirqd/0 [kernel.kallsyms] [k] ip_rcv_finish 0.71% ksoftirqd/0 [mac80211] [k] ieee80211_lookup_ra_sta 0.71% ksoftirqd/0 [nf_conntrack] [k] hash_conntrack_raw 0.71% ksoftirqd/0 [nf_conntrack] [k] nf_conntrack_in 0.71% ksoftirqd/0 [nf_conntrack] [k] tcp_error 0.61% ksoftirqd/0 [br_netfilter] [k] br_nf_post_routing 0.61% ksoftirqd/0 [cfg80211] [k] cfg80211_classify8021d 0.61% ksoftirqd/0 [iptable_raw] [k] iptable_raw_hook 0.61% ksoftirqd/0 [kernel.kallsyms] [k] br_handle_frame_finish 0.61% ksoftirqd/0 [kernel.kallsyms] [k] netdev_pick_tx 0.61% ksoftirqd/0 [kernel.kallsyms] [k] r4k_dma_cache_wback_inv 0.61% ksoftirqd/0 [kernel.kallsyms] [k] skb_network_protocol 0.61% ksoftirqd/0 [nf_conntrack_rtcache] [k] nf_ct_rtcache_find_usable 0.61% ksoftirqd/0 [nf_nat_ipv4] [k] nf_nat_ipv4_fn 0.51% ksoftirqd/0 [ath10k_pci] [k] ath10k_ce_send_nolock 0.51% ksoftirqd/0 [kernel.kallsyms] [k] __skb_flow_get_ports 0.51% ksoftirqd/0 [kernel.kallsyms] [k] atomic64_add 0.51% ksoftirqd/0 [kernel.kallsyms] [k] br_dev_queue_push_xmit 0.51% ksoftirqd/0 [kernel.kallsyms] [k] br_fdb_update 0.51% ksoftirqd/0 [kernel.kallsyms] [k] dst_release 0.51% ksoftirqd/0 [kernel.kallsyms] [k] idr_get_empty_slot 0.51% ksoftirqd/0 [kernel.kallsyms] [k] r4k_dma_cache_inv 0.51% ksoftirqd/0 [nf_conntrack] [k] nf_ct_deliver_cached_events 0.51% ksoftirqd/0 [nf_conntrack_ipv4] [k] ipv4_conntrack_in 0.51% ksoftirqd/0 [nf_nat] [k] nf_nat_packet 0.40% ksoftirqd/0 [ath10k_core] [k] ath10k_dbg_dump 0.40% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_op_wake_tx_queue 0.40% ksoftirqd/0 [ath10k_pci] [k] ath10k_bus_pci_read32 0.40% ksoftirqd/0 [ath10k_pci] [k] ath10k_ce_completed_send_next_nolock 0.40% ksoftirqd/0 [br_netfilter] [k] ip_sabotage_in 0.40% ksoftirqd/0 [kernel.kallsyms] [k] __br_fdb_get 0.40% ksoftirqd/0 [kernel.kallsyms] [k] __delay 0.40% ksoftirqd/0 [kernel.kallsyms] [k] __skb_flow_dissect 0.40% ksoftirqd/0 [kernel.kallsyms] [k] __wake_up_sync_key 0.40% ksoftirqd/0 [kernel.kallsyms] [k] memcpy 0.40% ksoftirqd/0 [kernel.kallsyms] [k] sch_direct_xmit 0.40% ksoftirqd/0 [kernel.kallsyms] [k] skb_release_data 0.40% ksoftirqd/0 [kernel.kallsyms] [k] validate_xmit_skb.isra.30.part.31 0.40% ksoftirqd/0 [mac80211] [k] fq_flow_classify.constprop.17 0.40% ksoftirqd/0 [mac80211] [k] ieee80211_tx_status 0.40% ksoftirqd/0 [nf_conntrack] [k] __nf_ct_refresh_acct 0.40% ksoftirqd/0 [nf_conntrack_ipv4] [k] ipv4_get_l4proto 0.40% ksoftirqd/0 [nf_conntrack_rtcache] [k] nf_rtcache_forward 0.40% ksoftirqd/0 [nf_conntrack_rtcache] [k] nf_rtcache_forward4 0.40% ksoftirqd/0 [nf_nat_ipv4] [k] nf_nat_ipv4_in 0.40% ksoftirqd/0 [nf_nat_ipv4] [k] nf_nat_ipv4_out 0.30% ksoftirqd/0 [ath10k_core] [k] __ath10k_htt_rx_ring_fill_n 0.30% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_rx_h_deliver 0.30% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_rx_h_mpdu 0.30% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_tx_push_txq 0.30% ksoftirqd/0 [ath10k_pci] [k] ath10k_ce_completed_send_next 0.30% ksoftirqd/0 [ath10k_pci] [k] ath10k_pci_write32 0.30% ksoftirqd/0 [br_netfilter] [k] br_nf_dev_xmit 0.30% ksoftirqd/0 [br_netfilter] [k] nf_bridge_encap_header_len 0.30% ksoftirqd/0 [cfg80211] [k] ieee80211_hdrlen 0.30% ksoftirqd/0 [kernel.kallsyms] [k] __free_page_frag 0.30% ksoftirqd/0 [kernel.kallsyms] [k] ag71xx_hard_start_xmit 0.30% ksoftirqd/0 [kernel.kallsyms] [k] br_dev_xmit 0.30% ksoftirqd/0 [kernel.kallsyms] [k] br_pass_frame_up 0.30% ksoftirqd/0 [kernel.kallsyms] [k] build_skb 0.30% ksoftirqd/0 [kernel.kallsyms] [k] dev_gro_receive 0.30% ksoftirqd/0 [kernel.kallsyms] [k] idr_alloc 0.30% ksoftirqd/0 [kernel.kallsyms] [k] inet_gro_receive 0.30% ksoftirqd/0 [kernel.kallsyms] [k] ip_finish_output 0.30% ksoftirqd/0 [kernel.kallsyms] [k] nf_hook_slow 0.30% ksoftirqd/0 [kernel.kallsyms] [k] tcp_gro_receive 0.30% ksoftirqd/0 [mac80211] [k] codel_dequeue_func 0.30% ksoftirqd/0 [mac80211] [k] fq_flow_dequeue 0.30% ksoftirqd/0 [mac80211] [k] ieee80211_drop_unencrypted.part.3 0.30% ksoftirqd/0 [mac80211] [k] ieee80211_get_bssid 0.30% ksoftirqd/0 [nf_conntrack_ipv4] [k] ipv4_confirm 0.30% ksoftirqd/0 [nf_defrag_ipv4] [k] ipv4_conntrack_defrag 0.20% ksoftirqd/0 [ath10k_core] [k] ath10k_txrx_tx_unref 0.20% ksoftirqd/0 [ath10k_pci] [k] ath10k_pci_htt_rx_cb 0.20% ksoftirqd/0 [br_netfilter] [k] br_nf_local_in 0.20% ksoftirqd/0 [ebtable_filter] [k] ebt_out_hook 0.20% ksoftirqd/0 [ebtable_nat] [k] ebt_nat_in 0.20% ksoftirqd/0 [ebtable_nat] [k] ebt_nat_out 0.20% ksoftirqd/0 [iptable_mangle] [k] iptable_mangle_hook 0.20% ksoftirqd/0 [kernel.kallsyms] [k] __dma_sync 0.20% ksoftirqd/0 [kernel.kallsyms] [k] __free_pages_ok 0.20% ksoftirqd/0 [kernel.kallsyms] [k] __netif_receive_skb 0.20% ksoftirqd/0 [kernel.kallsyms] [k] __slab_alloc.isra.13.constprop.17 0.20% ksoftirqd/0 [kernel.kallsyms] [k] __slab_free.isra.14 0.20% ksoftirqd/0 [kernel.kallsyms] [k] br_deliver 0.20% ksoftirqd/0 [kernel.kallsyms] [k] finish_task_switch 0.20% ksoftirqd/0 [kernel.kallsyms] [k] ipv4_dst_check 0.20% ksoftirqd/0 [kernel.kallsyms] [k] is_skb_forwardable 0.20% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_free 0.20% ksoftirqd/0 [kernel.kallsyms] [k] ktime_get 0.20% ksoftirqd/0 [kernel.kallsyms] [k] net_rx_action 0.20% ksoftirqd/0 [kernel.kallsyms] [k] skb_release_head_state 0.20% ksoftirqd/0 [kernel.kallsyms] [k] vsnprintf 0.20% ksoftirqd/0 [mac80211] [k] ieee80211_deliver_skb 0.20% ksoftirqd/0 [mac80211] [k] ieee80211_select_queue 0.20% ksoftirqd/0 [mac80211] [k] ieee80211_skb_resize 0.20% ksoftirqd/0 [nf_conntrack_rtcache] [k] nf_rtcache_in4 0.10% hostapd [kernel.kallsyms] [k] __copy_user_common 0.10% hostapd [kernel.kallsyms] [k] core_sys_select 0.10% hostapd [kernel.kallsyms] [k] datagram_poll 0.10% hostapd [kernel.kallsyms] [k] do_select 0.10% hostapd [kernel.kallsyms] [k] finish_task_switch 0.10% hostapd [kernel.kallsyms] [k] handle_sys 0.10% hostapd [kernel.kallsyms] [k] hrtimer_try_to_cancel 0.10% hostapd [kernel.kallsyms] [k] netlink_recvmsg 0.10% hostapd [kernel.kallsyms] [k] poll_initwait 0.10% hostapd [kernel.kallsyms] [k] schedule_hrtimeout_range_clock 0.10% hostapd [kernel.kallsyms] [k] sock_poll 0.10% hostapd [kernel.kallsyms] [k] timespec_add_safe 0.10% hostapd libc.so [.] 0x0000ef74 0.10% hostapd libc.so [.] 0x000288e4 0.10% hostapd libc.so [.] 0x000722fc 0.10% hostapd wpad [.] 0x0001ab97 0.10% hostapd wpad [.] 0x0001b1a1 0.10% hostapd wpad [.] 0x0001cbd9 0.10% hostapd wpad [.] 0x0001cd0d 0.10% hostapd wpad [.] 0x0004c589 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_htc_rx_completion_handler 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_rx_amsdu_allowed.isra.4 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_tx_inc_pending 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_htt_tx_txq_update 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_tx_frm_has_freq 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_tx_h_get_txmode.isra.4 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_mac_tx_push_pending 0.10% ksoftirqd/0 [ath10k_core] [k] ath10k_process_rx 0.10% ksoftirqd/0 [ath10k_pci] [k] ath10k_ce_completed_recv_next_nolock 0.10% ksoftirqd/0 [ath9k] [k] ath_rx_tasklet 0.10% ksoftirqd/0 [ath9k_common] [k] ath9k_cmn_process_rssi 0.10% ksoftirqd/0 [ath9k_hw] [k] ath9k_hw_numtxpending 0.10% ksoftirqd/0 [cfg80211] [k] __ieee80211_data_to_8023 0.10% ksoftirqd/0 [cfg80211] [k] cfg80211_rx_mgmt 0.10% ksoftirqd/0 [kernel.kallsyms] [k] __build_skb 0.10% ksoftirqd/0 [kernel.kallsyms] [k] __do_softirq 0.10% ksoftirqd/0 [kernel.kallsyms] [k] ag71xx_tx_packets 0.10% ksoftirqd/0 [kernel.kallsyms] [k] c0_hpt_read 0.10% ksoftirqd/0 [kernel.kallsyms] [k] consume_skb 0.10% ksoftirqd/0 [kernel.kallsyms] [k] idr_mark_full 0.10% ksoftirqd/0 [kernel.kallsyms] [k] idr_remove 0.10% ksoftirqd/0 [kernel.kallsyms] [k] ip_forward_finish 0.10% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_alloc 0.10% ksoftirqd/0 [kernel.kallsyms] [k] memmove 0.10% ksoftirqd/0 [kernel.kallsyms] [k] mod_timer 0.10% ksoftirqd/0 [kernel.kallsyms] [k] number.isra.12 0.10% ksoftirqd/0 [kernel.kallsyms] [k] passthru_features_check 0.10% ksoftirqd/0 [kernel.kallsyms] [k] skb_copy_bits 0.10% ksoftirqd/0 [kernel.kallsyms] [k] skb_pull 0.10% ksoftirqd/0 [kernel.kallsyms] [k] skb_push 0.10% ksoftirqd/0 [mac80211] [k] __ieee80211_beacon_add_tim 0.10% ksoftirqd/0 [mac80211] [k] codel_should_drop.isra.4.constprop.20 0.10% ksoftirqd/0 [mac80211] [k] ieee80211_report_used_skb 0.10% ksoftirqd/0 [mac80211] [k] ieee80211_subif_start_xmit 0.10% ksoftirqd/0 [mac80211] [k] remove_monitor_info 0.10% ksoftirqd/0 [nf_conntrack] [k] nf_ct_get_tuple 0.10% ksoftirqd/0 [nf_conntrack] [k] tcp_get_timeouts 0.10% ksoftirqd/0 [nf_conntrack_ipv4] [k] ipv4_pkt_to_tuple 0.10% ksoftirqd/0 [nf_nat] [k] nf_ct_nat_ext_add # # (For a higher level overview, try: perf report --sort comm,dso) # > > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > Author of http://www.iptv-analyzer.org > LinkedIn: http://www.linkedin.com/in/brouer --=20 Dave T=C3=A4ht Let's go make home routers and wifi faster! With better software! http://blog.cerowrt.org