From: "dpreed@deepplum.com" <dpreed@deepplum.com>
Date: Fri, 5 Jan 2018 10:35:45 -0500 (EST)
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Dave Taht" <dave.taht@gmail.com>, "Joel Wirāmu Pauling" <joel@aenertia.net>, cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Spectre and EBPF JIT

One of the most troubling "overreactions" stems from the fact that the proof of concept (PoC) published by Google Project Zero describes an attack on hypervisor host memory under KVM.

In the fine print, and not very explicitly in the Project Zero description, is the fact that the version of KVM that was attacked depended on the hypervisor being mapped into the linear address space of the guest kernel.

In a hypervisor that uses the VMX extensions, the EPT (Extended Page Tables) in effect during guest execution doesn't even provide addressability to the hypervisor's code and data. (I haven't inspected KVM's accelerated mode, but I can't see why it would have the EPT map non-guest memory. I know VMware does not.)

This is validated by a posting from the QEMU project regarding KVM, https://www.qemu.org/2018/01/04/spectre/ - again a little hard to understand if you don't know how VMX and EPTs work.

What this means is that older cloud VMs based on paravirtualization techniques (Xen, ancient QEMU, older VMware) may be susceptible to leaking hypervisor state via Spectre v1.

But newer, so-called hardware-accelerated VMs, based on the VMX extensions and using the EPT, are isolated to a much larger extent, making Spectre v1 pretty useless against them.

Thus, the "overreaction" is the claim that ALL VMs are problematic. This is very far from true. The hypervisors under hardware-accelerated VMs are not vulnerable to Meltdown or Spectre v2, and probably not to Spectre v1.

Of course, *within* a particular VM, the guest kernel and other processes are vulnerable. But no inter-VM path has been demonstrated, nor do any of the discussions explain any means of using speculative execution and branch misprediction between VMs running under different EPTs.

So for the cloud, and also for NFVs that run on accelerated HVMs, the problem is either non-existent or yet to be discovered.

Of course the "press" wants everyone to be super-afraid, so if they can say "KVM is affected" the mob starts running for the exits!

Summary: hardware virtualization appears to be a pragmatic form of isolation that works. And thus many cloud providers are fine.
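
A quick way to confirm the EPT point on a given Linux host is to look at the kvm_intel module's "ept" parameter. A minimal C sketch, assuming the usual sysfs location for module parameters (if the file is missing, kvm_intel is probably not loaded):

    /* Minimal sketch: report whether the kvm_intel module has EPT enabled. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/module/kvm_intel/parameters/ept", "r");
        if (!f) {
            perror("/sys/module/kvm_intel/parameters/ept");
            return 1;
        }
        int c = fgetc(f);
        fclose(f);
        /* 'Y' means guests run under EPT, i.e. hypervisor memory is simply
         * not present in the guest's hardware address translation. */
        printf("EPT enabled: %s\n", c == 'Y' ? "yes" : "no");
        return 0;
    }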

-----Original Message-----
From: "Jonathan Morton" <chromatix99@gmail.com>
Sent: Friday, January 5, 2018 9:07am
To: "Dave Taht" <dave.taht@gmail.com>
Cc: "dpreed@deepplum.com" <dpreed@deepplum.com>, "Joel Wirāmu Pauling" <joel@aenertia.net>, cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Spectre and EBPF JIT

> On 5 Jan, 2018, at 6:53 am, Dave Taht <dave.taht@gmail.com> wrote:
>
> It took me a long while to digest that one. The branch predictor
> analysis of Haswell was easiest to understand (and AMD claims to have
> an AI-based one), and perhaps scrambling that at random intervals
> would help? (this stuff is now way above my pay grade)

Software mitigations for all three attacks have been developed during the "responsible disclosure" period.

Spectre v1: adding an LFENCE instruction (memory load fence) to JIT code performing a bounds-checked array read. This is basically a userspace fix for a userspace attack. Firefox just got this; Chrome undoubtedly will too, if it hasn't already.
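
Concretely, the pattern is roughly the following - a sketch only, not the code any particular browser ships. _mm_lfence() is the x86 intrinsic for LFENCE (a JIT would emit the instruction directly), and the array and function names are illustrative:

    /* Sketch of the Spectre v1 mitigation: serialize speculation between the
     * bounds check and the dependent load, so the CPU cannot speculatively
     * read out of bounds before the branch resolves. */
    #include <emmintrin.h>   /* _mm_lfence() */
    #include <stddef.h>
    #include <stdint.h>

    static uint8_t table[256];

    uint8_t checked_read(size_t index, size_t len)
    {
        if (index < len) {
            _mm_lfence();        /* speculation barrier after the bounds check */
            return table[index]; /* load can no longer run ahead of the check  */
        }
        return 0;
    }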

Spectre v2: three different mitigations are appropriate for different families of CPU:

https://lkml.org/lkml/2018/1/4/742

On AMD CPUs, the small risk that actually exists (because AMD's BTB is much less prone to poisoning than Intel's) is erased by adding LFENCE to privileged indirect branches. This has only a very small cost.

On Intel CPUs up to and including Broadwell (and from Silvermont onwards), a "retpoline" structure is necessary and sufficient. This has a bigger cost than LFENCE and is pretty ugly to look at, but it's still relatively minor.

On Skylake, Kaby Lake and Coffee Lake, something more exotic is required - I think it involves temporarily disabling the BTB during privileged indirect branches. That's *really* ugly, and involves tweaking poorly-documented MSRs.

Something similar in nature to the above should also work for affected ARM cores.

Meltdown: nothing is required for AMD CPUs. Unmapping the privileged addresses when returning to userspace is sufficient for Intel, but incurs a big performance overhead for syscalls. The same is likely true for any other affected CPUs.

 - Jonathan Morton
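
For reference, the "retpoline" structure mentioned above replaces an indirect branch with a call through a tiny thunk, so that the return-stack predictor, rather than the poisonable BTB, decides where speculation goes. A sketch modeled on the kernel's __x86_indirect_thunk_rax (the symbol and label names here are illustrative):

    /* Retpoline thunk for an indirect branch through %rax (AT&T syntax).
     * The "call" pushes the address of the capture loop; the real target is
     * then written over that return address, so "ret" jumps to it, while the
     * return-stack predictor speculates harmlessly into the loop. */
    __asm__(
        ".text\n"
        ".globl retpoline_rax_thunk\n"
        "retpoline_rax_thunk:\n"
        "    call 1f\n"           /* push address of label 0, jump to label 1   */
        "0:  pause\n"             /* mispredicted speculation is trapped here   */
        "    lfence\n"
        "    jmp 0b\n"
        "1:  mov %rax, (%rsp)\n"  /* overwrite return address with real target  */
        "    ret\n"               /* 'return' to the target held in %rax        */
    );

An indirect "call *%rax" is then emitted as a direct call to this thunk, and any mispredicted speculation ends up in the pause/lfence loop instead of at an attacker-chosen gadget.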