Security Research & Defense
In January 2018, Microsoft released an advisory and security updates for a new class of hardware vulnerabilities involving speculative execution side channels (known as Spectre and Meltdown). In this blog post, we will provide a technical analysis of a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF) which has been assigned CVE-2018-3615 (for SGX), CVE-2018-3620 (for operating systems and SMM), and CVE-2018-3646 (for virtualization). This vulnerability affects Intel Core processors and Intel Xeon processors.
This post is primarily geared toward security researchers and engineers who are interested in a technical analysis of L1TF and the mitigations that are relevant to it. If you are interested in more general guidance, please refer to Microsoft's security advisory for L1TF.
Please note that the information in this post is current as of the date of this post.
L1 Terminal Fault (L1TF) overview
We previously defined four categories of speculation primitives that can be used to create the conditions for speculative execution side channels. Each category provides a fundamental method for entering speculative execution along a non-architectural path, specifically: conditional branch misprediction, indirect branch misprediction, exception delivery or deferral, and memory access misprediction. L1TF belongs to the exception delivery or deferral category of speculation primitives (along with Meltdown and Lazy FP State Restore) as it deals with speculative (or out-of-order) execution related to logic that generates an architectural exception. In this post, we’ll provide a general summary of L1TF. For a more in-depth analysis, please refer to the advisory and whitepaper that Intel has published for this vulnerability.
L1TF arises due to a CPU optimization related to the handling of address translations when performing a page table walk. When translating a linear address, the CPU may encounter a terminal page fault which occurs when the paging structure entry for a virtual address is not present (Present bit is 0) or otherwise invalid. This will result in an exception, such as a page fault, or a TSX transaction abort along the architectural path. However, before either of these occurs, a CPU that is vulnerable to L1TF may initiate a read from the L1 data cache for the linear address being translated. For this speculative-only read, the page frame bits of the terminal (not present) page table entry are treated as a system physical address, even for guest page table entries. If the cache line for the physical address is present in the L1 data cache, then the data for that line may be forwarded on to dependent operations that may execute speculatively before retirement of the instruction that led to the terminal page fault. The behavior related to L1TF can occur for page table walks involving both conventional and extended page tables (the latter of which is used for virtualization).
To illustrate how this might occur, it may help to consider the following simplified example. In this example, an attacker-controlled virtual machine (VM) has constructed a page table hierarchy within the VM with the goal of reading a desired system (host) physical address. The following diagram provides an example hierarchy for the virtual address 0x12345000 where the terminal page table entry is not present but contains a page frame of 0x9a0 as shown below:
After setting up this hierarchy, the VM could then attempt to read from system physical addresses within [0x9a0000, 0x9a1000) through an instruction sequence such as the following:
4C0FB600 movzx r8,byte [rax] ; rax = 0x12345040
49C1E00C shl r8,byte 0xc
428B0402 mov eax,[rdx+r8] ; rdx = address of signal array
By executing these instructions within a TSX transaction or by handling the architectural page fault, the VM could attempt to induce a speculative load from the L1 data cache line associated with the system physical address 0x9a0040 (if present in the L1) and have the first byte of that cache line forwarded to an out-of-order load that uses this byte as an offset into a signal array. This would create the conditions for observing the byte value using a disclosure primitive such as FLUSH+RELOAD, thereby leading to the disclosure of information across a security boundary in the case where this system physical address has not been allocated to the VM.
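The address arithmetic in this example can be checked with a short sketch. The Python below only illustrates the computation described above and is not an exploit. It assumes 4 KB pages (a 12-bit page offset), and the helper names are hypothetical.

```python
PAGE_SHIFT = 12                          # assumption: 4 KB pages
PAGE_OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def l1tf_physical_address(page_frame: int, virtual_address: int) -> int:
    """Physical address whose L1D line may be read speculatively: the page
    frame bits of the not present PTE combined with the page offset."""
    return (page_frame << PAGE_SHIFT) | (virtual_address & PAGE_OFFSET_MASK)


def signal_array_offset(byte_value: int) -> int:
    """Offset into the signal array produced by `shl r8, 0xc`."""
    return byte_value << PAGE_SHIFT


print(hex(l1tf_physical_address(0x9A0, 0x12345040)))  # 0x9a0040
print(hex(signal_array_offset(0x41)))                 # 0x41000
```

Each possible byte value lands on its own 4 KB-aligned slot of the signal array, which is what lets a disclosure primitive such as FLUSH+RELOAD tell the values apart.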
While the scenario described above illustrates how L1TF can apply to inferring physical memory across a virtual machine boundary (where the VM has full control of the guest page tables), it is also possible for L1TF to be exploited in other scenarios. For example, a user mode application could attempt to use L1TF to read from physical addresses referred to by not present terminal page table entries within their own address space. In practice, it is common for operating systems to make use of the software bits in the not present page table entry format for storing metadata which could equate to valid physical page frames. This could allow a process to read physical memory not assigned to the process (or VM, in a virtualized scenario) or that is not intended to be accessible within the process (e.g. PAGE_NOACCESS memory on Windows).
Mitigations for L1 Terminal Fault (L1TF)
There are multiple mitigations for L1TF and they vary based on the attack category being mitigated. To illustrate this, we’ll describe the software security models that are at risk for L1TF and the specific tactics that can be employed to mitigate it. We’ll reuse the mitigation taxonomy from our previous post on mitigating speculative execution side channels for this. In many cases, the mitigations described in this section need to be combined in order to provide a broad defense for L1TF.
Relevance to software security models
The following table summarizes the potential relevance of L1TF to various intra-device attack scenarios that software security models are typically concerned with. Unlike Meltdown (CVE-2017-5754) which only affected the kernel-to-user scenario, L1TF is applicable to all intra-device attack scenarios as indicated by the orange cells (gray cells would have indicated not applicable). This is because L1TF can potentially provide the ability to read arbitrary system physical memory.
| Attack Category | Attack Scenario | L1TF |
| --- | --- | --- |
| Inter-VM | Hypervisor-to-guest | CVE-2018-3646 |
| Inter-VM | Host-to-guest | CVE-2018-3646 |
| Inter-VM | Guest-to-guest | CVE-2018-3646 |
| Intra-OS | Kernel-to-user | CVE-2018-3620 |
| Intra-OS | Process-to-process | CVE-2018-3620 |
| Intra-OS | Intra-process | CVE-2018-3620 |
| Enclave | Enclave-to-any | CVE-2018-3615 |
| Enclave | VSM-to-any | CVE-2018-3646 |
Preventing speculation techniques involving L1TF
As we’ve noted in the past, one of the best ways to mitigate a vulnerability is by addressing the issue as close to the root cause as possible. In the case of L1TF, there are multiple mitigations that can be used to prevent speculation techniques involving L1TF.
Safe page frame bits in not present page table entries
One of the requirements for an attack involving L1TF is that the page frame bits of a terminal page table entry must refer to a valid physical page that contains sensitive data from another security domain. This means a compliant hypervisor and operating system kernel can mitigate certain attack scenarios for L1TF by ensuring that either 1) the physical page referred to by the page frame bits of not present page table entries always contains benign data and/or 2) a high order bit is set in the page frame bits that does not correspond to accessible physical memory. In the case of #2, the Windows kernel will use a bit that is less than the implemented physical address bits supported by a given processor in order to avoid physical address truncation (e.g. dropping the high order bit).
Beginning with the August, 2018 Windows security updates, all supported versions of the Windows kernel and the Hyper-V hypervisor ensure that #1 and #2 are automatically enforced on hardware that is vulnerable to L1TF. This is enforced both for conventional page table entries and extended page table entries that are not present. On Windows Server, this mitigation is disabled by default and must be enabled as described in our published guidance for Windows Server.
To illustrate how this works, consider the following example of a user mode virtual address that is not accessible and therefore has a not present PTE. In this example, the page frame bits still refer to what could be interpreted as a valid physical address in conjunction with L1TF:
26: kd> !pte 0x00000281`d84c0000
… PTE at FFFFB30140EC2600
… contains 0000000356CDEB00
… not valid
… Transition: 356cde
… Protect: 18 - No Access
26: kd> dt nt!HARDWARE_PTE FFFFB30140EC2600
+0x000 Valid : 0y0
+0x000 Write : 0y0
+0x000 Owner : 0y0
+0x000 WriteThrough : 0y0
+0x000 CacheDisable : 0y0
+0x000 Accessed : 0y0
+0x000 Dirty : 0y0
+0x000 LargePage : 0y0
+0x000 Global : 0y1
+0x000 CopyOnWrite : 0y1
+0x000 Prototype : 0y0
+0x000 reserved0 : 0y1
+0x000 PageFrameNumber : 0y000000000000001101010110110011011110 (0x356cde)
+0x000 reserved1 : 0y0000
+0x000 SoftwareWsIndex : 0y00000000000 (0)
+0x000 NoExecute : 0y0
With the August, 2018 Windows security updates applied, it’s possible to observe the behavior of setting a high order bit in the not present page table entry that refers to physical memory that is either inaccessible or guaranteed to be benign (in this case bit 45). Since this does not correspond to an accessible physical address, any attempt to read from it using L1TF will fail.
17: kd> !pte 0x00000196`04840000
… PTE at FFFF8000CB024200
… contains 0000200129CB2B00
… not valid
… Transition: 200129cb2
… Protect: 18 - No Access
17: kd> dt nt!HARDWARE_PTE FFFF8000CB024200
+0x000 Valid : 0y0
+0x000 Write : 0y0
+0x000 Owner : 0y0
+0x000 WriteThrough : 0y0
+0x000 CacheDisable : 0y0
+0x000 Accessed : 0y0
+0x000 Dirty : 0y0
+0x000 LargePage : 0y0
+0x000 Global : 0y1
+0x000 CopyOnWrite : 0y1
+0x000 Prototype : 0y0
+0x000 reserved0 : 0y1
+0x000 PageFrameNumber : 0y001000000000000100101001110010110010 (0x200129cb2)
+0x000 reserved1 : 0y0000
+0x000 SoftwareWsIndex : 0y00000000000 (0)
+0x000 NoExecute : 0y0
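The two debugger dumps can be related to each other with a small sketch. The Python below assumes the field layout implied by the dumps (Valid in bit 0, a 36-bit PageFrameNumber starting at bit 12). The helper names are hypothetical, and bit 45 is simply the bit seen on this example system. The kernel's actual choice depends on the implemented physical address bits of the processor.

```python
PFN_SHIFT = 12   # assumption: PageFrameNumber starts at bit 12
PFN_BITS = 36    # width of PageFrameNumber in the dumps above


def page_frame_number(pte: int) -> int:
    """Extract the page frame bits from a raw 64-bit PTE value."""
    return (pte >> PFN_SHIFT) & ((1 << PFN_BITS) - 1)


def is_present(pte: int) -> bool:
    """Valid (present) bit, assumed to be bit 0."""
    return bool(pte & 1)


def make_safe(pte: int, safe_bit: int) -> int:
    """Set a high order physical address bit in a not present PTE
    (hypothetical helper illustrating the mitigation)."""
    if is_present(pte):
        raise ValueError("only not present entries are rewritten")
    return pte | (1 << safe_bit)


before = 0x0000000356CDEB00   # PTE value from the first dump
after = 0x0000200129CB2B00    # PTE value from the second dump
print(hex(page_frame_number(before)))  # 0x356cde
print(hex(page_frame_number(after)))   # 0x200129cb2
print(bool(after & (1 << 45)))         # True
print(hex(make_safe(before, 45)))      # 0x200356cdeb00
```

With the high order bit set, the page frame bits no longer name an accessible physical page, so there is no L1 data cache line for a speculative read to hit.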
In order to provide a portable method of allowing VMs to determine the implemented physical address bits supported on a system, the Hyper-V hypervisor Top-Level Functional Specification (TLFS) has been revised with a defined interface that can be used by a VM to query this information. This facilitates safe migration of virtual machines within a migration pool.
Flush L1 data cache on security domain transition
Disclosing information through the use of L1TF requires sensitive data from a victim security domain to be present in the L1 data cache (note, the L1D is shared by all LPs on the same physical core). This means disclosure can be prevented by flushing the L1 data cache when transitioning between security domains. To facilitate this, Intel has provided new capabilities through a microcode update that supports an architectural interface for flushing the L1 data cache.
Beginning with the August, 2018 Windows security updates, the Hyper-V hypervisor now uses the new L1 data cache flush feature when present to ensure that VM data is removed from the L1 data cache at critical points. On Windows Server 2016+ and Windows 10 1607+, the flush occurs when switching virtual processor contexts between VMs. This helps reduce the performance impact of the flush by minimizing the number of times this needs to occur. On previous versions of Windows, the flush occurs prior to executing a VM (e.g. prior to VMENTRY).
For L1 data cache flushing in the Hyper-V hypervisor to be robust, the flush is performed in combination with safe use or disablement of HyperThreading and per-virtual-processor hypervisor address spaces.
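To make the difference between the two flush points concrete, the following Python sketch counts flushes on a single core under each policy. It is a simplified model under stated assumptions, not a description of the Hyper-V implementation: the sequence of VM names stands in for successive VM entries on one core, and both function names are hypothetical.

```python
from typing import Iterable


def flushes_on_every_entry(vm_sequence: Iterable[str]) -> int:
    """Policy A: flush the L1 data cache before every VM entry."""
    return sum(1 for _ in vm_sequence)


def flushes_on_vm_switch(vm_sequence: Iterable[str]) -> int:
    """Policy B: flush only when the core starts running a different VM."""
    flushes = 0
    previous = None
    for vm in vm_sequence:
        if vm != previous:
            flushes += 1
            previous = vm
    return flushes


entries = ["VM1", "VM1", "VM1", "VM2", "VM2", "VM1"]
print(flushes_on_every_entry(entries))  # 6
print(flushes_on_vm_switch(entries))    # 3
```

The second policy is only safe in this model when nothing from another security domain can refill the cache between flushes, which is why it is paired with the other mitigations mentioned above.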
For SGX enclave scenarios, the microcode update provided by Intel ensures that the L1 data cache is flushed any time the logical processor exits enclave execution mode. The microcode update also supports attestation of whether HT has been enabled by the BIOS. When HT is enabled, there is a possibility of L1TF attacks from a sibling logical processor before enclave secrets in the L1 data cache are flushed or cleared. The entity verifying the attestation may reject attestations from an HT-enabled system if it deems the risk of L1TF attacks from the sibling logical processor to be unacceptable.
Safe scheduling of sibling logical processors
Intel’s HyperThreading (HT) technology, also known as simultaneous multithreading (SMT), allows multiple logical processors (LPs) to execute simultaneously on a physical core. Each sibling LP can be simultaneously executing code in different security domains and privilege modes. For example, one LP could be executing in the hypervisor while another is executing code within a VM. This has implications for the L1 data cache flush because it may be possible for sensitive data to reenter the L1 data cache via a sibling LP after the L1 data cache flush occurs.
In order to prevent this from happening, the execution of code on sibling LPs must be safely scheduled or HT must be disabled. Both of these approaches ensure that the L1 data cache for a core does not become polluted with data from another security domain after a flush occurs.
The Hyper-V hypervisor on Windows Server 2016 and above supports a feature known as the core scheduler which ensures that virtual processors executing on a physical core always belong to the same VM and are described to the VM as sibling hyperthreads. This feature requires administrator opt-in for Windows Server 2016 and is enabled by default starting with Windows Server 2019. This, in combination with per-virtual-processor hypervisor address spaces, is what makes it possible to defer the L1 data cache flush to the point at which a core begins executing a virtual processor from a different VM rather than needing to perform the flush on every VMENTRY. For more details on how this is implemented in Hyper-V, please refer to the in-depth Hyper-V technical blog on this topic.
The following diagram illustrates the differences in virtual processor scheduling policies for a scenario with two different VMs (VM 1 and VM 2). As the diagram shows, without core scheduling enabled it is possible for code from two different VMs to execute simultaneously on a core (in this case core 2), whereas this is not possible with core scheduling enabled.
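The property that the core scheduler enforces can be expressed as a simple check. The sketch below is an illustration only: the schedule layout (two sibling LPs per core, named by the VM they run) is an assumption about the scenario, and `cores_mixing_vms` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Tuple

# core id -> VM running on each of its two sibling LPs (None means idle)
Schedule = Dict[int, Tuple[Optional[str], Optional[str]]]


def cores_mixing_vms(schedule: Schedule) -> List[int]:
    """Return the cores whose sibling LPs run different VMs at the same time."""
    mixed = []
    for core, (lp0, lp1) in sorted(schedule.items()):
        if lp0 is not None and lp1 is not None and lp0 != lp1:
            mixed.append(core)
    return mixed


without_core_scheduling = {1: ("VM1", "VM1"), 2: ("VM1", "VM2")}
with_core_scheduling = {1: ("VM1", "VM1"), 2: ("VM2", "VM2")}
print(cores_mixing_vms(without_core_scheduling))  # [2]
print(cores_mixing_vms(with_core_scheduling))     # []
```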
On versions of Windows prior to Windows Server 2016 and for all versions of Windows Client with virtualization enabled, the core scheduler feature is not supported and it may therefore be necessary to disable HT in order to ensure the robustness of the L1 data cache flush for inter-VM isolation. This is also currently necessary on Windows Server 2016+ for scenarios that make use of Virtual Secure Mode (VSM) for isolation of secrets. When HT is disabled, it becomes impossible for sibling logical processors to execute simultaneously on the same physical core. For guidance on how to disable HT on Windows, please refer to our advisory.
Removing sensitive content from memory
Another tactic for mitigating speculative execution side channels is to remove sensitive content from the address space such that it cannot be disclosed through speculative execution.
Per-virtual-processor address spaces
Until the emergence of speculative execution side channels, there was not a strong need for hypervisors to partition their virtual address space on a per-VM basis. As a result, it has been common practice for hypervisors to maintain a virtual mapping of all physical memory to simplify memory accesses. The existence of L1TF and other speculative execution side channels has made it desirable to eliminate cross-VM secrets from the virtual address space of the hypervisor when it is acting on behalf of a VM.
Beginning with the August, 2018 security update, the Hyper-V hypervisor in Windows Server 2016+ and Windows 10 1607+ now uses per-virtual-processor (and hence per-VM) address spaces and also no longer maps all of physical memory into the virtual address space of the hypervisor. This ensures that only memory that is allocated to the VM and the hypervisor on behalf of the VM is potentially accessible during speculation for a given virtual processor. In the case of L1TF, this mitigation works in combination with the L1 data cache flush and safe use or disablement of HT to ensure that no sensitive cross-VM information becomes available in the L1.
Mitigation applicability
The mitigations that were described in the previous sections work in combination to provide broad protection for L1TF. The following tables provide a summary of the attack scenarios and the relevant mitigations and default settings for different versions of Windows Server and Windows Client:
| Attack Category | Windows Server 2016+ | Pre-Windows Server 2016 | Windows 10 1607+ | Pre-Windows 10 1607 |
| --- | --- | --- | --- | --- |
| Inter-VM | Enabled: per-virtual-processor address spaces, safe page frame bits. Opt-in: L1 data cache flush, enable core scheduler or disable HT | Enabled: safe page frame bits. Opt-in: L1 data cache flush, disable HT | Enabled: per-virtual-processor address spaces, safe page frame bits. Opt-in: L1 data cache flush, disable HT | Enabled: safe page frame bits. Opt-in: L1 data cache flush, disable HT |
| Intra-OS | Opt-in: safe page frame bits | Opt-in: safe page frame bits | Enabled: safe page frame bits | Enabled: safe page frame bits |
| Enclave | Enabled (SGX): L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled (SGX): L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled (SGX): L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled (SGX): L1 data cache flush. Opt-in (SGX/VSM): disable HT |
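More concisely, the relationship between attack scenarios and mitigations for L1TF is summarized below (the original table marks applicability to the Inter-VM, Intra-OS, and Enclave categories with colored cells, which are not reproduced here):
| Mitigation Tactic | Mitigation Name |
| --- | --- |
| Prevent speculation techniques | Flush L1 data cache on security domain transition |
| Prevent speculation techniques | Safe scheduling of sibling logical processors |
| Prevent speculation techniques | Safe page frame bits in not present page table entries |
| Remove sensitive content from memory | Per-virtual-processor address spaces |
Wrapping up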
In this post, we analyzed a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF). This vulnerability affects a broad range of attack scenarios and the relevant mitigations require a combination of software and firmware (microcode) updates for systems with affected Intel processors. The discovery of L1TF demonstrates that research into speculative execution side channels is ongoing and we will continue to evolve our response and mitigation strategy accordingly. We continue to encourage researchers to report new discoveries through our Speculative Execution Side Channel bounty program.
Microsoft Security Response Center (MSRC)
Today we’re announcing a change to the Mitigation Bypass Bounty that removes Control Flow Guard (CFG) from the set of in-scope mitigations. In this blog, we’ll provide additional background and explain why we’re making this change.
Mitigation Bypass Bounty Background
Microsoft started the Mitigation Bypass Bounty in 2013 with the goal of helping us improve key defense-in-depth mitigation technologies by learning about bypasses. Since launching this program, we’ve awarded more than $1,000,000 in bounties and fixed numerous bypasses reported in our exploit mitigations and are looking forward to growing that number in the future.
One of the challenges we’ve faced with the Mitigation Bypass bounty program is providing clear guidance to researchers on what sorts of issues are in-scope vs. out-of-scope and what sort of cash reward can be expected. We’ve made several changes over the past few years to try to improve the situation here, such as:
- More clearly defining payout tiers for different types of mitigation bypasses (i.e. bugs vs. design problems).
- Being more transparent about the types of issues we are currently aware of so researchers know what types of bypasses are out of scope.
Even with these changes, we know we’re not perfect and we continue to listen to feedback and make changes to be more researcher friendly.
Impact of Exploit Mitigations on Exploitation
One datapoint monitored by Microsoft is the occurrence of vulnerabilities being exploited in the wild. Microsoft has seen the number of vulnerabilities exploited in the wild decrease steadily over the past 8 years.
We believe that part of the reason for the decline of known exploits in the wild is the increase in exploitation difficulty, which transitively affects the economics of vulnerability exploitation. We attribute a large part of the increased difficulty to Microsoft’s continued investment in exploit mitigation technologies such as CFG, Arbitrary Code Guard (ACG), Code Integrity Guard (CIG), MemGC, and so on.
Before we launched the Mitigation Bypass Bounty, we were more heavily reliant on analyzing exploits found in the wild to identify mitigation opportunities. This created lag between technique use in the wild and mitigation availability. To shorten this lag time, we launched the Mitigation Bypass Bounty to proactively learn about bypasses before they were used in the wild.
CFG has been a particularly popular Mitigation Bypass Bounty target for security researchers. Thanks to this research, we’ve learned a lot about a variety of bugs and design limitations affecting CFG. This has caused us to reevaluate the threat model that we need to defend against for more robust CFI. In order to do that, we know we will need to extend and improve the design of CFG, e.g. with finer-grained CFI, read-only memory protection, safe unwind/exception handling, and so on. We recently talked about the challenges with CFG and how our threat model has evolved (Video | Slides).
Microsoft has also received submissions and made fixes for other targets in the Mitigation Bypass Bounty, such as ACG. Researchers can expect that as we build new mitigations we will add them as bounty targets.
Changes to the Mitigation Bypass Bounty Scope
As of today, CFG has been removed from the set of in-scope mitigations for the Mitigation Bypass Bounty. We believe we now have a good understanding of the limitations of CFG and the threat model we need to adapt the design to. We do not believe that additional research into CFG bypasses will be valuable until we’ve addressed these limitations and we would rather that researchers focus their attention on the other in-scope mitigations for the bounty. Although we are removing CFG from the bounty scope, we have no intention to remove or deprecate the feature and we still believe it is a valuable defense-in-depth mitigation. We look forward to bringing it back in scope once we’ve made improvements to CFG.
As always, we’d appreciate feedback from the community on this or any related topics.
MSRC Vulnerabilities & Mitigations Team
Microsoft’s commitment to protecting customers from vulnerabilities in our products, services, and devices includes providing security updates that address these vulnerabilities when they are discovered. We understand that researchers have wanted better clarity around the security features, boundaries, and mitigations that exist in Windows and the servicing commitments that come with them. We have drafted a document that better describes the criteria the Microsoft Security Response Center (MSRC) uses when determining whether a reported vulnerability will be addressed through servicing or in the next version of a product. We are sharing the draft copy with the research community and would like feedback before we make the final copy available online. We are primarily interested in feedback around our servicing policies and whether our criteria make sense to you, the researcher.
Please send feedback to email@example.com, thank you!
In January, 2018, Microsoft published an advisory and security updates for a new class of hardware vulnerabilities involving speculative execution side channels (known as Spectre and Meltdown). In this blog post, we will provide a technical analysis of an additional subclass of speculative execution side channel vulnerability known as Speculative Store Bypass (SSB) which has been assigned CVE-2018-3639. SSB was independently discovered by Ken Johnson of the Microsoft Security Response Center (MSRC) and Jann Horn (@tehjh) of Google Project Zero (GPZ).
This post is primarily geared toward security researchers and engineers who are interested in a technical analysis of SSB and the mitigations that are relevant to it. If you are interested in more general guidance, please refer to our advisory for Speculative Store Bypass and our knowledge base articles for Windows Server, Windows Client, and Microsoft cloud services.
Please note that the information in this post is current as of the date of this post.
TL;DR
Before diving into the technical details, below is a brief summary of the CPUs that are affected by SSB, Microsoft’s assessment of the risk, and the mitigations identified to date.
What is affected?
AMD, ARM, and Intel CPUs are affected by CVE-2018-3639 to varying degrees.
What is the risk?
Microsoft currently assesses the risk posed by CVE-2018-3639 to our customers as low. We are not aware of any exploitable instances of this vulnerability class in our software at this time, but we are continuing to investigate and we encourage researchers to find and report any exploitable instances of CVE-2018-3639 as part of our Speculative Execution Side Channel Bounty program. We will adapt our mitigation strategy for CVE-2018-3639 as our understanding of the risk evolves.
What is the mitigation?
Microsoft has already released mitigations as part of our response to Spectre and Meltdown that are applicable to CVE-2018-3639 in certain scenarios, such as reducing timer precision in Microsoft Edge and Internet Explorer. Software developers can address individual instances of CVE-2018-3639 if they are discovered by introducing a speculation barrier instruction as described in Microsoft’s C++ developer guidance for speculative execution side channels.
Microsoft is working with CPU manufacturers to assess the availability and readiness of new hardware features that can be used to resolve CVE-2018-3639. In some cases, these features will require a microcode or firmware update to be installed. Microsoft plans to provide a mitigation that leverages the new hardware features in a future Windows update.
Speculative Store Bypass (SSB) overview
In our blog post on mitigating speculative execution side channel hardware vulnerabilities, we described three speculation primitives that can be used to create the conditions for a speculative execution side channel. These three primitives provide the fundamental methods for entering speculative execution along a non-architectural path and consist of conditional branch misprediction, indirect branch misprediction, and exception delivery or deferral. Speculative Store Bypass (SSB) belongs to a new category of speculation primitive that we refer to as memory access misprediction.
SSB arises due to a CPU optimization that can allow a potentially dependent load instruction to be speculatively executed ahead of an older store. Specifically, if a load is predicted as not being dependent on a prior store, then the load can be speculatively executed before the store. If the prediction is incorrect, this can result in the load reading stale data and possibly forwarding that data onto other dependent micro-operations during speculation. This can potentially give rise to a speculative execution side channel and the disclosure of sensitive information.
To illustrate how this might occur, it may help to consider the following simple example. In this example, RDI and RSI are assumed to be equal to the same address on the architectural path.
01: 88040F mov [rdi+rcx],al
02: 4C0FB6040E movzx r8,byte [rsi+rcx]
03: 49C1E00C shl r8,byte 0xc
04: 428B0402 mov eax,[rdx+r8]
In this example, the MOV instruction on line 1 may take additional time to execute (e.g. if the computation of the address expression for RDI+RCX is waiting on prior instructions to execute). If this occurs, the CPU may predict that the MOVZX is not dependent on the MOV and may speculatively execute it ahead of the MOV that performs the store. This can result in stale data from the memory located at RSI+RCX being loaded into R8 and fed to a dependent load on line 4. If the byte value in R8 is sensitive, then it may be observed through a side channel by leveraging a cache-based disclosure primitive such as FLUSH+RELOAD (if RDX refers to shared memory) or PRIME+PROBE. The CPU will eventually detect the misprediction and discard the state that was computed, but the data that was accessed during speculation may have created residual side effects in the cache by this point that can then be measured to infer the value that was loaded into R8.
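The effect described above can be modeled with a toy sketch. The Python below is not a CPU simulator and makes several simplifying assumptions: memory is a dictionary, the cache is represented only by the set of signal array offsets that were touched, and the function and parameter names are hypothetical.

```python
def run_sequence(memory: dict, store_addr: int, store_value: int,
                 load_addr: int, bypass_predicted: bool) -> dict:
    """Toy model of the four-instruction sequence above."""
    touched = set()  # signal array offsets brought into the cache

    if bypass_predicted:
        # The load runs ahead of the older store and observes stale data,
        # which then feeds the dependent load during speculation.
        stale = memory.get(load_addr, 0)
        touched.add(stale << 12)

    # The store executes; the misprediction is detected and the load is
    # replayed on the architectural path with the correct value.
    memory[store_addr] = store_value
    value = memory.get(load_addr, 0)
    touched.add(value << 12)
    return {"r8": value, "touched": touched}


result = run_sequence({0x1000: 0x53}, store_addr=0x1000, store_value=0x00,
                      load_addr=0x1000, bypass_predicted=True)
print(result["r8"])                                # 0
print([hex(x) for x in sorted(result["touched"])])  # ['0x0', '0x53000']
```

Architecturally R8 only ever holds the new value, yet the touched offsets still include the slot for the stale byte 0x53. That residual cache state is what a disclosure primitive would measure.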
This example is simplified for the purposes of explaining the issue, but it is possible to imagine generalizations of this concept that could occur. For example, it may be possible for similar sequences to exist where SSB could give rise to a speculative out-of-bounds read, type confusion, indirect branch, and so on. We have revised our C++ Developer Guidance for Speculative Execution Side Channels to include additional examples of code patterns and conditions that could give rise to an instance of CVE-2018-3639. In practice, finding an exploitable instance of CVE-2018-3639 will require an attacker to identify an instruction sequence where:
- The sequence is reachable across a trust boundary, e.g. an attacker in user mode can trigger the sequence in kernel mode through a system call.
- The sequence contains a load instruction that is architecturally dependent on a prior store.
- The stale data that is read by the load instruction is sensitive and is used in a way that can create a side channel on the non-architectural path, e.g. the data feeds a disclosure gadget.
- The store instruction does not execute before the load and the dependent instructions that compose the disclosure gadget are speculatively executed.
While our research into this new vulnerability class is ongoing, we have not identified instruction sequences that satisfy all of the above criteria and we are currently not aware of any exploitable instances of CVE-2018-3639 in our software.
There are multiple mitigations that are applicable to SSB. In our previous blog post on mitigating speculative execution side channels, we characterized the software security models that can generally be at risk and the various tactics for mitigating speculative execution side channels. We will reuse the previously established terminology from that post to frame the mitigation options available for SSB.
Relevance to software security models
The following table summarizes the potential relevance of SSB to the various intra-device attack scenarios that software security models are typically concerned with. As with CVE-2017-5753 (Spectre variant 1), SSB is theoretically applicable to each attack scenario as indicated by the orange cells (grey cells indicate not applicable). The original table also has columns for conditional branch misprediction, indirect branch misprediction, and exception delivery or deferral, whose colored cells are not reproduced here.
| Attack Category | Attack Scenario | CVE-2018-3639 (SSB) |
| --- | --- | --- |
| Inter-VM | Hypervisor-to-guest | Applicable |
| Inter-VM | Host-to-guest | Applicable |
| Inter-VM | Guest-to-guest | Applicable |
| Intra-OS | Kernel-to-user | Applicable |
| Intra-OS | Process-to-process | Applicable |
| Intra-OS | Intra-process | Applicable |
| Enclave | Enclave-to-any | Applicable |
Preventing speculation techniques involving SSB
As we’ve noted in the past, one of the best ways to mitigate a vulnerability is by addressing the issue as close to the root cause as possible. In the case of SSB, there are a few techniques that can be used to prevent speculation techniques that rely on SSB as the speculation primitive.
Speculation barrier via serializing instruction
As with CVE-2017-5753 (Spectre variant 1), it is possible to mitigate SSB by using an instruction which is architecturally defined to serialize execution, thus acting as a speculation barrier. In the case of SSB, a serializing instruction (such as an LFENCE on x86/x64 and SSBB on ARM) can be inserted between the store instruction and the load that could be speculatively executed ahead of the store. For example, inserting an LFENCE on line 2 mitigates the simplified example from this post. Additional information can be found in the C++ Developer Guidance for Speculative Execution Side Channels.
01: 88040F mov [rdi+rcx],al
02: 0FAEE8 lfence
03: 4C0FB6040E movzx r8,byte [rsi+rcx]
04: 49C1E00C shl r8,byte 0xc
05: 428B0402 mov eax,[rdx+r8]
Speculative store bypass disable (SSBD)
In some cases, CPUs can provide facilities for inhibiting a speculative store bypass from occurring and can therefore offer a categorical mitigation for SSB. AMD, ARM, and Intel have documented new hardware features that can be used by software to accomplish this. Microsoft is working with AMD, ARM, and Intel to assess the availability and readiness of these features. In some cases, these features will require a microcode or firmware update to be installed. Microsoft plans to provide a mitigation that leverages the new hardware features in a future Windows update.
Generally applicable mitigations for SSB
There are a number of previously described mitigations that are also generally applicable to SSB. These include mitigations that involve removing sensitive content from memory or removing observation channels. Generally speaking, the mitigation techniques for these two tactics that are effective against CVE-2017-5753 (Spectre variant 1) are also applicable to SSB.
Applicability of mitigations
The complex nature of these issues makes it difficult to understand the relationship between mitigations, speculation techniques, and the attack scenarios to which they apply. This section provides tables to help describe these relationships. Some of the mitigation techniques mentioned in the tables below are described in our previous blog post on this subject.
The legend for the tables that follow is: Applicable | Not applicable
Mitigation relationship to attack scenarios
The following table summarizes the relationship between attack scenarios and applicable mitigations.
Mitigation Tactic | Mitigation Name | Inter-VM | Intra-OS | Enclave
Prevent speculation techniques | Speculation barrier via execution serializing instruction
Prevent speculation techniques | Security domain CPU core isolation
Prevent speculation techniques | Indirect branch speculation barrier on demand and mode change
Prevent speculation techniques | Non-speculated or safely-speculated indirect branches
Prevent speculation techniques | Speculative Store Bypass Disable (SSBD)
Remove sensitive content from memory | Hypervisor address space segregation
Remove sensitive content from memory | Split user and kernel page tables (“KVA Shadow”)
Remove observation channels | Map guest memory as noncacheable in root extended page tables
Remove observation channels | Do not share physical pages across guests
Remove observation channels | Decrease browser timer precision
Mitigation relationship to variants
The following table summarizes the relationship among SSB and the Spectre and Meltdown variants, and applicable mitigations.
Mitigation Tactic | Mitigation Name | CVE-2017-5753 (variant 1) | CVE-2017-5715 (variant 2) | CVE-2017-5754 (variant 3) | CVE-2018-3639 (SSB)
Prevent speculation techniques | Speculation barrier via execution serializing instruction
Prevent speculation techniques | Security domain CPU core isolation
Prevent speculation techniques | Indirect branch speculation barrier on demand and mode change
Prevent speculation techniques | Non-speculated or safely-speculated indirect branches
Prevent speculation techniques | Speculative Store Bypass Disable (SSBD)
Remove sensitive content from memory | Hypervisor address space segregation
Remove sensitive content from memory | Split user and kernel page tables (“KVA Shadow”)
Remove observation channels | Map guest memory as noncacheable in root extended page tables
Remove observation channels | Do not share physical pages across guests
Remove observation channels | Decrease browser timer precision
Wrapping up
In this post, we analyzed a new class of speculative execution side channel hardware vulnerabilities known as Speculative Store Bypass (SSB). This analysis provided the basis for evaluating the risk associated with this class of vulnerability and the mitigation options that exist. As we noted in our previous post, research into speculative execution side channels is ongoing and we will continue to evolve our response and mitigations as we learn more. While we currently assess the risk of SSB as low, we encourage researchers to help further our understanding of the true risk and to report any exploitable instances of CVE-2018-3639 that may exist as part of our Speculative Execution Side Channel bounty program.
Microsoft Security Response Center (MSRC)
The security of Microsoft’s cloud services is a top priority for us. One of the technologies that is central to cloud security is Microsoft Hyper-V, which we use to isolate tenants from one another in the cloud. Given the importance of this technology, Microsoft has made and continues to make significant investment in the security of Hyper-V and the powerful security features that it enables, such as Virtualization-Based Security (VBS). To reinforce this commitment, Microsoft offers rewards of up to $250,000 USD for the discovery of vulnerabilities in Hyper-V through our Hyper-V Bounty Program.
We would like to share with the security community that we have now released debugging symbols for many of the core components in Hyper-V. There are some exceptions, such as the hypervisor itself, where we would like to avoid customers taking a dependency on, for instance, undocumented hypercalls.
The symbols that have been made available allow security researchers to better analyze Hyper-V’s implementation and report any vulnerabilities that may exist as part of our Hyper-V Bounty Program. The list of components that now have debugging symbols available can be found in this blog post by the Microsoft Virtualization team.
We believe this is a step towards contributing more of our internal knowledge back to the security research community. As always, please let us know if you find any new vulnerabilities at firstname.lastname@example.org, or reach us at @msftsecresponse if you have any other questions.
MSRC Vulnerabilities and Mitigations Team