Security Research & Defense

Subscribe to Security Research & Defense feed
Information from Microsoft about vulnerabilities, mitigations and workarounds, active attacks, security research, tools and guidance
Updated: 23 hours 39 min ago

Microsoft Security Servicing Criteria for Windows

Mon, 09/10/2018 - 05:55

One of our goals in the Microsoft Security Response Center (MSRC) is to be more transparent with security researchers and our customers on the criteria we use for determining when we intend to address a reported vulnerability through a security update. Our belief is that improving transparency on this topic helps provide clarity on how we assess risk, sets expectations for the types of vulnerabilities that we intend to service, and facilitates constructive dialogue as the threat landscape evolves over time. Ultimately, we believe this enables us all to work together to better protect Microsoft’s customers.

Toward this end, we released a draft version of the security servicing criteria for Windows in June, 2018. We received some great feedback from the research community and the broader security industry that we used to improve the clarity of this criteria. Today, we are happy to announce the publication of the first version of the security servicing criteria for Windows. We expect this to be a living document that evolves over time and we look forward to continuing the dialogue with the community on this topic.

Microsoft Security Servicing Criteria for Windows
Microsoft Vulnerability Severity Classification for Windows

Please reach out to us at switech@microsoft.com or @msftsecresponse on twitter to continue the discussion.

We’d like to acknowledge all of our partner teams from across Microsoft who helped to create and improve the clarity of this criteria.

Nate Warfield - Microsoft Security Response Center (MSRC)

Categories: Security News

Vulnerability hunting with Semmle QL, part 1

Thu, 08/16/2018 - 15:30

Previously on this blog, we’ve talked about how MSRC automates the root cause analysis of vulnerabilities reported and found. After doing this, our next step is variant analysis: finding and investigating any variants of the vulnerability. It’s important that we find all such variants and patch them simultaneously, otherwise we bear the risk of these being exploited in the wild. In this post, I’d like to explain the automation we use in variant finding.

For the past year or so, we’ve been augmenting our manual code review processes with Semmle, a third-party static analysis environment. It compiles code to a relational database (the snapshot database – a combination of database and source code), which is queried using Semmle QL, a declarative, object-oriented query language designed for program analysis.

The basic workflow is that, after root cause analysis, we write queries to find code patterns that are semantically similar to the original vulnerability. Any results are triaged as usual and provided to our engineering teams for a fix to be developed. Also, the queries are placed in a central repository to be re-run periodically by MSRC and other security teams. This way, we can scale our variant finding over time and across multiple codebases.

In addition to variant analysis, we’ve been using QL proactively, in our security reviews of source code. This will be the topic of a future blog post. For now, let’s look at some real-world examples inspired by MSRC cases.

Incorrect integer overflow checks

This first case is a bug that’s straightforward to define, but would be tedious to find variants of in a large codebase.

The code below shows a common idiom for detecting overflow on the addition of unsigned integers:

if (x + y < x) {
    // handle integer overflow
}

Unfortunately, this does not work properly when the width of the integer type is small enough to be subject to integral promotion. For example, if x and y were both unsigned short, when compiled, the above code would end up being equivalent to (unsigned int)x + y < x, making this overflow check ineffective.

Here’s a QL query that matches this code pattern:

import cpp
 
from AddExpr a, Variable v, RelationalOperation r
where a.getAnOperand() = v.getAnAccess()
  and r.getAnOperand() = v.getAnAccess()
  and r.getAnOperand() = a
  and forall(Expr op | op = a.getAnOperand() | op.getType().getSize() < 4)
  and not a.getExplicitlyConverted().getType().getSize() < 4
select r, "Useless overflow check due to integral promotion"

In the from clause, we define the variables, and their types, to be used in the rest of the query. AddExpr, Variable, and RelationalOperation are QL classes representing various sets of entities in the snapshot database, e.g. RelationalOperation covers every expression with a relational operation (less than, greater than, etc.)

The where clause is the meat of the query, using logical connectives and quantifiers to define how to relate the variables. Breaking it down, this means that the addition expression and the relational operation need the same variable as one of their operands (x in the example code above):

a.getAnOperand() = v.getAnAccess() and r.getAnOperand() = v.getAnAccess()

The other operand of the relational operation must be the addition:

r.getAnOperand() = a

Both operands of the addition must have a width less than 32 bits (4 bytes):

forall(Expr op | op = a.getAnOperand() | op.getType().getSize() < 4)

But if there is an explicit cast on the addition expression, we’re not interested if it’s less than 32 bits:

not a.getExplicitlyConverted().getType().getSize() < 4

(This is so a check like (unsigned short)(x + y) < x doesn’t get flagged by the query.)

Finally, the select clause defines the result set.

This vulnerability was originally reported in Chakra (the JavaScript engine of Edge), where the consequence of that particular ineffective overflow check was memory corruption. The query matched the original vulnerability but no additional variants in Chakra. However, we discovered several when applying this exact query to other Windows components.

Unsafe use of SafeInt

An alternative to rolling your own integer overflow checks is to use a library with such checks built in. SafeInt is a C++ template class that overrides arithmetic operators to throw an exception where overflow is detected.

Here’s an example of how to use it correctly:

int x, y, z;
 
// ...
 
z = SafeInt<int>(x) + y;

But here is an example of how it can be unintentionally misused – the expression passed to the constructor may already have overflowed:

int x, y, z;
 
// ...
 
z = SafeInt<int>(x + y);

How to write a query to detect this? In the previous example, our query only used built-in QL classes. For this one, let’s start by defining our own. For this, we choose one or more QL classes to subclass from (with extends), and define a characteristic predicate which specifies those entities in the snapshot database that are matched by the class:

class SafeInt extends Type {
  SafeInt() {
    this.getUnspecifiedType().getName().matches("SafeInt<%")
  }
}

The QL class Type represents the set of all types in the snapshot database. For the QL class SafeInt, we subset this to just types with a name that begins with “SafeInt<”, thus indicating that they are instantiations of the SafeInt template class. The getUnspecifiedType() predicate is used to disregard typedefs and type specifiers such as const.

Next, we define the subset of expressions that could potentially overflow. Most arithmetic operations can overflow, but not all; this uses QL’s instanceof operator to define which ones. We use a recursive definition because we need expressions such as (x+1)/y to be included, even though x/y is not.

class PotentialOverflow extends Expr {
  PotentialOverflow() {
    (this instanceof BinaryArithmeticOperation    // match   x+y x-y x*y
       and not this instanceof DivExpr            // but not x/y
       and not this instanceof RemExpr)           //      or x%y
 
    or (this instanceof UnaryArithmeticOperation  // match   x++ x-- ++x --x -x
          and not this instanceof UnaryPlusExpr)  // but not +x
 
    // recursive definitions to capture potential overflow in
    // operands of the operations excluded above
    or this.(BinaryArithmeticOperation).getAnOperand() instanceof PotentialOverflow
    or this.(UnaryPlusExpr).getOperand() instanceof PotentialOverflow
  }
}

Finally, we relate these two classes in a query:

from PotentialOverflow po, SafeInt si
where po.getParent().(Call).getTarget().(Constructor).getDeclaringType() = si
select
    po,
    po + " may overflow before being converted to " + si

.(Call) and .(Constructor) are examples of an inline cast, which, similar to instanceof, is another way of restricting which QL classes match. In this case we are saying that, given an expression that may overflow, we’re only interested if its parent expression is some sort of call. Furthermore, we only want to know if the target of that call is a constructor, and if it’s a constructor for some SafeInt.

Like the previous example, this was a query that provided a number of actionable results across multiple codebases.

JavaScript re-entrancy to use-after-free

This next example was a vulnerability in Edge caused by re-entrancy into JavaScript code.

Edge defines many functions that can be called from JavaScript. This model function illustrates the essence of the vulnerability:

HRESULT SomeClass::vulnerableFunction(Var* args, UINT argCount, Var* retVal)
{
    // get first argument -
    // from Chakra, acquire pointer to array
    BYTE* pBuffer;
    UINT bufferSize;
    hr = Jscript::GetTypedArrayBuffer(args[1], &pBuffer, &bufferSize);
 
    // get second argument –
    // from Chakra, obtain an integer value
    int someValue;
    hr = Jscript::VarToInt(args[2], &someValue);
 
    // perform some operation on the array acquired previously
    doSomething(pBuffer, bufferSize);
 

The problem was that when Edge calls back into Chakra, e.g. during VarToInt, arbitrary JavaScript code may be executed. The above function could be exploited by passing it a JavaScript object that overrides valueOf to free the buffer, so when VarToInt returns, pBuffer is a dangling pointer:

var buf = new ArrayBuffer(length);
var arr = new Uint8Array(buf);
 
var param = {}
param.valueOf = function() {
    /* free `buf`
       (code to actually do this would be defined elsewhere) */
    neuter(buf);  // neuter `buf` by e.g. posting it to a web worker
    gc();         // trigger garbage collection
    return 0;
};
 
vulnerableFunction(arr, param);

The specific pattern we’re looking for with QL is therefore: acquisition of a pointer from GetTypedArrayBuffer, followed by a call to some Chakra function that may execute JavaScript, followed by some use of the pointer.

For the array buffer pointer, we match on the calls to GetTypedArrayBuffer, where the second argument (getArgument of Call is zero-indexed) is an address-of expression (&), and take its operand:

class TypedArrayBufferPointer extends Expr {
    TypedArrayBufferPointer() {
        exists(Call c | c.getTarget().getName() = "GetTypedArrayBuffer" and
               this = c.getArgument(1).(AddressOfExpr).getOperand())
    }
}

The exists logical quantifier is used here to introduce a new variable (c) into the scope.

There are several Chakra API functions that could be used for JavaScript re-entrancy. Rather than defining them by name, we can identify the internal Chakra function that performs this task, and use QL to figure this out from the call graph:

// examine call graph to match any function that may eventually call MethodCallToPrimitive
predicate mayExecJsFunction(Function f) {
    exists(Function g | f.calls+(g) and g.hasName("MethodCallToPrimitive")
    }
 
// this defines a call to any of the above functions
class MayExecJsCall extends FunctionCall {
    MayExecJsCall() {
        mayExecJsFunction(this)
    }
}

The “+” suffix of the calls predicate specifies a transitive closure – it applies the predicate to itself until there is a match. This permits a concisely defined exploration of the function call graph.

Finally, this query brings these QL class definitions together, relating them by control flow:

from TypedArrayBufferPointer def, MayExecJsCall call, VariableAccess use, Variable v
where v.getAnAccess() = def
  and v.getAnAccess() = use
  and def.getASuccessor+() = call
  and call.getASuccessor+() = use
select use,
    "Call to " + call + " between definition " + def + " and use " + use

The predicate getASuccessor() specifies the next statement or expression in the control flow. Therefore, using e.g. call.getASuccessor+() = use will follow the control flow graph from call until there is a match to use. The diagram below illustrates this:

    This query uncovered four variants in addition to the originally reported vulnerability, all of them assessed as critical severity.

    That’s all for now. The next instalment will cover using QL for data flow analysis and taint tracking, with examples from our security review of an Azure firmware component.

     

    Steven Hunter, MSRC Vulnerabilities & Mitigations team

    Categories: Security News

    Analysis and mitigation of L1 Terminal Fault (L1TF)

    Mon, 08/13/2018 - 21:24

    In January 2018, Microsoft released an advisory and security updates for a new class of hardware vulnerabilities involving speculative execution side channels (known as Spectre and Meltdown). In this blog post, we will provide a technical analysis of a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF) which has been assigned CVE-2018-3615 (for SGX), CVE-2018-3620 (for operating systems and SMM), and CVE-2018-3646 (for virtualization). This vulnerability affects Intel Core processors and Intel Xeon processors.

    This post is primarily geared toward security researchers and engineers who are interested in a technical analysis of L1TF and the mitigations that are relevant to it. If you are interested in more general guidance, please refer to Microsoft's security advisory for L1TF.

    Please note that the information in this post is current as of the date of this post.

    L1 Terminal Fault (L1TF) overview

    We previously defined four categories of speculation primitives that can be used to create the conditions for speculative execution side channels. Each category provides a fundamental method for entering speculative execution along a non-architectural path, specifically: conditional branch misprediction, indirect branch misprediction, exception delivery or deferral, and memory access misprediction. L1TF belongs to the exception delivery or deferral category of speculation primitives (along with Meltdown and Lazy FP State Restore) as it deals with speculative (or out-of-order) execution related to logic that generates an architectural exception. In this post, we’ll provide a general summary of L1TF. For a more in-depth analysis, please refer to the advisory and whitepaper that Intel has published for this vulnerability.

    L1TF arises due to a CPU optimization related to the handling of address translations when performing a page table walk. When translating a linear address, the CPU may encounter a terminal page fault which occurs when the paging structure entry for a virtual address is not present (Present bit is 0) or otherwise invalid. This will result in an exception, such as a page fault, or TSX transaction abort along the architectural path. However, before either of these occur, a CPU that is vulnerable to L1TF may initiate a read from the L1 data cache for the linear address being translated. For this speculative-only read, the page frame bits of the terminal (not present) page table entry are treated as a system physical address, even for guest page table entries. If the cache line for the physical address is present in the L1 data cache, then the data for that line may be forwarded on to dependent operations that may execute speculatively before retirement of the instruction that led to the terminal page fault. The behavior related to L1TF can occur for page table walks involving both conventional and extended page tables (the latter of which is used for virtualization).

    To illustrate how this might occur, it may help to consider the following simplified example. In this example, an attacker-controlled virtual machine (VM) has constructed a page table hierarchy within the VM with the goal of reading a desired system (host) physical address. The following diagram provides an example hierarchy for the virtual address 0x12345000 where the terminal page table entry is not present but contains a page frame of 0x9a0 as shown below:

    After setting up this hierarchy, the VM could then attempt to read from system physical addresses within [0x9a0000, 0x9a1000) through an instruction sequence such as the following:

    01: 4C0FB600 movzx r8,byte [rax] ; rax = 0x12345040 02: 49C1E00C shl r8,byte 0xc 03: 428B0402 mov eax,[rdx+r8] ; rdx = address of signal array

    By executing these instructions within a TSX transaction or by handling the architectural page fault, the VM could attempt to induce a speculative load from the L1 data cache line associated with the system physical address 0x9a0040 (if present in the L1) and have the first byte of that cache line forwarded to an out-of-order load that uses this byte as an offset into a signal array. This would create the conditions for observing the byte value using a disclosure primitive such as FLUSH+RELOAD, thereby leading to the disclosure of information across a security boundary in the case where this system physical address has not been allocated to the VM.

    While the scenario described above illustrates how L1TF can apply to inferring physical memory across a virtual machine boundary (where the VM has full control of the guest page tables), it is also possible for L1TF to be exploited in other scenarios. For example, a user mode application could attempt to use L1TF to read from physical addresses referred to by not present terminal page table entries within their own address space. In practice, it is common for operating systems to make use of the software bits in the not present page table entry format for storing metadata which could equate to valid physical page frames. This could allow a process to read physical memory not assigned to the process (or VM, in a virtualized scenario) or that is not intended to be accessible within the process (e.g. PAGE_NOACCESS memory on Windows).

    Mitigations for L1 Terminal Fault (L1TF)

    There are multiple mitigations for L1TF and they vary based on the attack category being mitigated. To illustrate this, we’ll describe the software security models that are at risk for L1TF and the specific tactics that can be employed to mitigate it. We’ll reuse the mitigation taxonomy from our previous post on mitigating speculative execution side channels for this. In many cases, the mitigations described in this section need to be combined in order to provide a broad defense for L1TF.

    Relevance to software security models

    The following table summarizes the potential relevance of L1TF to various intra-device attack scenarios that software security models are typically concerned with. Unlike Meltdown (CVE-2017-5754) which only affected the kernel-to-user scenario, L1TF is applicable to all intra-device attack scenarios as indicated by the orange cells (gray cells would have indicated not applicable). This is because L1TF can potentially provide the ability to read arbitrary system physical memory.

    Attack Category Attack Scenario L1TF Inter-VM Hypervisor-to-guest CVE-2018-3646 Host-to-guest CVE-2018-3646 Guest-to-guest CVE-2018-3646 Intra-OS Kernel-to-user CVE-2018-3620 Process-to-process CVE-2018-3620 Intra-process CVE-2018-3620 Enclave Enclave-to-any CVE-2018-3615 VSM-to-any CVE-2018-3646 Preventing speculation techniques involving L1TF

    As we’ve noted in the past, one of the best ways to mitigate a vulnerability is by addressing the issue as close to the root cause as possible. In the case of L1TF, there are multiple mitigations that can be used to prevent speculation techniques involving L1TF.

    Safe page frame bits in not present page table entries

    One of the requirements for an attack involving L1TF is that the page frame bits of a terminal page table entry must refer to a valid physical page that contains sensitive data from another security domain. This means a compliant hypervisor and operating system kernel can mitigate certain attack scenarios for L1TF by ensuring that either 1) the physical page referred to by the page frame bits of not present page table entries always contain benign data and/or 2) a high order bit is set in the page frame bits that does not correspond to accessible physical memory. In the case of #2, the Windows kernel will use a bit that is less than the implemented physical address bits supported by a given processor in order to avoid physical address truncation (e.g. dropping the high order bit).

    Beginning with the August, 2018 Windows security updates, all supported versions of the Windows kernel and the Hyper-V hypervisor ensure that #1 and #2 are automatically enforced on hardware that is vulnerable to L1TF. This is enforced both for conventional page table entries and extended page table entries that are not present. On Windows Server, this mitigation is disabled by default and must be enabled as described in our published guidance for Windows Server.

    To illustrate how this works, consider the following example of a user mode virtual address that is not accessible and therefore has a not present PTE. In this example, the page frame bits still refer to what could be interpreted as a valid physical address in conjunction with L1TF:

    26: kd> !pte 0x00000281`d84c0000 … PTE at FFFFB30140EC2600 … contains 0000000356CDEB00 … not valid … Transition: 356cde … Protect: 18 - No Access 26: kd> dt nt!HARDWARE_PTE FFFFB30140EC2600 +0x000 Valid : 0y0 +0x000 Write : 0y0 +0x000 Owner : 0y0 +0x000 WriteThrough : 0y0 +0x000 CacheDisable : 0y0 +0x000 Accessed : 0y0 +0x000 Dirty : 0y0 +0x000 LargePage : 0y0 +0x000 Global : 0y1 +0x000 CopyOnWrite : 0y1 +0x000 Prototype : 0y0 +0x000 reserved0 : 0y1 +0x000 PageFrameNumber : 0y000000000000001101010110110011011110 (0x356cde) +0x000 reserved1 : 0y0000 +0x000 SoftwareWsIndex : 0y00000000000 (0) +0x000 NoExecute : 0y0

    With the August, 2018 Windows security updates applied, it’s possible to observe the behavior of setting a high order bit in the not present page table entry that refers to physical memory that is either inaccessible or guaranteed to be benign (in this case bit 45). Since this does not correspond to an accessible physical address, any attempt to read from it using L1TF will fail.

    17: kd> !pte 0x00000196`04840000 … PTE at FFFF8000CB024200 … contains 0000200129CB2B00 … not valid … Transition: 200129cb2 … Protect: 18 - No Access 17: kd> dt nt!HARDWARE_PTE FFFF8000CB024200 +0x000 Valid : 0y0 +0x000 Write : 0y0 +0x000 Owner : 0y0 +0x000 WriteThrough : 0y0 +0x000 CacheDisable : 0y0 +0x000 Accessed : 0y0 +0x000 Dirty : 0y0 +0x000 LargePage : 0y0 +0x000 Global : 0y1 +0x000 CopyOnWrite : 0y1 +0x000 Prototype : 0y0 +0x000 reserved0 : 0y1 +0x000 PageFrameNumber : 0y001000000000000100101001110010110010 (0x200129cb2) +0x000 reserved1 : 0y0000 +0x000 SoftwareWsIndex : 0y00000000000 (0) +0x000 NoExecute : 0y0

    In order to provide a portable method of allowing VMs to determine the implemented physical address bits supported on a system, the Hyper-V hypervisor Top-Level Functional Specification (TLFS) has been revised with a defined interface that can be used by a VM to query this information. This facilitates safe migration of virtual machines within a migration pool.

    Flush L1 data cache on security domain transition

    Disclosing information through the use of L1TF requires sensitive data from a victim security domain to be present in the L1 data cache (note, the L1D is shared by all LPs on the same physical core). This means disclosure can be prevented by flushing the L1 data cache when transitioning between security domains. To facilitate this, Intel has provided new capabilities through a microcode update that supports an architectural interface for flushing the L1 data cache.

    Beginning with the August, 2018 Windows security updates, the Hyper-V hypervisor now uses the new L1 data cache flush feature when present to ensure that VM data is removed from the L1 data cache at critical points. On Windows Server 2016+ and Windows 10 1607+, the flush occurs when switching virtual processor contexts between VMs. This helps reduce the performance impact of the flush by minimizing the number of times this needs to occur. On previous versions of Windows, the flush occurs prior to executing a VM (e.g. prior to VMENTRY).

    For L1 data cache flushing in the Hyper-V hypervisor to be robust, the flush is performed in combination with safe use or disablement of HyperThreading and per-virtual-processor hypervisor address spaces.

    For SGX enclave scenarios, the microcode update provided by Intel ensures that the L1 data cache is flushed any time the logical processor exits enclave execution mode. The microcode update also supports attestation of whether HT has been enabled by the BIOS. When HT is enabled, there is a possibility of L1TF attacks from a sibling logical processor before enclave secrets in L1 data cache are flushed or cleared. The entity verifying the attestation may reject attestations from a HT-enabled system if it deems the risk of L1TF attacks from the sibling logic processor to not be acceptable.

    Safe scheduling of sibling logical processors

    Intel’s HyperThreading (HT) technology, also known as simultaneous multithreading (SMT), allows multiple logical processors (LPs) to execute simultaneously on a physical core. Each sibling LP can be simultaneously executing code in different security domains and privilege modes. For example, one LP could be executing in the hypervisor while another is executing code within a VM. This has implications for the L1 data cache flush because it may be possible for sensitive data to reenter the L1 data cache via a sibling LP after the L1 data cache flush occurs.

    In order to prevent this from happening, the execution of code on sibling LPs must be safely scheduled or HT must be disabled. Both of these approaches ensure that the L1 data cache for a core does not become polluted with data from another security domain after a flush occurs.

    The Hyper-V hypervisor on Windows Server 2016 and above supports a feature known as the core scheduler which ensures that virtual processors executing on a physical core always belong to the same VM and are described to the VM as sibling hyperthreads. This feature requires administrator opt-in for Windows Server 2016 and is enabled by default starting with Windows Server 2019. This, in combination with per-virtual-processor hypervisor address spaces, is what makes it possible to defer the L1 data cache flush to the point at which a core begins executing a virtual processor from a different VM rather than needing to perform the flush on every VMENTRY. For more details on how this is implemented in Hyper-V, please refer to the in-depth Hyper-V technical blog on this topic.

    The following diagram illustrates the differences in virtual processor scheduling policies for a scenario with two different VMs (VM 1 and VM 2). As the diagram shows, without core scheduling enabled it is possible for code from two different VMs to execute simultaneously on a core (in this case core 2), whereas this is not possible with core scheduling enabled.

    On versions of Windows prior to Windows Server 2016 and for all versions of Windows Client with virtualization enabled, the core scheduler feature is not supported and it may therefore be necessary to disable HT in order to ensure the robustness of the L1 data cache flush for inter-VM isolation. This is also currently necessary on Windows Server 2016+ for scenarios that make use of Virtual Secure Mode (VSM) for isolation of secrets. When HT is disabled, it becomes impossible for sibling logical processors to execute simultaneously on the same physical core. For guidance on how to disable HT on Windows, please refer to our advisory.

    Removing sensitive content from memory

    Another tactic for mitigating speculative execution side channels is to remove sensitive content from the address space such that it cannot be disclosed through speculative execution.

    Per-virtual-processor address spaces

    Until the emergence of speculative execution side channels, there was not a strong need for hypervisors to partition their virtual address space on a per-VM basis. As a result, it has been common practice for hypervisors to maintain a virtual mapping of all physical memory to simplify memory accesses. The existence of L1TF and other speculative execution side channels has made it desirable to eliminate cross-VM secrets from the virtual address space of the hypervisor when it is acting on behalf of a VM.

    Beginning with the August, 2018 security update, the Hyper-V hypervisor in Windows Server 2016+ and Windows 10 1607+ now uses per-virtual-processor (and hence per-VM) address spaces and also no longer maps all of physical memory into the virtual address space of the hypervisor. This ensures that only memory that is allocated to the VM and the hypervisor on behalf of the VM is potentially accessible during speculation for a given virtual processor. In the case of L1TF, this mitigation works in combination with the L1 data cache flush and safe use or disablement of HT to ensure that no sensitive cross-VM information becomes available in the L1.

    Mitigation applicability

    The mitigations that were described in the previous sections work in combination to provide broad protection for L1TF. The following tables provide a summary of the attack scenarios and the relevant mitigations and default settings for different versions of Windows Server and Windows Client:

    Attack Category Windows Server version Windows Client version Windows Server 2016+ Pre-Windows Server 2016 Windows 10 1607+ Pre-Windows 10 1607 Inter-VM Enabled: per-virtual-processor address spaces, safe page frame bits

    Opt-in: L1 data cache flush, enable core scheduler or disable HT Enabled: safe page frame bits

    Opt-in: L1 data cache flush, disable HT Enabled: per-virtual-processor address spaces, safe page frame bits

    Opt-in: L1 data cache flush, disable HT Enabled: safe page frame bits

    Opt-in: L1 data cache flush, disable HT Intra-OS Opt-in: safe page frame bits Enabled: safe page frame bits Enclave Enabled (SGX): L1 data cache flush

    Opt-in (SGX/VSM): disable HT

    More concisely, the relationship between attack scenarios and mitigations for L1TF is summarized below:

    Mitigation Tactic Mitigation Name Inter-VM Intra-OS Enclave Prevent speculation techniques Flush L1 data cache on security domain transition Safe scheduling of sibling logical processors Safe page frame bits in not present page table entries Remove sensitive content from memory Per-virtual-processor address spaces Wrapping up

    In this post, we analyzed a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF). This vulnerability affects a broad range of attack scenarios and the relevant mitigations require a combination of software and firmware (microcode) updates for systems with affected Intel processors. The discovery of L1TF demonstrates that research into speculative execution side channels is ongoing and we will continue to evolve our response and mitigation strategy accordingly. We continue to encourage researchers to report new discoveries through our Speculative Execution Side Channel bounty program.

    Matt Miller
    Microsoft Security Response Center (MSRC)

    Categories: Security News