11th Jan 2018 by Kurt Garloff

Outlook

With updated images, Intel microcode updates, the KPTI kernel workarounds in our infrastructure, and workarounds in our KVM and Xen hypervisors, we will have addressed Spectre-2 and Meltdown-3, thus resolving all known scenarios that threaten process, container and virtualization isolation. More remains to be done.
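Whether these mitigations are actually active on a given Linux machine can be checked from userspace. The following is a minimal sketch, assuming a kernel recent enough to export status files under /sys/devices/system/cpu/vulnerabilities (older or unpatched kernels do not have this directory). The helper names read_status and print_status are made up for this illustration, and the sysfs file names meltdown, spectre_v1 and spectre_v2 correspond to what this article calls Meltdown-3, Spectre-1 and Spectre-2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the running kernel exports its mitigation status here. */
#define VULN_DIR "/sys/devices/system/cpu/vulnerabilities"

/* Read the first line of <dir>/<name> into buf, without the trailing
 * newline. Returns 0 on success, -1 if the file cannot be read. */
static int read_status(const char *dir, const char *name,
                       char *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n;

    assert(dir != NULL && name != NULL && buf != NULL);
    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path || len == 0)
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print what the kernel reports for the three attack vectors. */
static void print_status(const char *dir)
{
    static const char *const names[] = {
        "meltdown", "spectre_v1", "spectre_v2"
    };
    char line[256];
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (read_status(dir, names[i], line, sizeof line) == 0)
            printf("%-12s %s\n", names[i], line);
        else
            printf("%-12s (not reported by this kernel)\n", names[i]);
    }
}
```

Calling print_status(VULN_DIR) from a main function prints one line per attack vector. The wording of each status line is chosen by the kernel and differs between kernel versions, so it is best treated as information for a human rather than parsed strictly.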

The KPTI approach to mitigating the Meltdown-3 attack vector is well understood and can be considered mature. It has a performance price; on CPUs that support address space identifiers, the cost seems to be below 10% in many real-world scenarios. Some minor improvements may be found, but the author does not expect radical improvements here.

Spectre-2 (branch target injection via BTB poisoning) is more complex; as of calendar week 2, the mitigation mechanisms are still under discussion. Intel's approach of using microcode updates that limit branch prediction when crossing protection boundaries, together with kernel (IBRS) and hypervisor patches, seems to address the issue comprehensively. But the performance cost on Broadwell (v4) seems huge -- we hope that the final microcode update addresses this.

For some processors (fortunately none in OTC), microcode updates are not available -- there, alternative approaches such as Google's retpolines might be used. Other approaches are conceivable.

What is not fully addressed this way is the somewhat less severe Spectre-1 vulnerability. Userspace applications, and especially interpreter/JIT runtimes, will need additional code changes, most probably supported by compiler enhancements, to make all their security checks safe again and prevent unauthorized data reads.
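To give an idea of what such a code change can look like, here is a minimal sketch of index masking, one of the techniques discussed for hardening bounds checks against Spectre-1. It is similar in spirit to the Linux kernel's array_index_nospec helper but is not a copy of it; the function names index_mask_nospec and load_checked are invented for this illustration. The idea is that even if the CPU speculates past the bounds check, the index has been forced to 0 by data-dependent arithmetic rather than by a predicted branch. Note the assumptions: the size must not exceed half the range of size_t, and a plain C compiler is in principle free to turn the arithmetic back into a branch, which is why production implementations add inline assembly or compiler support.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a mask of all ones if index < size and 0 otherwise,
 * computed without a conditional branch.
 * Assumption: size <= SIZE_MAX / 2 + 1. */
static size_t index_mask_nospec(size_t index, size_t size)
{
    const unsigned top_bit = sizeof(size_t) * CHAR_BIT - 1;
    /* The top bit of (index | (size - 1 - index)) is set exactly when
     * index is out of bounds: either index itself is huge, or
     * size - 1 - index wrapped around. Invert and isolate that bit. */
    size_t in_bounds = (size_t)~(index | (size - 1u - index)) >> top_bit;

    return (size_t)0 - in_bounds; /* 1 -> all ones, 0 -> 0 */
}

/* Bounds-checked table lookup that also clamps the index on the
 * speculative path. Returns 0 for out-of-range indices. */
static uint8_t load_checked(const uint8_t *table, size_t size, size_t index)
{
    assert(table != NULL);
    if (index >= size)
        return 0;
    /* Architecturally a no-op here; under misspeculation of the
     * check above, the mask is 0 and the load hits table[0]. */
    index &= index_mask_nospec(index, size);
    return table[index];
}
```

The masking costs a handful of arithmetic instructions per access, so it is typically applied only to checks that guard attacker-controlled indices, not to every array access.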

We expect the Spectre-1 updates to keep the industry busy for a while and expect to see many application updates over the next months to strengthen the protection against the described CPU issues. We will of course apply these updates when they affect us and provide them to customers in our online repository mirrors and via updated images.

The pattern -- CPUs not fully cleaning up microarchitectural state that is observable after aborted speculation -- may lead to future discoveries of new attack vectors. While cache effects have been exploited in the three attack vectors from the researchers, there is more microarchitectural state to be explored, such as cache snooping traffic on multiprocessor systems. The author expects that a few more attack approaches will be discovered and that not all of them will already be addressed by the current mitigation mechanisms.

Future CPU generations will have to be more careful about speculatively delaying permission checks and more careful about cleaning up. In order not to disable speculation completely, which would result in very poor single-thread performance, programmers (assisted by tools) may need to annotate checks that are security relevant to ensure that speculation can be limited selectively. It will take a while for CPU designers to come up with new and more secure designs and for the software industry to understand that speculation may make regular checks not safe enough.

The CPU vendors have known about this issue for half a year now -- given the lead times in CPU engineering, it will probably take another year before a new generation of CPUs with better speculation containment is available in mass quantities to replace the current flawed CPUs. One can only hope that the mitigations found and implemented by then are good enough to prevent major security breaches without reducing performance to a degree that they become impractical. We might also observe shifts in the market towards vendors that are less exposed to the design flaws or have proven to react better than others.

To protect themselves, IT users will always need to build systems that allow them to react quickly and deploy patches with confidence and a short lead time. This was true before (to deploy fixes for more traditional issues) and has become even more true with the new class of issues. The continuous integration and deployment systems that are used by modern software engineering and DevOps practices are of course very helpful in this regard.