Applications running on large multi-socket machines with 1 to 100 TB of memory suffer from non-uniform bandwidth and latency when accessing physical memory. To mitigate these effects, past research has focused only on data placement policies for large non-uniform memory access (NUMA) machines.
A research team led by VMware researcher Jayneel Gandhi has discovered another performance issue on NUMA machines which was previously ignored – sub-optimal page-table placement. To resolve the issue, the team proposed a new design, called Mitosis, for boosting application performance on large machines by migrating and replicating page-tables instead of application data across sockets.
Their study, “Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines,” is the first to make the case for an explicit page-table allocation policy. The study shows that page-table placement is becoming crucial to application performance on large machines, which are increasingly prevalent.
The team implemented Mitosis in Linux and evaluated its benefits on real hardware, showing that it improves performance for big-data multi-socket applications by up to 1.3 times. Moreover, Mitosis improves application performance by up to 3.2 times in cases where the operating system scheduler migrates a process across sockets.
“A three-times increase in performance is a huge improvement for customers who use Big Data applications like databases on large machines,” said Jayneel Gandhi. “When you’re ordering an airline ticket, for example, your ticket processing becomes a lot faster. And from the airline’s perspective, they can sell a lot more tickets in the same amount of time.”
The research team notes that as databases scale onto ever larger machines, the number of sockets and the amount of physical memory grow as well, making page-table placement a more significant problem. On large multi-socket machines, data placement policies typically replicate part of a database on different sockets so it can be accessed faster. But this carries a high memory overhead, which grows with the size of the database and of the machine it runs on.
The Mitosis study shows that page-table replication incurs negligible memory overhead, can be implemented efficiently and delivers substantial performance improvements. These gains come at a cost of only 0.6% memory overhead, compared to the exorbitant memory cost of data replication.
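To see why replicating page-tables is so much cheaper than replicating data, a back-of-the-envelope estimate helps. The sketch below is not taken from the study. It assumes 4 KiB pages, 8-byte page-table entries, a densely populated four-level page-table, and a hypothetical four-socket machine with one replica per socket. Under those assumptions the extra memory lands close to the figure quoted above, though the study's own accounting may differ.

```python
PAGE_SIZE = 4096   # bytes per page (assumed 4 KiB base pages)
PTE_SIZE = 8       # bytes per page-table entry (assumed)

def page_table_fraction(levels=4):
    """Fraction of mapped memory taken by a densely populated page-table."""
    entries_per_table = PAGE_SIZE // PTE_SIZE  # 512 entries per table page
    # The leaf level costs PTE_SIZE bytes per mapped page; each level above
    # it is a further factor of entries_per_table smaller.
    return sum(PTE_SIZE / PAGE_SIZE / entries_per_table**i for i in range(levels))

def replication_overhead(num_sockets):
    """Extra memory, as a fraction of mapped memory, for one replica per socket."""
    return (num_sockets - 1) * page_table_fraction()

if __name__ == "__main__":
    print(f"page-table share of mapped memory: {page_table_fraction():.3%}")
    print(f"overhead with 4 sockets: {replication_overhead(4):.2%}")
```

By contrast, replicating the data itself on every socket would cost a multiple of the application's entire footprint.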
HOW MITOSIS WORKS
Mitosis has two components: a mechanism to enable efficient page-table replication and migration, and policies to effectively manage and control that replication and migration.
The illustration shows how Mitosis can replicate the page tables on each socket where a multi-socket application is running. Currently, an address translation can result in up to four remote accesses to page-tables. However, with Mitosis-based replication, an address translation results in up to four local accesses to the page-table, precluding the need for any remote memory accesses during page-table walks.
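The effect of replication on a page-table walk can be illustrated with a toy model. This is a minimal sketch, not the Mitosis implementation: it assumes a four-level walk and represents page-table placement as a simple list of socket numbers, one per level. The function names are hypothetical.

```python
LEVELS = 4  # page-table levels touched by one address translation (assumed)

def remote_accesses(walker_socket, table_sockets):
    """Count the page-table accesses that must leave the walking socket."""
    return sum(1 for socket in table_sockets if socket != walker_socket)

def walk_without_replication(walker_socket, placement):
    """placement[i] is the socket holding the page-table page at level i."""
    assert len(placement) == LEVELS
    return remote_accesses(walker_socket, placement)

def walk_with_replication(walker_socket):
    """With a full replica on every socket, each level is read locally."""
    local_replica = [walker_socket] * LEVELS
    return remote_accesses(walker_socket, local_replica)

if __name__ == "__main__":
    # Worst case: every page-table page lives on socket 0, walker runs on socket 1.
    print(walk_without_replication(1, [0, 0, 0, 0]))  # all four accesses are remote
    print(walk_with_replication(1))                   # no remote accesses
```

In the model, the number of remote accesses without replication depends entirely on where the page-table pages happen to sit, while the replicated case is always local regardless of the socket doing the walk.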
Additionally, single-socket workloads suffer performance losses when processes are migrated across sockets while page-tables are not. The illustration shows that, when a process is migrated from socket 0 to socket 1, the NUMA memory manager transparently migrates data pages, but page-table pages remain on socket 0. In contrast, Mitosis migrates the page-tables along with the data. This eliminates remote memory accesses for page-table walks, improving performance significantly.
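The migration scenario can be modeled the same way. The sketch below is illustrative only and assumes a single-socket process whose page-table lives entirely on one socket. The `Process` record and the `move_page_tables` flag are hypothetical names, not part of Mitosis or Linux.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Process:
    cpu_socket: int         # socket the process currently runs on
    data_socket: int        # socket holding its data pages
    page_table_socket: int  # socket holding its page-table pages

def migrate(proc, target, move_page_tables):
    """Move the process and its data; move page-tables only if requested."""
    new_pt_socket = target if move_page_tables else proc.page_table_socket
    return replace(proc, cpu_socket=target, data_socket=target,
                   page_table_socket=new_pt_socket)

def remote_walk_accesses(proc, levels=4):
    """Remote accesses per page-table walk (all levels local or all remote)."""
    return 0 if proc.page_table_socket == proc.cpu_socket else levels

if __name__ == "__main__":
    start = Process(cpu_socket=0, data_socket=0, page_table_socket=0)
    left_behind = migrate(start, target=1, move_page_tables=False)
    moved_along = migrate(start, target=1, move_page_tables=True)
    print(remote_walk_accesses(left_behind))  # page-tables stayed on the old socket
    print(remote_walk_accesses(moved_along))  # page-tables followed the process
```

The first case corresponds to page-table pages being left behind on the original socket, the second to page-tables migrating along with the data.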
Mitosis builds on widely used operating system mechanisms, such as page faults and system calls, and applies to most commodity operating systems. An important feature of Mitosis is that it requires no changes to applications or hardware and can be enabled on a per-application basis.
The Mitosis research team includes VMware Research interns Reto Achermann and Ashish Panwar, and academic collaborators: Abhishek Bhattacharjee (Yale University), Timothy Roscoe (ETH Zurich), Arkaprava Basu (IISc Bangalore), and Gopinath Kanchi (IISc Bangalore).
The research team’s next step will be to approach virtualization in the VMware hypervisor. Understanding page-table placement in virtualized systems is a major undertaking and will require additional research and a separate study.
The research team has released Mitosis for native systems. It is open-sourced and available for use here.