Hybrid GPU-Virtualization to Reduce Practical Power Consumption of the VM for Collaborative RDS
1. Introduction
Graphics Processing Units (GPUs) are used for 3D CAD, 3D CG, 3D video games, virtual reality systems, and machine learning for AI, as GPUs perform the parallel calculations behind matrix-matrix multiplication for rendering and other engineering workloads faster than CPUs. In commercial cloud computing, providers dedicated to VDI solutions and general-purpose GPU time-sharing providers offer virtual GPUs (vGPUs). However, these services mainly target corporate use, and prices exceed 200 USD per month. In the Micro-VPC, my small self-hosted virtual private cloud, I want to provide virtual GPUs with the remote desktop sharing of the Collaborative RDS for individuals. In the future, I am considering cloning the Micro-VPC and sharing it as a computer cluster among individuals to protect individual rights from oligopolistic corporations, for example, against oligopoly pricing and the commercial use of personal data.
For this project, I first implemented the hybrid/single GPU configurations of 'real' personal computers in virtual machines. Secondly, I measured the power consumption of the Micro-VPC to seek an optimal system design for sharing the virtual GPUs, as power consumption is a crucial concern for individual operators. For the third step, I plan to control the GPU's power consumption and performance depending on the network speed and on several GPU 'use cases' such as 3D CAD/CG and virtual reality systems.
So far, I have found that the graphics management of 'PRIME GPU Offloading' (see Fig. 1) is beneficial for GPU virtualization and for sharing vGPUs with VMs when low electricity consumption matters. GPU virtualization has also enhanced the graphics (3D model rendering and video) and improved the interactive response time of desktop applications on the Collaborative RDS. I report here on steps one and two and the benefits obtained by those steps.
2. Hybrid Graphics and Single Graphics
Figure 1 shows three types of system management for hybrid/single graphics on 'real' personal computers and engineering workstations. PRIME, used in hybrid graphics management, is an open-source implementation of Nvidia's Optimus and AMD's Dynamic Switchable Graphics for Radeon. In Figure 1, iGPU refers to the GPU integrated into the CPU as a System-on-a-Chip (SoC), such as Intel UHD 630 Graphics in Intel processors, and dGPU refers to a discrete GPU provided as a peripheral card, such as the Nvidia GeForce and AMD Radeon series. Usually, dGPU cards are more powerful and consume more electricity than an iGPU chip. Applications of the graphics system differ greatly depending on which GPU serves as the primary GPU.
PRIME GPU Offloading, which configures the iGPU as the primary GPU and the dGPU as the secondary GPU, applies to general-purpose engineering such as CAD, CG, ray-tracing calculations in CG, and machine learning for AI. Reverse PRIME, which uses the dGPU as the primary GPU, enables full-time 3D rendering across the whole screen, for dedicated systems such as virtual reality and video games. I configured the two real hybrid types and one single type of graphics management in the virtual machines.
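As a minimal sketch of how PRIME GPU Offloading behaves at run time: the desktop composites on the iGPU, and individual programs are pushed to the dGPU per invocation through environment variables. The wrapper names below are my own, not part of any driver.

```shell
# Run one program on the secondary (discrete) GPU while the desktop
# stays on the iGPU. DRI_PRIME is honored by the Mesa drivers
# (Radeon, Intel); Nvidia's proprietary driver uses its own variables.
prime() { DRI_PRIME=1 "$@"; }

# Equivalent offload wrapper for the Nvidia proprietary driver:
nv_prime() { __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia "$@"; }
```

With mesa-utils installed, `prime glxinfo | grep "OpenGL renderer"` should report the dGPU, while plain `glxinfo` reports the iGPU.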
3. Dual PCI-Passthrough for the Hybrid Graphics
To implement hybrid graphics in virtual machines on the Micro-VPC, I used two PCI passthrough technologies: Open Virtual Machine Firmware (OVMF), which the TianoCore Community has been developing, and Intel GVT-g for Intel Core processor families. With TianoCore's OVMF, a virtual machine fully controls the passed-through GPU while that GPU is inaccessible from the host. Intel GVT-g allows the host and other VMs to keep accessing the real GPU during virtualization; however, it supports only the integrated GPUs of Intel processor families up to Generation 11 (code name Rocket Lake).
Under the requirement that hybrid graphics configurations need at least two GPU virtualizations, I used OVMF for the dGPU and GVT-g for the iGPU, as the host must use the iGPU. Figure 2 is a diagram showing the combination of the dual GPU passthrough technologies. This combination enables PRIME GPU Offloading and Reverse PRIME by slightly modifying the configuration, without interference between them. Single passthroughs of the dGPU and iGPU are also available, respectively.
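The GVT-g half of Figure 2 amounts to carving a mediated vGPU out of the host iGPU through sysfs (the dGPU half is an ordinary vfio-pci passthrough booted with OVMF). The sketch below uses an assumed PCI address and vGPU type; the types your iGPU actually offers are listed under `mdev_supported_types`.

```shell
# Create a GVT-g mediated vGPU on the host iGPU (assumed at 0000:00:02.0).
# Available types: ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
IGPU=0000:00:02.0
GVT_TYPE=i915-GVTg_V5_4   # assumed vGPU profile; pick one your host lists
VGPU_UUID=$(cat /proc/sys/kernel/random/uuid)
CREATE=/sys/bus/pci/devices/$IGPU/mdev_supported_types/$GVT_TYPE/create
if [ -w "$CREATE" ]; then
    echo "$VGPU_UUID" > "$CREATE"   # instantiate the vGPU; attach it to a VM by UUID
    echo "created GVT-g vGPU $VGPU_UUID"
else
    echo "GVT-g type $GVT_TYPE not available on $IGPU" >&2
fi
```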
For Single Root I/O Virtualization (SR-IOV), which applies to the iGPUs built into 12th-generation and later Intel processors, there is a driver for specific Linux kernel versions, i915-sriov-dkms by Strongtz [6]. However, I am facing the problem that the Radeon and Nvidia graphics drivers do not work alongside this driver on the kernel. Thus, I have not yet succeeded with hybrid graphics on the latest Intel processors.
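For reference, the SR-IOV setup I attempted follows the i915-sriov-dkms project's documented pattern: kernel parameters enabling GuC submission and a VF limit, then enabling virtual functions through sysfs. The PCI address and VF count below are assumptions.

```shell
# Kernel command line additions per the i915-sriov-dkms README (assumed):
#   intel_iommu=on i915.enable_guc=3 i915.max_vfs=7
# After booting the patched i915, create the virtual functions:
IGPU=0000:00:02.0   # assumed PCI address of the iGPU
VF_COUNT=7
NUMVFS=/sys/bus/pci/devices/$IGPU/sriov_numvfs
if [ -w "$NUMVFS" ]; then
    echo "$VF_COUNT" > "$NUMVFS"   # VFs appear as extra PCI functions to pass to VMs
else
    echo "SR-IOV not available on $IGPU" >&2
fi
```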
4. Hardware Installation
I used low-end GPUs, the AMD Radeon RX 6400 and the Nvidia GeForce GTX 1650, for the power consumption measurement as shown in Figure 3. I installed the Radeon RX 6400, a PCIe 4.0 x4 card, into a PCIe 3.0 x4 slot, and the GeForce GTX 1650, a PCIe 3.0 x16 card, into a PCIe 3.0 x16 slot. Running the RX 6400's PCIe 4.0 interface backward-compatibly in the PCIe 3.0 slot halves its link bandwidth and degrades the performance of this GPU. However, I confirmed that all the specified functions of the Radeon RX 6400 performed well except for a low benchmark score.
5. Power Consumption Measurement
Figure 4 shows the conditions of the power consumption measurement of the host under the Unigine Heaven Benchmark load test. I created a practical situation in which a tester accesses, via the Internet, a virtual GPU server running inside the Micro-VPC, the host of the virtual machine network. I measured the power consumption of the Micro-VPC while executing the Unigine Heaven Benchmark program on the virtual GPU server.
The Radeon RX 6400 and GeForce GTX 1650 each have a system health monitoring program, 'rocm-smi' and 'nvidia-smi,' respectively. However, these monitoring programs report lower power consumption than what is measured at the host's power plug. Hosting the graphics card in the motherboard's PCIe slot may require more power than the data shown by each card's health monitoring program. Thus, I treat the power consumption measured at the power plug of the Micro-VPC as the actual power consumption.
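The gap between the wall-plug reading and the SMI-reported draw estimates the overhead of hosting the card (slot, VRM, and fan losses). The helper below is illustrative arithmetic over the figures in this section, not a live measurement; the query commands in the comments are the standard forms of each vendor tool.

```shell
# Estimate card-hosting overhead from two readings, in watts.
overhead_w() { echo $(( $1 - $2 )); }   # overhead_w <plug_watts> <smi_watts>

# Board-level draw (a lower bound) can be queried with, e.g.:
#   nvidia-smi --query-gpu=power.draw --format=csv,noheader
#   rocm-smi --showpower
overhead_w 40 35    # -> 5
```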
Table 1 shows the results of the power consumption measurement of the Micro-VPC with the Radeon RX 6400 and Intel UHD 630 Graphics. PRIME GPU Offloading and Reverse PRIME can select and switch between the dGPU and iGPU to run the Unigine Heaven 3D graphics program. Both graphics management schemes can also decrease power consumption while the 3D video program keeps running without any interruption. The significant benefit found in the measurement is the inactive-state power consumption of PRIME GPU Offloading, 30W to 40W. The inactive-state power consumption of the host without the dGPU is 30W to 35W. Thus, installing the graphics card in the motherboard's PCIe slot consumes around 5W. During the inactive state, the condition of the dGPU reported by the virtual GPU server is "D3hot," which means that "the power is mostly removed from the device but not from the computer as a whole."
In contrast, the status "D0" reported by the virtual GPU server means the GPU is fully powered. Therefore, installing the dGPU card on the motherboard may require an additional 5W. The Micro-VPC uses a low-power CPU, the Intel Core i9-10900'T,' which has a TDP of 35W, and no peripheral devices that consume high power, such as hard disks, whereas usual desktop personal computers consume at least 65W. The power consumption with the dGPU, Radeon RX 6400, stayed between 30W and 40W, which can be seen as maintaining the low-power characteristics of the Micro-VPC, especially compared with the usual 65W of desktop computers.
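One way to check the reported D3hot/D0 state from a Linux host is the `power_state` attribute that recent kernels expose for PCI devices; a sketch, with the dGPU's PCI address assumed (find the real one with `lspci`):

```shell
# Read the PCI power state of the dGPU (assumed at 0000:01:00.0).
# Prints "D3hot" while the card is suspended, "D0" when fully powered.
DGPU=0000:01:00.0
STATE=/sys/bus/pci/devices/$DGPU/power_state
if [ -r "$STATE" ]; then
    cat "$STATE"
else
    echo "no PCI device at $DGPU" >&2
fi
```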
Table 2 shows the results of the power consumption measurement of the Micro-VPC with the GeForce GTX 1650 and Intel UHD 630 Graphics. Currently, the GeForce GTX 1650 cannot be electrically detached from the motherboard during the passthrough, as can be done with the Radeon RX 6400. I am still investigating why the dGPU (Nvidia GTX 1650) will not power down while in the inactive state.
6. Practical Power Consumption
I originally designed the Micro-VPC for low power consumption. Implementing a vGPU to improve the graphics performance of the Micro-VPC increases power consumption, at least when the vGPU is in use. However, the Micro-VPC runs continuously 24 hours a day, 365 days a year, but does not always use vGPUs. Therefore, power consumption should be reduced as much as possible when the GPU is not in use. I suspended the AMD Radeon RX 6400 GPU when the Micro-VPC did not use it, to reduce the overall power consumption. I also clocked down the Micro-VPC's CPU (Intel Core i9-10900T) by 15%. As a result, the power consumption of the entire Micro-VPC was 35.9W when the GPU was suspended. (Note: Desktop computers consume more than 65W.)
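The two measures above can be sketched with the standard Linux interfaces: PCI runtime power management to let the kernel autosuspend the idle dGPU, and the cpufreq sysfs knobs to cap the CPU clock by 15%. The PCI address is an assumption; the guards make the script a no-op on machines without these devices or permissions.

```shell
# 1) Allow the kernel to autosuspend the idle dGPU (runtime PM).
DGPU=0000:01:00.0   # assumed PCI address of the Radeon RX 6400
PM=/sys/bus/pci/devices/$DGPU/power/control
[ -w "$PM" ] && echo auto > "$PM"   # "auto" permits autosuspend; default is "on"

# 2) Cap the CPU frequency to 85% of its maximum (a 15% clock-down).
cap_khz() { echo $(( $1 * 85 / 100 )); }
for p in /sys/devices/system/cpu/cpufreq/policy*; do
    [ -w "$p/scaling_max_freq" ] || continue
    echo "$(cap_khz "$(cat "$p/cpuinfo_max_freq")")" > "$p/scaling_max_freq"
done
```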
When idling and suspended, the Nvidia GeForce GTX 1650 consumes over 20% more power than the AMD Radeon RX 6400. I decided to use Radeon GPUs for the Collaborative RDS, as Nvidia's GPUs are unsuitable for the low-power concept of the Micro-VPC, at least with the current Nvidia driver.
References