Thursday, April 13, 2017

A Threesome on the Linux Kernel: Intel OpenCL r4.0, VirtualBox 5.0.18 and aufs4 (Docker 1.12.3), all on Kernel 4.7.0-040700-generic

Since I have a small GPU on my HP Envy 15t-ae100 running Ubuntu 16.04, the idea of getting it set up as a GPGPU device had been lingering on my to-do list for quite some time. While Intel's website claimed that it could be done with an OpenCL compatibility kernel patch, I was reluctant to follow it for fear of breaking my hard-built system, which contained applications and projects for my workplace environment, set up over the course of several months.

Finally, one weekend I made a decision to try it all out. After making a backup of all my data, I went ahead and started off with Intel's PDF guide. Happy to say, everything went perfectly and smoothly; there were no detours, and all instructions worked like a charm.

Once I rebooted after installing the patched kernel and fired up BOINC Manager, it happily reported 1 OpenCL GPU (yay!) and 1 OpenCL CPU with 0.00589 compute units, meaning that the real power of the GPU had been exposed by the OpenCL platform :). I also tried the OpenCL capability reporter (ZIP source) and the GPU quicksort sample OpenCL programs (both provided by Intel), and they too worked perfectly, reporting a GPU with 24 cores, supporting OpenCL 2.0 Full Profile.

All was going fine until I fired up VirtualBox and tried to power up a VM. The VM failed to start and I was greeted with an error indicating failure of the vboxdrv module. Running sudo dpkg-reconfigure virtualbox, as suggested by online sources, also failed to rectify the issue, because compilation of the kernel module was failing for the newly installed 4.7.0 kernel. Evidently the installed source of my VirtualBox version (5.0.18) was somehow incompatible with the new kernel.

Problems continued to pile up. The wireless module (bcmwl-kernel-source) that I had installed earlier refused to work under the new kernel. The wireless interface was no longer working, and there was no way to connect to the workplace Wi-Fi network.

As I work frequently with Docker and K8s, I naturally got a feeling that I should be checking up on them as well. I discovered that Docker was failing to start, spitting out the error [graphdriver] prior storage driver "aufs" failed, which rendered the K8s stack pretty much useless (for dev testing I use a full stack of Docker and K8s, both master and minion, on my machine).
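A quick way to confirm whether the running kernel knows about aufs at all is to look at /proc/filesystems. The snippet below is only a minimal sketch: has_filesystem is a hypothetical helper name, and the check assumes aufs is either built in or already loaded (a module that exists on disk but has not been loaded yet would not be listed).

```shell
# Hypothetical helper: succeeds if the named filesystem appears in a
# /proc/filesystems-style listing (second argument defaults to /proc/filesystems).
has_filesystem() {
    fs_name="$1"
    listing="${2:-/proc/filesystems}"
    # Lines look like "nodev<TAB>aufs" or "<TAB>ext4"; compare the last field exactly.
    awk -v fs="$fs_name" '$NF == fs { found = 1 } END { exit !found }' "$listing"
}

if has_filesystem aufs; then
    echo "aufs is registered in kernel $(uname -r)"
else
    echo "aufs is not registered in kernel $(uname -r)"
fi
```

Running this under both the old and the new kernel should show the difference that Docker was complaining about.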

Luckily I had not purged my original kernel, and as a result, I could get the Wi-Fi and Docker issues resolved by restarting the system with the old kernel (selected manually via the "Advanced options for Ubuntu" entry on the GRUB menu). The VirtualBox issue persisted until I performed another sudo dpkg-reconfigure virtualbox, which recompiled the vboxdrv module for the older kernel (still failing on the newer one, but getting installed successfully for the older one).

Once I got myself up and running with the old kernel, I began investigating the issues one by one. The VirtualBox issue was easy, as a patch had already been provided by a generous member of the dev team soon after the 4.7.0 kernel release. I simply had to locate the vboxdrv kernel source location using the log generated during dpkg-reconfigure, apply the patch on top of it, and issue another dpkg-reconfigure to see the module getting compiled successfully for both installed kernels.
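To double-check that a module really got built for every installed kernel, one option is to scan the module directories directly. This is a rough sketch rather than what dpkg-reconfigure itself does: report_module is a hypothetical helper, and it assumes the usual layout where each kernel has its own directory under /lib/modules.

```shell
# Hypothetical helper: for each kernel directory under a modules root,
# report whether a module file with the given name exists anywhere inside it.
report_module() {
    module="$1"
    root="${2:-/lib/modules}"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue          # skip the unexpanded glob if root is empty
        version=$(basename "$dir")
        # Match vboxdrv.ko as well as compressed variants such as vboxdrv.ko.xz
        if find "$dir" -name "${module}.ko*" | grep -q .; then
            echo "$version: $module present"
        else
            echo "$version: $module missing"
        fi
    done
}

report_module vboxdrv
```

After a successful rebuild, every installed kernel version should be reported with the module present.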

The wireless issue was also fairly easy to solve, thanks to this AskUbuntu post. I just had to replace the bcmwl-kernel-source module with broadcom-sta-dkms 6.30.223.271-3, and things started working smoothly again (it even appeared that the wireless "stability" had increased, as some of the connectivity issues I had been consistently experiencing at certain locations seemed to have disappeared).

The Docker issue seemed to be a little bit trickier. While posts like this GitHub issue were placing the blame on missing linux-image-extra packages, I could not find such packages for my 4.7.0 kernel. After a fair amount of wandering around, I learned that separate linux-image-extra packages had not been released for 4.7.0.
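Checking for such a package can be scripted along these lines. It is a sketch under assumptions: extra_pkg_name is a hypothetical helper, the package name simply follows the linux-image-extra-<release> pattern, and the result only reflects the repositories configured on the machine.

```shell
# Hypothetical helper: build the linux-image-extra package name for a kernel
# release (defaults to the running kernel as reported by uname -r).
extra_pkg_name() {
    echo "linux-image-extra-${1:-$(uname -r)}"
}

pkg=$(extra_pkg_name)
if ! command -v apt-cache >/dev/null 2>&1; then
    echo "apt-cache not found; cannot look up $pkg"
elif apt-cache show "$pkg" >/dev/null 2>&1; then
    echo "$pkg is available from the configured repositories"
else
    echo "$pkg is not available from the configured repositories"
fi
```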

The word aufs in the Docker startup error led me to search further on aufs, which carried me over to this guide on unofficially adding it to the kernel as a patch. However, I was not very confident about the approach, as I would have to apply the patch on top of an already-patched source (with OpenCL-related changes). Anyway, I made up my mind in the end:

  • cloning aufs-util with --depth 1 and -b aufs4.1 (closest applicable version for kernel 4.7)
  • cloning aufs4-standalone with --depth 1 and -b aufs4.7
  • patching the 4.7 kernel source (which had already been patched for Intel OpenCL and VirtualBox), following instructions under method 1 of section 3 (Configuration and Compilation) of the guide
  • rebuilding and reinstalling the kernel
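The steps above can be sketched roughly as the script below. Treat it as an illustration, not the exact commands I ran: the repository URLs, the KERNEL_SRC path, the local directory names and the patch file names (taken from my reading of the aufs4-standalone README) are all assumptions, and the kernel rebuild itself is left out. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
# Sketch only: URLs, paths and patch file names below are assumptions.
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands, 0 = execute them
KERNEL_SRC="${KERNEL_SRC:-$HOME/linux-4.7}"  # hypothetical path to the already-patched kernel tree

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Shallow-clone the two repositories on the branches mentioned above.
fetch_sources() {
    run git clone --depth 1 -b aufs4.1 git://git.code.sf.net/p/aufs/aufs-util aufs-util
    run git clone --depth 1 -b aufs4.7 https://github.com/sfjro/aufs4-standalone.git aufs4-standalone
}

# Apply the aufs patches to the kernel tree, then copy the aufs sources in.
apply_patches() {
    for p in aufs4-kbuild.patch aufs4-base.patch aufs4-mmap.patch; do
        # -d switches into the kernel tree first, so the patch path must be absolute
        run patch -d "$KERNEL_SRC" -p1 -i "$PWD/aufs4-standalone/$p"
    done
    run cp -r aufs4-standalone/Documentation aufs4-standalone/fs "$KERNEL_SRC/"
    run cp aufs4-standalone/include/uapi/linux/aufs_type.h "$KERNEL_SRC/include/uapi/linux/"
}

fetch_sources
apply_patches
```

Since the tree had already been patched twice, adding --dry-run to the patch invocation first is a cheap way to see whether the hunks would apply cleanly before touching anything.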

And it worked!

Now I am happily using VirtualBox, Docker and OpenCL with GPU support, all on my own machine.
