Kernel module compatibility has long been an obstacle to modding the kernel through loadable kernel modules. One typical example of kernel modding is overclocking. The mod I am concerned with at the moment is support for the features of NoDock.
Previous research into this issue found that the vermagic string of a ko can be patched to match the target kernel, so that it passes the vermagic check in the init_module syscall. In addition, CONFIG_MODVERSIONS was set in the source kernel so that the ko includes the CRC of each imported symbol and passes the modversion check of most kernels of the same 2.x.y version (e.g. 2.6.32).
For kernels not built with CONFIG_MODVERSIONS, simply patching the vermagic of the ko lets it pass the check during insmod. There are two pitfalls. If the struct module of the source kernel is smaller than that of the target kernel, memory corruption may cause a kernel panic. And if the .init or .exit offset is shifted, the initialization and cleanup routines of the module are not visible to the kernel and will not work properly.
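The vermagic patch can be sketched as a plain byte-level edit. This is only a minimal sketch: the helper names read_vermagic and patch_vermagic are made up for illustration, and it assumes the first "vermagic=" in the file is the .modinfo entry and that the replacement is not longer than the original (the section cannot grow without rewriting the ELF headers). Padding the leftover bytes with NULs relies on the loader skipping NUL padding between .modinfo entries, which I believe it does but have not checked on every kernel.

```python
KEY = b"vermagic="


def read_vermagic(ko: bytes) -> bytes:
    """Return the vermagic value stored in the module image (key excluded)."""
    start = ko.index(KEY) + len(KEY)   # ValueError if the key is absent
    end = ko.index(b"\0", start)       # .modinfo entries are NUL-terminated
    return ko[start:end]


def patch_vermagic(ko: bytes, target: bytes) -> bytes:
    """Overwrite the vermagic value in place, padding the slot with NULs."""
    start = ko.index(KEY) + len(KEY)
    end = ko.index(b"\0", start)
    room = end - start
    if len(target) > room:
        # Assumption of this sketch: we never grow the section.
        raise ValueError("target vermagic is longer than the original slot")
    patched = bytearray(ko)
    patched[start:end] = target.ljust(room, b"\0")
    return bytes(patched)
```

A typical use would be to call read_vermagic on a module that already loads on the target and feed the result to patch_vermagic for the ko built elsewhere.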
When a kernel has been built with CONFIG_MODVERSIONS, a CRC mismatch on any imported symbol causes the module to fail to load. This prevents kernel panics but also reduces the portability of the ko. For instance, the Droid X2 has a different module_layout than the Droid in the 2.6.32.* kernel. This defeats the workaround described above and adds one more dimension of variation, which raises the cost of the ko build process.
It is desirable to reduce the build complexity from O(n) to O(1), so that only one ko ever needs to be built for all n different kernels. The module compilation process was therefore reviewed, and it turned out that changes in the struct module can be handled by simple patching.
There is a section called .gnu.linkonce.this_module in the ko that stores the struct module. If this section is smaller than the struct module of the target kernel, it may cause out-of-bounds access (and possibly a kernel panic). Moreover, the relocation table of this section has two entries, for .init and .exit, which are invoked during module initialization and cleanup. If these offsets differ between the source and target kernels, .init and .exit are not visible to the kernel and the module is not initialized properly.
A workaround is to take a healthy ko, inspect this information, and patch it into the ko that is to be loaded.
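To show what "inspect this information" can look like, here is a minimal sketch that reads the size of .gnu.linkonce.this_module and the offsets named in its relocation table. It is not the tool I used. It assumes a 32-bit little-endian ELF whose relocation section is named .rel.gnu.linkonce.this_module (as on ARM; other architectures may use a .rela variant), and the function names are invented for this example.

```python
import struct

SH_OFFSET, SH_SIZE = 4, 5  # indexes into an unpacked Elf32_Shdr


def _cstr(buf: bytes, off: int) -> str:
    """Read a NUL-terminated string starting at off."""
    return buf[off:buf.index(b"\0", off)].decode()


def sections(ko: bytes) -> dict:
    """Map section name -> unpacked Elf32_Shdr (ten little-endian words)."""
    if ko[:6] != b"\x7fELF\x01\x01":
        raise ValueError("expected a 32-bit little-endian ELF file")
    (shoff,) = struct.unpack_from("<I", ko, 32)  # e_shoff
    shentsize, shnum, shstrndx = struct.unpack_from("<HHH", ko, 46)
    hdrs = [struct.unpack_from("<10I", ko, shoff + i * shentsize)
            for i in range(shnum)]
    names = hdrs[shstrndx][SH_OFFSET]  # file offset of the section name table
    return {_cstr(ko, names + h[0]): h for h in hdrs}


def this_module_layout(ko: bytes):
    """Return (section size, {relocated symbol name: offset in struct module})."""
    secs = sections(ko)
    size = secs[".gnu.linkonce.this_module"][SH_SIZE]
    rel = secs[".rel.gnu.linkonce.this_module"]
    symtab, strtab = secs[".symtab"], secs[".strtab"]
    offsets = {}
    for i in range(rel[SH_SIZE] // 8):  # each Elf32_Rel is 8 bytes
        r_offset, r_info = struct.unpack_from("<II", ko, rel[SH_OFFSET] + 8 * i)
        sym_index = r_info >> 8         # ELF32_R_SYM
        (st_name,) = struct.unpack_from("<I", ko, symtab[SH_OFFSET] + 16 * sym_index)
        offsets[_cstr(ko, strtab[SH_OFFSET] + st_name)] = r_offset
    return size, offsets
```

Running this_module_layout on the healthy ko and on the ko to be loaded, then comparing the two results, shows whether the section is too small and whether the init and exit offsets have moved.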
After a discussion in #[mbm] at irc.freenode.net, [mbm] offered some insights:
1. There is no need to patch the kernel's check_version; simply patch the module_layout CRC in the ko instead;
2. The ko can be patched in an mmap'ed copy that is then passed to init_module, which saves some write cycles.
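The CRC patch from the first point can be sketched as follows. The sketch works on the raw bytes of the __versions section only (locating that section in the ELF file is left out), and it assumes the layout of struct modversion_info in kernels of this era: an unsigned long CRC followed by the symbol name, 64 bytes per entry. The function name patch_crc and its parameters are hypothetical.

```python
import struct

ENTRY_SIZE = 64  # sizeof(struct modversion_info): CRC word + name buffer


def patch_crc(versions: bytearray, symbol: str, crc: int, word: int = 4) -> bool:
    """Overwrite the CRC of `symbol` inside a __versions section image.

    `word` is sizeof(unsigned long) on the target: 4 on 32-bit, 8 on 64-bit.
    Returns True if the symbol was found and patched.
    """
    fmt = "<I" if word == 4 else "<Q"  # little-endian is assumed here
    wanted = symbol.encode()
    for off in range(0, len(versions) - ENTRY_SIZE + 1, ENTRY_SIZE):
        raw_name = bytes(versions[off + word:off + ENTRY_SIZE])
        if raw_name.split(b"\0", 1)[0] == wanted:
            struct.pack_into(fmt, versions, off, crc)
            return True
    return False
```

The CRC to write for module_layout would be taken from the same entry of a ko that already loads on the target kernel.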
The result is satisfactory: kernel module loading reaches high compatibility between different kernel builds, provided that these rules are strictly obeyed:
All symbols in the dynamic relocation table must be resolvable in the target kernel.
Direct access to kernel structure members should be restricted, because structure layouts change with the ifdef config options. The worst outcome is a panic or structure corruption.
Postpone function lookup to runtime to allow graceful degradation.
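The first rule can be pre-checked on the device before insmod is attempted. The sketch below parses the text of /proc/kallsyms and reports which imported names are absent. It is only a coarse check under stated assumptions: a name being listed in kallsyms does not prove it is exported to modules, and on some kernels the addresses are printed as zeros for unprivileged readers. The function names are invented for illustration.

```python
def parse_kallsyms(text: str) -> dict:
    """Parse /proc/kallsyms content into {symbol name: address}."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue                      # skip blank or malformed lines
        addr, _kind, name = parts[0], parts[1], parts[2]
        table.setdefault(name, int(addr, 16))  # keep the first occurrence
    return table


def unresolved(imports, table) -> list:
    """Return the imported names that do not appear in the kallsyms table."""
    return sorted(name for name in imports if name not in table)
```

On a device this would be fed with the content of /proc/kallsyms and the list of undefined symbols of the ko; an empty result means every import is at least known to the running kernel.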
It has been a long time since this idea was proven to work. Compared to the initial method, the latest method has the following points to note:
/proc/kmem is no longer needed; on the other hand, /proc/kallsyms, which seems to be more widely available, is needed.
Since the struct module can vary between builds of the same kernel version because of different configs, an existing loadable kernel module on the target system is needed to handle this. From it we can fix the init and exit offsets and copy the vermagic string. Copying is necessary because some custom kernels do not print the correct vermagic string when insmod fails on a vermagic mismatch.
modpost.c is patched to enlarge the struct module, leaving buffer room for different configs.
A loader binary first loads a very small kernel module that has just one objective: hack the check_version kernel function. This is very likely to work because the small module imports no kernel function. Afterward, the CRC check of the real kernel module passes. Finally, the small kernel module is removed to undo the effect.
The latest state of this method is that a kernel module fails to load on Linux kernel 3.0.x because of something related to module parameters. It is possible that module parameter handling changed between 2.6.x and 3.x.