In this scenario, we need to upgrade the NVIDIA VIB. When updating the NVIDIA VIB from driver version 450.191 to version 450.236.03 on ESXi 7.0 Update 3u.x, I ran into this error message: “vmkload_mod: Can not remove module nvidia: module symbols in use”
To work around this error, I needed to execute an additional step that was missing from the documentation VMware provided. I found this step on Tony Harmelink's blog, thanks for that!
Note: The solution to bypass the error was to stop the nvidia-init service by running the command: “/etc/init.d/nvidia-init stop”
This is a short guide for installing an NVIDIA driver update on ESXi 7.0.
Uninstall and install NVIDIA Driver
Before updating the NVIDIA vib, put the host in Maintenance Mode.
Note: Make sure the other hosts in the cluster have enough GPU capacity available to take over all the VMs from this host.
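If you prefer the command line over the vSphere Client, Maintenance Mode can also be requested from the ESXi shell. The sketch below is only an illustration: the function name enter_maintenance_mode and the ESXCLI variable are my own, and it assumes running VMs have already been migrated or powered off, otherwise the request will likely not complete.

```shell
# ESXCLI is a hypothetical override hook; on a real host it is simply `esxcli`.
ESXCLI="${ESXCLI:-esxcli}"

enter_maintenance_mode() {
    # Request Maintenance Mode, then print the resulting state.
    $ESXCLI system maintenanceMode set --enable true &&
    $ESXCLI system maintenanceMode get
}
```

Call enter_maintenance_mode in an SSH session on the host and check the reported state before touching the driver.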
Once the host is in Maintenance Mode, SSH into the ESXi host. You can use PuTTY for this.
- Check the driver version
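The step above does not name a command, so the following is an assumption on my part: the installed VIB list carries the driver version in its second column, and nvidia-smi (shipped with the NVIDIA host driver) reports the version of the driver that is currently loaded. The helper name nvidia_vib_versions is hypothetical.

```shell
# Print "name version" for every installed VIB whose line mentions NVIDIA.
# Reads `esxcli software vib list` output on stdin.
nvidia_vib_versions() {
    grep -i nvidia | awk '{print $1, $2}'
}

# On the host (assumed usage):
#   esxcli software vib list | nvidia_vib_versions
#   nvidia-smi
```

Note down the VIB name while you are here; you need it again for the remove step.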
Uninstall the old VIB and install the new one using the following steps:
- Stop the xorg service
- /etc/init.d/xorg stop
- Stop the nvidia-init service by running the command
- /etc/init.d/nvidia-init stop
- Remove the NVIDIA VMkernel driver by running the command
- vmkload_mod -u nvidia
- Identify the NVIDIA VIB name by running this command
- esxcli software vib list | grep NVIDIA
- Remove the VIB by running the command
- esxcli software vib remove -n NAMEOFNVIDIAVIB
- Open WinSCP and connect to the ESXi host
- Upload the package to /tmp
- Install the NVIDIA vib
- esxcli software vib install -v /path_to_vib/nvidia_vib
- Start the nvidia-init service by running the command
- /etc/init.d/nvidia-init start
- Confirm driver is updated, running, and seeing the GPUs by running the command
Output will show the driver version and the number of GPUs present.
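For reference, the steps above can be collected into one small script. This is a rough sketch, not a tested tool: run, update_nvidia_vib and DRY_RUN are names I made up, the two arguments are placeholders for your own VIB name and file path, and the new VIB is assumed to be uploaded already. With DRY_RUN=1 (the default here) it only prints the commands, and it stops at the first step that fails.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command without executing it; set DRY_RUN=0 to run for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

update_nvidia_vib() {
    old_vib="$1"    # VIB name as shown by: esxcli software vib list | grep NVIDIA
    new_vib="$2"    # full path to the uploaded VIB file, e.g. somewhere under /tmp
    run /etc/init.d/xorg stop &&
    run /etc/init.d/nvidia-init stop &&
    run vmkload_mod -u nvidia &&
    run esxcli software vib remove -n "$old_vib" &&
    run esxcli software vib install -v "$new_vib" &&
    run /etc/init.d/nvidia-init start
}
```

Example: update_nvidia_vib NAMEOFNVIDIAVIB /path_to_vib/nvidia_vib prints the six commands in order. The final check is left out on purpose; I assume it is nvidia-smi, since that prints the driver version and the detected GPUs.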
That’s all folks!
Please get in touch with me or leave a comment, if you have any questions or want more information on this or other topics.