My Ubuntu Server already had the NVIDIA graphics driver installed, and nvidia-smi reported a normal status. After I installed the CUDA driver, I ran nvidia-smi again to check, and this error appeared instead.

Terminal window
root@localhost:~# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

My first guess was that the system no longer recognized the graphics card, so I checked the PCI device list.

Terminal window
root@localhost:~# lspci | grep -i nvidia
0b:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)

The graphics card is still listed, so the hardware is detected and the problem lies with the driver. A likely cause is that the nvidia kernel module is missing for the running kernel, in which case dkms can compile and install it again.
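
One way to confirm that the kernel module is the missing piece is to look for it in the lsmod output. The following is a minimal sketch, not part of the original troubleshooting session; the function name nvidia_module_loaded is hypothetical, and it assumes the module is named exactly "nvidia".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): succeeds if lsmod-style
# text on stdin lists a module named exactly "nvidia" in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Usage: feed it the real lsmod output when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nvidia_module_loaded; then
        echo "nvidia module is loaded"
    else
        echo "nvidia module is NOT loaded for kernel $(uname -r)"
    fi
fi
```

If the module is reported as not loaded while lspci still shows the card, rebuilding the module for the running kernel is a reasonable next step.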

Dynamic Kernel Module Support (DKMS) is a program/framework that enables generating Linux kernel modules whose sources generally reside outside the kernel source tree. The concept is to have DKMS modules automatically rebuilt when a new kernel is installed.

Source: Wikipedia

Install dkms

Terminal window
root@localhost:~# apt-get install dkms

Check which NVIDIA driver version has its source under /usr/src

Terminal window
root@localhost:~# ls /usr/src | grep nvidia
nvidia-550.25.65
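
The version string can also be read from the directory name by a script instead of by eye. This is a small sketch under the assumption that the driver source lives in a directory named nvidia-VERSION, as in the listing above; the function name nvidia_src_versions is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the version part of every nvidia-* directory
# found in the given source directory (default: /usr/src), one per line.
nvidia_src_versions() {
    local src_dir="${1:-/usr/src}"
    local d
    for d in "$src_dir"/nvidia-*/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        d="${d%/}"                        # drop the trailing slash
        printf '%s\n' "${d##*/nvidia-}"   # keep only the text after "nvidia-"
    done
}

# Usage: nvidia_src_versions            (scans /usr/src)
#        nvidia_src_versions /some/dir  (scans another directory)
```

The printed value is what the -v option of dkms expects in the next step.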

Run dkms to compile and install the NVIDIA driver modules

Terminal window
root@localhost:~# dkms install -m nvidia -v 550.25.65
/bin/bash: /usr/local/anaconda/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Creating symlink /var/lib/dkms/nvidia/550.25.65/source -> /usr/src/nvidia-550.25.65
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area...
'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.15.0-131-generic modules.....................
cleaning build area...
nvidia.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.15.0-131-generic/updates/dkms/
nvidia-uvm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.15.0-131-generic/updates/dkms/
nvidia-modeset.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.15.0-131-generic/updates/dkms/
nvidia-drm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.15.0-131-generic/updates/dkms/
nvidia-peermem.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.15.0-131-generic/updates/dkms/
depmod....
root@localhost:~#
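
To confirm that the build was registered for the running kernel, the output of dkms status can be inspected. The sketch below is an optional check and was not part of the original session. It assumes each status line starts with the module name and contains the kernel version and the word "installed"; the exact line format differs between dkms versions, and the function name dkms_nvidia_installed_for is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if dkms-status-style text on stdin has a line
# that starts with "nvidia", mentions the given kernel, and says "installed".
dkms_nvidia_installed_for() {
    local kernel="$1"
    grep '^nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}

# Usage: check the running kernel when dkms is available.
if command -v dkms >/dev/null 2>&1; then
    if dkms status | dkms_nvidia_installed_for "$(uname -r)"; then
        echo "nvidia module is installed for $(uname -r)"
    else
        echo "nvidia module is not installed for $(uname -r)"
    fi
fi
```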

Run nvidia-smi again to check the driver

Terminal window
root@localhost:~# nvidia-smi
Thu Feb 20 15:11:42 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.25.65              Driver Version: 550.25.65      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:0B:00.0 Off |                    0 |
| N/A   56C    P0             26W /   70W |       1MiB /  15360MiB |      9%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

The output is back to normal, so the driver is working again. Perfect!