Configuring the NVIDIA Graphics Driver, CUDA, and cuDNN on Linux

Preface

This guide deploys a GPU compute server for private AI model training, using Ubuntu as the example system.

Name	Version	Arch
Ubuntu	22.04	x86_64
NVIDIA Driver	570.124.06	x86_64
CUDA	11.8.0 (runfile bundles driver 520.61.05)	x86_64
cuDNN	9.8.0.87 (CUDA 11 build)	x86_64

⚠️ Note: Before configuring the server, check that these versions are compatible with one another; mismatched versions can cause a variety of errors when the training environment is deployed.
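The compatibility check can be partly scripted. The following is a minimal sketch, assuming the driver version bundled in the CUDA 11.8.0 runfile name (520.61.05) is treated as the minimum acceptable driver; the function names version_ge and check_driver are hypothetical helpers, not part of any NVIDIA tool.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= B,
# comparing each dot-separated field numerically.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Assumption: the driver bundled with the CUDA 11.8.0 runfile is the minimum.
MIN_DRIVER_FOR_CUDA_11_8="520.61.05"

# check_driver VERSION: report whether VERSION meets the assumed minimum.
check_driver() {
    if version_ge "$1" "$MIN_DRIVER_FOR_CUDA_11_8"; then
        echo "driver $1 is new enough for CUDA 11.8"
    else
        echo "driver $1 is older than $MIN_DRIVER_FOR_CUDA_11_8"
        return 1
    fi
}
```

For the versions in the table, check_driver 570.124.06 reports that the driver is new enough. The official release notes remain the authoritative source for compatibility.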

Download pages: NVIDIA graphics card driver download, CUDA version download list, cuDNN library version download list

Prepare Ubuntu to install NVIDIA graphics card environment

2.1 Install system-based dependencies

koevn@localhost:~$ sudo apt install -y build-essential dracut-core linux-headers-$(uname -r)

2.2 Check if Linux recognizes the NVIDIA graphics card

koevn@localhost:~$ sudo lspci | grep -i nvidia
03:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)

2.3 Check if Linux Nouveau is disabled

koevn@localhost:~$ sudo lsmod | grep nouveau
nouveau              2306048  0
mxm_wmi                16384  1 nouveau
i2c_algo_bit           16384  1 nouveau
drm_ttm_helper         16384  1 nouveau
ttm                    86016  3 vmwgfx,drm_ttm_helper,nouveau
drm_kms_helper        311296  2 vmwgfx,nouveau
video                  65536  1 nouveau
wmi                    32768  2 mxm_wmi,nouveau
drm                   622592  7 vmwgfx,drm_kms_helper,drm_ttm_helper,ttm,nouveau

If output like the above appears, the nouveau module is loaded. Disable it as follows:

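# tee runs under sudo, so the file is written as root; a plain > redirect would run as the unprivileged user
koevn@localhost:~$ sudo tee /etc/modprobe.d/blacklist-nouveau.conf > /dev/null << EOF
blacklist nouveau
options nouveau modeset=0
EOF
koevn@localhost:~$ sudo dracut --force
koevn@localhost:~$ sudo reboot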

Nouveau has to be disabled because this guide installs NVIDIA's official closed-source driver, whereas nouveau is the open-source driver that Linux loads by default. If both are present they conflict, which leads to hard-to-diagnose problems. The initramfs is rebuilt so that the blacklist also applies during early boot; on an Ubuntu system that uses initramfs-tools rather than dracut, sudo update-initramfs -u does the same job.

After the system restarts, run sudo lsmod | grep nouveau again. If it prints nothing, nouveau has been disabled successfully.
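The post-reboot check can also be wrapped in a small script so it yields a clear message and exit status. This is a sketch only; nouveau_loaded and report_nouveau are hypothetical helper names, and both read lsmod output from standard input so they can be tried without a GPU.

```shell
#!/bin/sh
# nouveau_loaded: read `lsmod` output on stdin and succeed only if a module
# named exactly "nouveau" appears in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# report_nouveau: print a human-readable verdict; returns non-zero while
# nouveau is still loaded.
report_nouveau() {
    if nouveau_loaded; then
        echo "nouveau is still loaded"
        return 1
    else
        echo "nouveau is not loaded"
    fi
}

# Typical use on the server:
#   lsmod | report_nouveau
```

Matching on the first column avoids false positives from modules that merely list nouveau as a dependent.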

Install NVIDIA Driver

Upload the downloaded NVIDIA driver package to the Linux server, then install it:

koevn@localhost:~$ cd /tmp
koevn@localhost:/tmp$ sudo chmod +x NVIDIA-Linux-x86_64-570.124.06.run
koevn@localhost:/tmp$ sudo ./NVIDIA-Linux-x86_64-570.124.06.run -no-opengl-files -no-nouveau-check

-no-opengl-files: Do not install the OpenGL libraries provided by NVIDIA, because this system has no GUI.

-no-nouveau-check: Skip the nouveau check.

Verify that the NVIDIA driver is installed successfully

koevn@localhost:~$ sudo nvidia-smi
Tue Apr  8 16:12:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06             Driver Version: 570.124.06     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       Off |   00000000:03:00.0 Off |                    0 |
| N/A   50C    P0             25W /   70W |       1MiB /  15360MiB |      9%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
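For scripted checks, the table output is awkward to parse. nvidia-smi also offers a CSV query mode (--query-gpu with --format=csv,noheader), and the sketch below formats that output into one line per GPU. The helper name gpu_summary is hypothetical, and the function reads from standard input so it can be exercised with sample text.

```shell
#!/bin/sh
# gpu_summary: read "name, driver_version, memory.total" CSV lines on stdin
# (fields separated by ", ") and print one readable line per GPU.
gpu_summary() {
    awk -F ', ' '{ printf "GPU %d: %s (driver %s, %s)\n", NR - 1, $1, $2, $3 }'
}

# Typical use on the server:
#   nvidia-smi --query-gpu=name,driver_version,memory.total \
#              --format=csv,noheader | gpu_summary
```

The GPU index printed here is simply the line number minus one, which is an assumption that nvidia-smi lists devices in index order.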

Install CUDA

From the CUDA version download list, select the operating system, architecture, and version, choose the runfile (local) installer type, then download the installation package to the Linux server (or upload it) and install it.

koevn@localhost:/tmp$ wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
koevn@localhost:/tmp$ sudo chmod +x cuda_11.8.0_520.61.05_linux.run
koevn@localhost:/tmp$ sudo ./cuda_11.8.0_520.61.05_linux.run --no-opengl-libs --toolkit

CUDA Installation Steps

⚠️ Note: The NVIDIA graphics driver is already installed, so at this step press the space bar to deselect the driver component, then select Install.

Once the installation completes, configure the system environment variables as the installer prompts:

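# Quoting 'EOF' writes $PATH and $LD_LIBRARY_PATH literally so they expand at login; tee performs the write as root
koevn@localhost:~$ sudo tee /etc/profile.d/cuda.sh > /dev/null << 'EOF'
export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
EOF
koevn@localhost:~$ source /etc/profile.d/cuda.sh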
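To confirm that a login shell actually picked up these variables, a check along the following lines may help. It is a sketch under the assumption that the toolkit lives in /usr/local/cuda-11.8 as in this guide; path_contains and check_cuda_env are hypothetical helper names.

```shell
#!/bin/sh
# path_contains DIR LIST: succeed if DIR is exactly one of the
# colon-separated entries in LIST.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_cuda_env PREFIX PATH_VALUE LD_VALUE: verify that PREFIX/bin and
# PREFIX/lib64 appear in the given PATH and LD_LIBRARY_PATH values.
check_cuda_env() {
    path_contains "$1/bin" "$2" || { echo "missing $1/bin in PATH"; return 1; }
    path_contains "$1/lib64" "$3" || { echo "missing $1/lib64 in LD_LIBRARY_PATH"; return 1; }
    echo "CUDA environment looks configured"
}

# Typical use in a fresh login shell:
#   check_cuda_env /usr/local/cuda-11.8 "$PATH" "$LD_LIBRARY_PATH"
```

Passing the variable values as arguments keeps the functions free of side effects and easy to try with sample strings.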

Verify that CUDA is installed successfully. Run nvcc as the regular user: on Ubuntu, sudo resets PATH by default, so the cuda-11.8 entry configured in /etc/profile.d/cuda.sh would not be visible to it.

koevn@localhost:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
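If the toolkit version needs to be checked from a script, the release number can be pulled out of this output. The sketch below assumes the "release X.Y," wording shown above; nvcc_release and check_nvcc are hypothetical helper names that read the nvcc output from standard input.

```shell
#!/bin/sh
# nvcc_release: read `nvcc -V` output on stdin and print the release
# number (for example 11.8) taken from the "release X.Y," line.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# check_nvcc EXPECTED: compare the release read from stdin with EXPECTED.
check_nvcc() {
    found=$(nvcc_release)
    if [ "$found" = "$1" ]; then
        echo "CUDA toolkit release $found matches"
    else
        echo "expected release $1 but found '$found'"
        return 1
    fi
}

# Typical use on the server:
#   nvcc -V | check_nvcc 11.8
```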

Add CUDNN

Download the cuDNN archive that matches the installed CUDA version, upload it to the Linux server, and run the following:

koevn@localhost:/tmp$ tar -xvf cudnn-linux-x86_64-9.8.0.87_cuda11-archive.tar.xz
koevn@localhost:/tmp$ mv cudnn-linux-x86_64-9.8.0.87_cuda11-archive cudnn
koevn@localhost:/tmp$ cd cudnn
# -P copies the archive's symlinks as symlinks instead of duplicating the libraries
koevn@localhost:/tmp/cudnn$ sudo cp -P lib/* /usr/local/cuda-11.8/lib64/
koevn@localhost:/tmp/cudnn$ sudo cp -P include/* /usr/local/cuda-11.8/include/
koevn@localhost:/tmp/cudnn$ sudo chmod a+r /usr/local/cuda-11.8/lib64/*
koevn@localhost:/tmp/cudnn$ sudo chmod a+r /usr/local/cuda-11.8/include/*

Verify CUDNN version

koevn@localhost:/tmp/cudnn$ grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn_version.h
#define CUDNN_MAJOR 9
#define CUDNN_MINOR 8
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 10000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

/* cannot use constexpr here since this is a C-only file */
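The CUDNN_VERSION macro shown in the header combines the three components into a single integer. As an illustration, the sketch below reads the header from standard input and prints both the dotted version and the combined value using the same formula; cudnn_version is a hypothetical helper name.

```shell
#!/bin/sh
# cudnn_version: read cudnn_version.h on stdin, pick up the three #define
# lines, and print "MAJOR.MINOR.PATCH COMBINED", where COMBINED follows the
# header's formula: MAJOR * 10000 + MINOR * 100 + PATCHLEVEL.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            printf "%d.%d.%d %d\n", major, minor, patch,
                   major * 10000 + minor * 100 + patch
        }
    '
}

# Typical use on the server:
#   cudnn_version < /usr/local/cuda/include/cudnn_version.h
```

For the header shown above (9, 8, 0), the combined value works out to 9 * 10000 + 8 * 100 + 0 = 90800.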

That’s it!

https://huoshen.pages.dev/p/e7d6744d/

Author: Koevn

Published at: April 8, 2025

License: CC BY-NC-SA 4.0