Installation

NVIDIA GPU Driver

Open a terminal and run the following commands to install the NVIDIA GPU driver.

sudo apt install nvidia-utils-535
sudo apt install nvidia-driver-535
warning

If you get an error such as Unable to locate package nvidia-driver-535, the apt package index may be out of date. Run sudo apt update to refresh it, then retry the installation.

Now run sudo reboot to reboot the host. After rebooting, run the nvidia-smi command. The output should show information about the NVIDIA GPU.

NVIDIA SMI
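
The post-reboot check can also be scripted. The following is a minimal sketch, not part of the official procedure: check_nvidia_driver is a hypothetical helper name, and it only relies on the standard nvidia-smi query options.

```shell
# Hypothetical helper: report whether the NVIDIA driver is usable.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is not installed or PATH is wrong" >&2
        return 1
    fi
    # Print one line per GPU with its name and driver version.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

Run check_nvidia_driver after the reboot. A non-zero exit status means the driver installation should be repeated.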

Install NVIDIA Toolkit (CUDA)

In the terminal, execute the following commands to install CUDA, the NVIDIA toolkit.

wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-3
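
After installation, the toolkit binaries are usually not on PATH yet. The sketch below assumes the toolkit was installed under /usr/local/cuda-12.3, which is the usual prefix for this package but may differ on your host. use_cuda is a hypothetical helper name.

```shell
# Hypothetical helper: put a CUDA toolkit's bin directory on PATH and print the nvcc version.
# The default prefix /usr/local/cuda-12.3 is an assumption; pass another prefix as $1 if needed.
use_cuda() {
    cuda_home=${1:-/usr/local/cuda-12.3}
    if [ ! -x "$cuda_home/bin/nvcc" ]; then
        echo "nvcc not found under $cuda_home/bin" >&2
        return 1
    fi
    case ":$PATH:" in
        *":$cuda_home/bin:"*) ;;                          # already on PATH
        *) PATH="$cuda_home/bin:$PATH"; export PATH ;;    # prepend for this shell session
    esac
    nvcc --version
}
```

Calling use_cuda in a shell only affects that session. To keep the setting, add the equivalent export line to your shell profile.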

Disk Setup (LVM Setting)

To configure the AI SSDs that AIDE uses when running fine-tuning tasks, follow these steps:

  1. Install LVM

    sudo apt update
    sudo apt install lvm2 xfsprogs
  2. Check Disk Locations

    lshw -class disk -class storage | grep -E 'ai100|logical name|version: EIFZ'
    lsblk | grep nvme
    info

    The commands in the following steps assume that the ai100 device identifiers are nvme1n1 and nvme2n1. If yours differ, update the device paths accordingly.

  3. Clear Disks (Just In Case)

    sudo wipefs -a /dev/nvme1n1 /dev/nvme2n1
  4. Create LVM

    sudo pvcreate /dev/nvme1n1 /dev/nvme2n1
    sudo vgcreate ai /dev/nvme1n1 /dev/nvme2n1
    sudo lvcreate --type striped -i 2 -I 128k -l 100%FREE -n ai ai
  5. Mount LVM

    • Format the disk
      sudo mkfs.xfs -f -s size=4k -m crc=0 /dev/ai/ai
    • Mount the disk
      sudo mkdir -p /mnt/nvme0
      sudo mount /dev/ai/ai /mnt/nvme0
      sudo chown -R $USER:$USER /mnt/nvme0
  6. Make Mount Persistent

    echo '/dev/ai/ai /mnt/nvme0 xfs defaults,nofail 0 0' | sudo tee -a /etc/fstab
    info

    To remove the permanent mount setting, run: sudo sed -i '/\/dev\/ai\/ai/d' /etc/fstab

  7. Successful Example

    If the LVM setup succeeded, running lsblk shows a configuration like the following.

    LVM Success

    To dissolve the LVM setup, run the following commands:

    sudo umount /mnt/nvme0
    sudo lvremove -y ai
    sudo pvremove -y /dev/nvme1n1 /dev/nvme2n1 --force --force
  8. Swap File Setting

    Enable swap space to supplement DRAM, which lets you increase batch sizes.

    • Create the swap file (256k blocks of 1 MiB, that is, 256 GiB)
      sudo dd if=/dev/zero of=/mnt/nvme0/swapfile bs=1M count=256k
    • Modify permission
      sudo chmod 0600 /mnt/nvme0/swapfile
    • Initialize swap file
      sudo mkswap /mnt/nvme0/swapfile
    • Enable the swap
      sudo swapon /mnt/nvme0/swapfile
    • Make the swap permanent
      echo '/mnt/nvme0/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

    To remove the swap, follow the steps below in order to prevent unexpected system issues.

    sudo swapoff /mnt/nvme0/swapfile
    sudo sed -i '/\/mnt\/nvme0\/swapfile/d' /etc/fstab
    sudo rm /mnt/nvme0/swapfile
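
Appending to /etc/fstab with tee -a adds a duplicate line each time the step is repeated. The following is a minimal sketch of adding and removing entries idempotently. fstab_add and fstab_remove are hypothetical helper names, and both take the file path as an argument so they can be tried on a copy first.

```shell
# Hypothetical helpers for editing an fstab-style file without creating duplicates.

# usage: fstab_add FILE 'entry line'
fstab_add() {
    file=$1; entry=$2
    # Append only if the exact line is not already present (-x whole line, -F fixed string).
    grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}

# usage: fstab_remove FILE SPEC   (SPEC is the first field, e.g. /dev/ai/ai)
fstab_remove() {
    file=$1; spec=$2
    tmp=$(mktemp) || return 1
    # Keep every line whose first field differs from SPEC, then write the result back.
    awk -v spec="$spec" '$1 != spec' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, copy /etc/fstab to a scratch file, apply fstab_add to the copy, review it, and then install it with sudo cp. Matching on the first field means that removing /dev/ai/ai leaves the /mnt/nvme0/swapfile entry untouched.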

Install Docker

  • Run the following command to uninstall all conflicting packages:

    for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done

    apt-get might report that you have none of these packages installed.

  • Set up Docker's apt repository.

    # Add Docker's official GPG key:
    sudo apt-get update
    sudo apt-get install ca-certificates curl
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc

    # Add the repository to Apt sources:
    echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
    $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    sudo apt-get update
  • Install the Docker packages.

    sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  • Add user to docker group.

    sudo usermod -aG docker ACCOUNT

    Here ACCOUNT is the account you log in with. Afterwards, log out and log back in so that your session picks up the docker group membership.

  • Verify that the installation is successful.

    docker run hello-world

    This command downloads a test image and runs it in a container. When the container runs, it prints a confirmation message and exits.
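
If docker run fails with a permission error, the group change has usually not reached the current session yet. The sketch below distinguishes the two cases. The helper names are hypothetical, and it relies on id -nG reporting the current session's groups when called without a user name.

```shell
# Hypothetical helpers for checking docker group membership.

# Succeeds if the current login session already carries GROUP.
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

# Succeeds if the group database lists ACCOUNT as a member of GROUP.
account_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

if session_in_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group is not active in this session: run usermod if needed, then log in again"
fi
```

When account_in_group succeeds but session_in_group does not, the usermod step worked and only the re-login is missing.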

Install NVIDIA Container Toolkit

  • Configure the production repository.

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  • Update the packages list from the repository.

    sudo apt-get update
  • Install the NVIDIA Container Toolkit packages.

    sudo apt-get install -y nvidia-container-toolkit
  • Configure the container runtime by using the nvidia-ctk command.

    sudo nvidia-ctk runtime configure --runtime=docker
  • Restart the Docker daemon.

    sudo systemctl restart docker
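
To confirm the configuration took effect, you can inspect the Docker daemon configuration and run a GPU workload in a container. This is a sketch under assumptions: it expects nvidia-ctk to have written the runtime entry to /etc/docker/daemon.json, and the helper names are hypothetical.

```shell
# Hypothetical helper: succeed if a Docker daemon config file mentions the nvidia runtime.
# Defaults to /etc/docker/daemon.json (assumed location of the nvidia-ctk changes).
has_nvidia_runtime() {
    grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}" 2>/dev/null
}

# Hypothetical helper: run nvidia-smi inside a container (pulls the ubuntu image).
# Defined here but not called automatically.
gpu_smoke_test() {
    docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
}
```

Run has_nvidia_runtime first. If it succeeds, gpu_smoke_test should print the same GPU table that nvidia-smi shows on the host.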

Install GenAI Studio

GenAI Studio provides an installer so that users can install it with ease. Normally, all you need to do is download the installer and then run it.

info

The GenAI Studio installation file is approximately 30GB. To ensure smooth system installation, we recommend having at least 100GB of free disk space.

Please contact your technical contact to obtain the installation file, which is named in the format GenAI-Studio_<VERSION>_setup.run. If you did not download the installer on the target host, move it there. Finally, run the installer and answer the questions it asks during the process.

success

Check the permissions of the installer file you downloaded. If it does not have execute permission, add it with the chmod 0755 INSTALLER_FILE command.
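
The disk space and permission checks can be sketched as two small shell helpers. The helper names are hypothetical, and INSTALLER_FILE stands for the path of the file you received.

```shell
# Hypothetical helper: print the free space, in whole GiB, of the filesystem holding DIR.
free_gb() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Hypothetical helper: add execute permission only when it is missing.
ensure_executable() {
    [ -x "$1" ] || chmod 0755 "$1"
}
```

For example, confirm that free_gb "$HOME" prints at least 100, then run ensure_executable INSTALLER_FILE before starting the installer.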

Start GenAI Studio

If everything goes well, GenAI Studio is installed under the $HOME/Advantech/GenAI-Studio directory. Change to ~/Advantech/GenAI-Studio/bin and run ./app-up. After a few seconds, open a browser and visit the target host on port 3001.

info

Before the v1.1.0 release, the installation path was $HOME/GenAI-Studio.
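
Instead of guessing how long startup takes, you can poll the port until it answers. The sketch below assumes the service speaks plain HTTP on port 3001 and returns a non-error status on its root URL. wait_for_http is a hypothetical helper name.

```shell
# Hypothetical helper: poll URL once per second until it responds or TIMEOUT seconds pass.
# usage: wait_for_http URL [TIMEOUT]
wait_for_http() {
    url=$1; timeout=${2:-60}; waited=0
    # curl -f treats HTTP error statuses as failures; --max-time bounds each attempt.
    until curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; do
        waited=$((waited + 1))
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
    done
}
```

For example, after ./app-up, run wait_for_http http://localhost:3001 60 and open the browser once it returns successfully.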