# ix-device-plugin **Repository Path**: deep-spark/ix-device-plugin ## Basic Information - **Project Name**: ix-device-plugin - **Description**: IX device plugin是针对天数智芯GPGPU开发的Kubernetes集群设备扩展插件,通过DaemonSet的方式部署到集群,使能Kubernetes集群调度天数智芯GPGPU资源。 - **Primary Language**: Go - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 11 - **Forks**: 7 - **Created**: 2024-03-29 - **Last Updated**: 2025-11-27 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # IX device plugin for Kubernetes ## Table of Contents - [About](#about) - [Prerequisites](#prerequisites) - [Building the IX device plugin](#building-the-ix-device-plugin) - [Configuring the IX device plugin](#configuring-the-ix-device-plugin) - [Enabling GPU Support in Kubernetes](#enabling-gpu-support-in-kubernetes) - [Running GPU Jobs](#running-gpu-jobs) - [Split GPU Board to Multiple GPU Devices](#split-gpu-board-to-multiple-gpu-devices) - [Shared Access to GPUs](#shared-access-to-gpus) ## About The IX device plugin for Kubernetes is a Daemonset that allows you to automatically: - Expose the number of GPUs on each nodes of your cluster - Keep track of the health of your GPUs - Run GPU enabled containers in your Kubernetes cluster. ## Prerequisites The list of prerequisites for running the IX device plugin is described below: * Iluvatar driver and software stack >= v1.1.0 * Kubernetes version >= 1.10 ## Building the IX device plugin ```shell make all ``` This will build the ix-device-plugin binary and ix-device-plugin image, see logging for more details. ## Configuring the IX device plugin The IX device plugin has a number of options that can be configured for it. ```yaml # check ix-device-plugin.yaml apiVersion: v1 kind: ConfigMap data: ix-config: |- resourceName: "iluvatar.com/gpu" flags: splitboard: false usevolcano: false reset_gpu: false ``` | `Field`| `Type ` | `Description` | |--------|------------------------------|------------------| | `flags.splitboard` | boolean | Split GPU devices in every board(eg.BI-V150) if `splitboard` is `true`| | `flags.usevolcano` | boolean | Enable Volcano integration (Use ix-device-plugin with ix-volcano-plugin)| | `flags.reset_gpu` | boolean | Enable Gpu reset| ## Helm Install ### Values | Parameter | Default | Description | | --------------------------- | ------------------ | ------------------------------- | | `image.repository` | `ix-device-plugin` | Image repository | | `image.tag` | `` | Image tag | | `image.pullPolicy` | `IfNotPresent` | Image pull policy | | `ixConfig.flags.splitboard` | `false` | Enable splitboard mode | | `ixConfig.flags.usevolcano` | `false` | Enable Volcano integration | | `ixConfig.flags.reset_gpu` | `false` | Enable GPU reset functionality | ### Example #### Install with Custom Image ```bash helm install ix-device-plugin ix-device-plugin-4.3.0.tgz \ --set image.repository=registry.local/ix-device-plugin \ --set image.tag=test \ --set image.pullPolicy=Always \ -n kube-system ``` #### Install with Volcano plugin You can install the `ix-device-plugin` chart in two modes: **with Volcano plugin enabled** or **without Volcano**. Enable the `usevolcano` flag: ```bash helm install ix-device-plugin ix-device-plugin-4.3.0.tgz \ --set ixConfig.flags.usevolcano=true \ -n kube-system ``` ## Enabling GPU Support in Kubernetes Once you have configured the options above on all the GPU nodes in your cluster, you can enable GPU support by deploying the following Daemonset: ```yaml # ix-device-plugin.yaml apiVersion: apps/v1 kind: DaemonSet metadata: name: iluvatar-device-plugin namespace: kube-system labels: app.kubernetes.io/name: iluvatar-device-plugin spec: selector: matchLabels: app.kubernetes.io/name: iluvatar-device-plugin template: metadata: annotations: scheduler.alpha.kubernetes.io/critical-pod: "" labels: app.kubernetes.io/name: iluvatar-device-plugin spec: priorityClassName: "system-node-critical" securityContext: null containers: - name: iluvatar-device-plugin securityContext: capabilities: drop: - ALL privileged: true image: "ix-device-plugin:4.3.0" imagePullPolicy: IfNotPresent livenessProbe: exec: command: - ls - /var/lib/kubelet/device-plugins/iluvatar-gpu.sock periodSeconds: 5 startupProbe: exec: command: - ls - /var/lib/kubelet/device-plugins/iluvatar-gpu.sock periodSeconds: 5 resources: {} volumeMounts: - mountPath: /var/lib/kubelet/device-plugins name: device-plugin - mountPath: /run/udev name: udev-ctl readOnly: true - mountPath: /sys name: sys readOnly: true - mountPath: /dev name: dev - name: ixc mountPath: /ixconfig volumes: - hostPath: path: /var/lib/kubelet/device-plugins name: device-plugin - hostPath: path: /run/udev name: udev-ctl - hostPath: path: /sys name: sys - hostPath: path: /etc/udev/ name: udev-etc - hostPath: path: /dev name: dev - name: ixc configMap: name: ix-config ``` ```shell kubectl create -f ix-device-plugin.yaml ``` ## Running GPU Jobs GPU can be exposed to a pod by adding `iluvatar.com/gpu` to the pod definition, and you can restrict the GPU resource by adding `resources.limits` to the pod definition. ```yaml $ cat < Driver Version: CUDA Version: | |-------------------------------+----------------------+----------------------| | GPU Name | Bus-Id | Clock-SM Clock-Mem | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 Iluvatar BI-V150S | 00000000:8A:00.0 | 500MHz 1600MHz | | 0% 33C P0 N/A / N/A | 114MiB / 32768MiB | 0% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Process name Usage(MiB) | |=============================================================================| | No running processes found | +-----------------------------------------------------------------------------+ ``` ## Split GPU Board to Multiple GPU Devices The IX device plugin allows splitting one GPU board into multiple GPU Devices through a set of extended options in its configuration file. ### With SplitBoard The extended options for splitting board can be seen below: ```yaml flags: splitboard: false ``` That is, `flags.splitboard`, a boolean flag can now be specified. If this flag is set to true, the plugin will split the GPU board into multiple GPUs and kubelet will advertise multiple `iluvatar.com/gpu` resources to Kubernetes instead of 1 for one GPU board. Otherwise, the plugin will advertise only 1 `iluvatar.com/gpu` resource for one GPU board. For example: ```yaml flags: splitboard: true ``` If this configuration were applied to a node with 1 GPUs(eg. Bi-V150, which has 2 GPU chips on it) on it, the plugin would now advertise 2 `iluvatar.com/gpu` resources to Kubernetes instead of 1. ``` $ kubectl describe node ... Capacity: iluvatar.com/gpu: 2 ... ``` ## Shared Access to GPUs The IX device plugin allows oversubscription of GPUs through a set of extended options in its configuration file. ### With Time-Slicing The extended options for sharing using time-slicing can be seen below: ```yaml sharing: timeSlicing: replicas: ... ``` That is, `sharing.timeSlicing.replicas`, a number of replicas can now be specified. These replicas represent the number of shared accesses that will be granted for a GPU. For example: ```yaml flags: splitboard: false sharing: timeSlicing: replicas: 4 ``` If this configuration were applied to a node with 2 GPUs on it, the plugin would now advertise 8 `iluvatar.com/gpu` resources to Kubernetes instead of 2. ``` $ kubectl describe node ... Capacity: iluvatar.com/gpu: 8 ... ```