Features and Usage

This appliance runs an NVIDIA NIM container inside an OpenNebula service VM. It uses deployment-time contextualization to authenticate to the configured registry, pull the selected image, and expose the NIM API over the network.

Contextualization

The appliance behavior is controlled by contextualization parameters set in the CONTEXT section of the VM template. The parameters below define registry access and the NIM image to run.

  • ONEAPP_NIM_NVIDIA_REGISTRY (Marketplace default: nvcr.io). Required. NVIDIA registry used to pull the NIM image.
  • ONEAPP_NIM_NVIDIA_REGISTRY_USER (Marketplace default: $oauthtoken). Conditionally required; needed only when the registry is not nvcr.io.
  • ONEAPP_NIM_NVIDIA_REGISTRY_KEY (no Marketplace default). Required. Registry API key used for authentication.
  • ONEAPP_NIM_NVIDIA_IMAGE_REF (Marketplace default: nvcr.io/nim/openai/gpt-oss-20b:latest). Required. Full registry image reference of the NIM container to run.

Note: The Marketplace default values shown above come from the marketplace template metadata. They are template defaults, not appliance-enforced defaults.
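
For example, a minimal CONTEXT section for the default nvcr.io registry might look like the sketch below. The key value is a placeholder, and ONEAPP_NIM_NVIDIA_REGISTRY_USER is omitted because it is only needed for registries other than nvcr.io:

CONTEXT = [
  ONEAPP_NIM_NVIDIA_REGISTRY = "nvcr.io",
  ONEAPP_NIM_NVIDIA_REGISTRY_KEY = "<your NGC API key>",
  ONEAPP_NIM_NVIDIA_IMAGE_REF = "nvcr.io/nim/openai/gpt-oss-20b:latest" ]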

Registry Authentication

The appliance authenticates to the configured registry with docker login, using ONEAPP_NIM_NVIDIA_REGISTRY_KEY as the password.

When ONEAPP_NIM_NVIDIA_REGISTRY="nvcr.io", the appliance uses $oauthtoken as the username. For other registries, you must provide ONEAPP_NIM_NVIDIA_REGISTRY_USER.

For more details on NVIDIA registry authentication, see the NVIDIA documentation on NGC API keys.
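
Conceptually, the login step is equivalent to the following shell sketch (variable names mirror the context parameters; the appliance's actual script may differ):

REGISTRY="${ONEAPP_NIM_NVIDIA_REGISTRY:-nvcr.io}"
USERNAME="$ONEAPP_NIM_NVIDIA_REGISTRY_USER"
# nvcr.io always authenticates with the literal username $oauthtoken
[ "$REGISTRY" = "nvcr.io" ] && USERNAME='$oauthtoken'
# The registry API key is supplied as the password on stdin
printf '%s' "$ONEAPP_NIM_NVIDIA_REGISTRY_KEY" | docker login "$REGISTRY" --username "$USERNAME" --password-stdin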

Runtime Behavior

On deployment, the appliance:

  • Starts Docker if needed.
  • Authenticates to the configured NVIDIA registry.
  • Pulls the image specified in ONEAPP_NIM_NVIDIA_IMAGE_REF.
  • Starts the NIM container with the NVIDIA runtime and all available GPUs.
  • Exposes the NIM API on port 8000.
  • Sets NGC_API_KEY inside the container.
  • Mounts the host directory /var/lib/nim/.cache at /opt/nim/.cache inside the container.
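
Taken together, the container launch is roughly equivalent to the docker run sketch below. The port, cache mount, container name, and GPU flags follow from the list above; that NGC_API_KEY reuses ONEAPP_NIM_NVIDIA_REGISTRY_KEY is an assumption here:

# Sketch of the launch; the NGC_API_KEY value is an assumption (see above)
docker run -d --name nim \
  --runtime=nvidia --gpus all \
  -p 8000:8000 \
  -e NGC_API_KEY="$ONEAPP_NIM_NVIDIA_REGISTRY_KEY" \
  -v /var/lib/nim/.cache:/opt/nim/.cache \
  "$ONEAPP_NIM_NVIDIA_IMAGE_REF"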

The appliance image installs the runtime dependencies required to launch the container, including:

  • docker.io
  • nvidia-driver-590-server-open
  • nvidia-utils-590-server
  • nvidia-container-toolkit

IMPORTANT: The appliance launches the NIM container with --runtime=nvidia --gpus all. It must therefore be deployed on infrastructure that exposes NVIDIA GPU resources to the guest VM, for example via PCI passthrough.
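
To confirm that the guest VM actually sees the GPU(s) and that Docker registers the NVIDIA runtime, the following checks can be run inside the VM:

nvidia-smi                                  # should list the GPU(s) exposed to the VM
docker info --format '{{json .Runtimes}}'   # should include an "nvidia" runtime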

Endpoints

The appliance exposes the NIM API on port 8000:

  • API base: /v1
  • Readiness endpoint: /v1/health/ready

If OneGate is enabled and reachable during bootstrap, the appliance publishes the following VM attributes:

  • ONEAPP_NIM_API with the value http://<ip>:8000/v1
  • ONEAPP_NIM_HEALTH with the value http://<ip>:8000/v1/health/ready
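
Inside the VM, this publication step is roughly equivalent to the following OneGate client calls (a sketch; how the appliance determines the VM IP is an assumption here):

# Pick the VM's primary address (illustrative; the appliance may use context variables instead)
IP=$(hostname -I | awk '{print $1}')
onegate vm update --data "ONEAPP_NIM_API=http://${IP}:8000/v1"
onegate vm update --data "ONEAPP_NIM_HEALTH=http://${IP}:8000/v1/health/ready"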

Readiness is determined by waiting for the container's Docker healthcheck to report healthy, if the image defines one. Otherwise, the appliance polls http://localhost:8000/v1/health/ready from inside the VM.
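
A minimal sketch of that readiness wait, assuming the container is named nim as above:

# Prefer the image-defined healthcheck when one exists
if [ "$(docker inspect -f '{{if .State.Health}}yes{{end}}' nim)" = "yes" ]; then
  until [ "$(docker inspect -f '{{.State.Health.Status}}' nim)" = "healthy" ]; do sleep 5; done
else
  # Fall back to polling the readiness endpoint directly
  until curl -fsS http://localhost:8000/v1/health/ready >/dev/null; do sleep 5; done
fi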

Operational Checks

To verify that the container is running and the NIM API is responding, run the following commands inside the VM:

docker logs -f nim                                # follow the NIM container logs
curl -s http://127.0.0.1:8000/v1/models          # list the model(s) served by the API
curl -fsS http://127.0.0.1:8000/v1/health/ready  # probe the readiness endpoint

Additional inference endpoints may be available depending on the selected NIM image and model.
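
For images that serve an OpenAI-compatible model, a test request can be sent to the chat completions endpoint. The model id below is an assumption based on the default Marketplace image; use an id returned by /v1/models:

curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello"}]}'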