ray_intro - OpenNebula/one-apps GitHub Wiki

Overview

Ray is an open-source framework for distributed computing and machine learning workloads. This appliance is built around Ray’s Serve library, which enables efficient deployment of inference APIs. Additionally, the appliance includes the vLLM library for fast inference and serving.

The appliance provides a streamlined way to build and serve end-to-end AI applications using pre-trained models from the Hugging Face Transformers library.

Download

The latest version of the Ray appliance is available for download from the OpenNebula public Marketplace:

Service Ray

Requirements

Minimum requirements depend on the selected LLM and its size. Hardware recommendations for each available model are still being developed. In the meantime, even for the smallest model, we recommend provisioning a virtual machine with at least 8 GB of RAM and a GPU with at least 14 GB of VRAM.
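As a rough aid for sizing a GPU against a chosen model, the sketch below estimates the memory taken by the model weights alone. It is a rule of thumb, not an official OpenNebula sizing method: the function name `estimate_weight_memory_gb` is hypothetical, 16-bit weights (2 bytes per parameter) are assumed by default, and the estimate excludes the KV cache, activations, and framework overhead, so real usage will be higher.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights only, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision
    (2 bytes for FP16/BF16, 4 for FP32, 1 for 8-bit quantization).
    KV cache, activations, and runtime overhead are NOT included.
    """
    if num_params <= 0 or bytes_per_param <= 0:
        raise ValueError("num_params and bytes_per_param must be positive")
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Illustrative parameter counts, not the appliance's model list.
    for label, params in [("1B", 1e9), ("3B", 3e9), ("7B", 7e9)]:
        gb = estimate_weight_memory_gb(params)
        print(f"{label} parameters at 16-bit: about {gb:.1f} GB for weights")
```

For example, a 7-billion-parameter model in 16-bit precision needs about 14 GB for its weights before any serving overhead is counted, so treat the result as a lower bound when choosing a GPU.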

Release Notes

Detailed release notes for each version are available on the OpenNebula release page. The Ray appliance is based on Ubuntu 24.04 LTS (x86-64).

Component	Version
Ray	2.44.1
vLLM	0.8.5

Next: Ray Quick Start