qwen3-embedding-0.6b-cuda-gpu

Overview

Qwen3 Embedding 0.6B CUDA GPU

This is the NVIDIA CUDA GPU variant of qwen3-embedding-0.6b, a text embedding model from the Qwen3 family, developed by Alibaba Cloud and optimized by Microsoft.

Model Details

Model Type: Text Embedding (ONNX)
Parameters: 0.6 billion
Context Length: 32K tokens
Embedding Dimension: Up to 1024
Quantization: KLD Gradient quantization
Target Device: GPU (NVIDIA CUDA)
Execution Provider: CUDAExecutionProvider
Supported Languages: 100+
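
The "Up to 1024" embedding dimension suggests that shorter vectors can be used when storage or search speed matters. The sketch below assumes the model's embeddings tolerate Matryoshka-style truncation (keeping the leading components and re-normalizing); that is an assumption to verify against the upstream model documentation, not something this page states. The helper name `truncate_and_normalize` and the toy vector are illustrative only.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm.

    Hypothetical helper: assumes leading components carry the most
    information, as in Matryoshka-style embeddings.
    """
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # avoid division by zero for an all-zero prefix
    return [x / norm for x in head]


# Toy 4-dimensional vector standing in for a real 1024-dimensional embedding.
full = [3.0, 4.0, 12.0, 0.5]
short = truncate_and_normalize(full, 2)
print(short)  # [0.6, 0.8]
```

Re-normalizing after truncation keeps cosine similarity and dot product equivalent for the shortened vectors.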

Intended Use

This model is optimized for local execution with Foundry Local on devices that provide NVIDIA CUDA GPU hardware acceleration.
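
Foundry Local is expected to handle device selection itself. For readers who load an ONNX model directly with ONNX Runtime instead, the following is a minimal sketch of preferring `CUDAExecutionProvider` and falling back to CPU. The helper `pick_providers` is hypothetical; `onnxruntime.get_available_providers()` is a real ONNX Runtime call, and the code falls back to a placeholder list when the package is not installed.

```python
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are actually available, in order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError("no usable execution provider found")
    return chosen


try:
    import onnxruntime as ort  # optional dependency
    available = ort.get_available_providers()
except ImportError:
    # Placeholder so the sketch still runs without onnxruntime installed.
    available = ["CPUExecutionProvider"]

providers = pick_providers(available)
print(providers)

# With onnxruntime installed, the list would be passed when creating a session
# (the model path here is hypothetical):
# session = ort.InferenceSession("model.onnx", providers=providers)
```

Listing the CPU provider last keeps the same code usable on machines without a CUDA-capable GPU, at reduced speed.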

Capabilities

Text retrieval and semantic search
Code retrieval
Text classification and clustering
Bitext mining
Multilingual and cross-lingual retrieval
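
Retrieval and semantic search with an embedding model usually reduce to comparing a query vector against document vectors. The sketch below ranks documents by cosine similarity; the three-dimensional vectors and document names are made-up stand-ins for real model outputs, and `cosine` and `rank` are illustrative helpers rather than part of any library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [(doc_id, cosine(query_vec, v)) for doc_id, v in doc_vecs.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy vectors standing in for embeddings produced by the model.
docs = {
    "doc_gpu": [0.9, 0.1, 0.0],
    "doc_cooking": [0.0, 0.2, 0.9],
    "doc_cuda": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

for doc_id, score in rank(query, docs):
    print(f"{doc_id}: {score:.3f}")
```

The same ranking applies to code retrieval and cross-lingual retrieval, since only the vectors being compared change.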

License

This model is licensed under Apache 2.0. See license details.

Source

HuggingFace: Qwen3-Embedding-0.6B

Version: 1
