models qwen3 embedding 8b generic cpu - Azure/azureml-assets GitHub Wiki
This is the CPU-optimized variant of qwen3-embedding-8b, a text embedding model from the Qwen3 family developed by Alibaba Cloud and optimized by Microsoft.
- Model Type: Text Embedding (ONNX)
- Parameters: 8 billion
- Context Length: 32K tokens
- Embedding Dimension: Up to 4096
- Quantization: KLD Gradient quantization
- Target Device: CPU
- Execution Provider: CPUExecutionProvider
- Supported Languages: 100+
This model is optimized for local execution on CPU-only devices using Foundry Local.
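Foundry Local exposes an OpenAI-compatible REST API for locally loaded models. A minimal sketch of requesting embeddings from this model over that API, assuming the service is running locally — the base URL and port below are illustrative placeholders (Foundry Local assigns the port at startup); the model alias `qwen3-embedding-8b` comes from the metadata on this page:

```python
import json
import urllib.request

# Assumed local endpoint; Foundry Local picks the actual port at startup,
# so 5273 here is a placeholder for illustration only.
BASE_URL = "http://localhost:5273/v1"

def build_embedding_request(texts, model="qwen3-embedding-8b"):
    """Build the JSON payload for an OpenAI-style /embeddings call."""
    return {"model": model, "input": texts}

def get_embeddings(texts):
    """POST the payload to the local /embeddings endpoint and return vectors."""
    payload = json.dumps(build_embedding_request(texts)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry one embedding per input under "data"
    return [item["embedding"] for item in body["data"]]
```

Any OpenAI-compatible client would work the same way; only the base URL and the model alias differ from a hosted deployment.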
- Text retrieval and semantic search
- Code retrieval
- Text classification and clustering
- Bitext mining
- Multilingual and cross-lingual retrieval
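The retrieval and semantic-search use cases above all reduce to comparing embedding vectors, typically by cosine similarity. A self-contained sketch using short toy vectors as stand-ins for real model outputs (actual qwen3-embedding-8b vectors have up to 4096 dimensions):

```python
import math

# Toy 4-dimensional vectors standing in for real embeddings;
# the values and document ids are illustrative only.
DOCS = {
    "doc_cat": [0.9, 0.1, 0.0, 0.1],
    "doc_dog": [0.8, 0.2, 0.1, 0.0],
    "doc_car": [0.0, 0.1, 0.9, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, docs):
    """Return document ids sorted by similarity to the query, best first."""
    return sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)

# Stand-in for the embedded query; with the model, this vector would come
# from embedding a string such as "pets".
query = [0.85, 0.15, 0.05, 0.05]
print(rank(query, DOCS))  # animal documents rank above the unrelated one
```

Cross-lingual retrieval works identically: because the model maps 100+ languages into one vector space, a query embedded in one language can be ranked against documents embedded in another.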
This model is licensed under Apache 2.0; the full license terms are available at https://huggingface.co/Qwen/Qwen3-Embedding-8B/blob/main/LICENSE.
- HuggingFace: Qwen3-Embedding-8B
Version: 1
foundryLocal:
- license: apache-2.0
- licenseDescription: This model is provided under the License Terms available at https://huggingface.co/Qwen/Qwen3-Embedding-8B/blob/main/LICENSE
- author: Microsoft
- inputModalities: text
- outputModalities: text
- task: embeddings
- maxOutputTokens: 1
- alias: qwen3-embedding-8b
- directoryPath: v1
- promptTemplate:
- supportsToolCalling: false
- capabilities: embedding
- supportsReasoning: false
- reasoningStart:
- reasoningEnd:
- contextLength: 32768
- minFLVersion: 0.0.0
- disable-maap: true
View in Studio: https://ml.azure.com/registries/azureml/models/qwen3-embedding-8b-generic-cpu/version/1
License: apache-2.0