models mistralai Mixtral 8x7B v01 - Azure/azureml-assets GitHub Wiki
The Mixtral-8x7B-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture-of-Experts text model. Mixtral-8x7B-v0.1 outperforms Llama 2 70B on most benchmarks with 6x faster inference.
For full details of this model, please read the release blog post.
Mixtral-8x7B-v0.1 is a decoder-only model whose feed-forward block picks from 8 distinct groups of parameters, the "experts". At every layer, for every token, a router network chooses two of these experts to process the token and combines their outputs additively. With this technique, Mixtral has 46.7B total parameters but uses only 12.9B parameters per token, so it runs at the speed and cost of a 12.9B model.
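The top-2 routing described above can be sketched in a few lines. This is an illustrative toy (random matrices stand in for the expert feed-forward networks, and the layer handles a single token), not the actual Mixtral implementation:

```python
import numpy as np

def moe_layer(x, experts, router, top_k=2):
    """Toy Mixtral-style sparse MoE routing for one token.

    x:       (d,) token hidden state
    experts: list of 8 (d, d) matrices standing in for expert FFNs
    router:  (n_experts, d) router projection
    """
    logits = router @ x                      # one routing score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the 2 best experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                     # softmax over the chosen experts
    # combine the two expert outputs additively, weighted by the gates
    return sum(g * (experts[e] @ x) for g, e in zip(gates, top))

d, n_experts = 16, 8
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router = rng.standard_normal((n_experts, d))
y = moe_layer(x, experts, router)  # (16,) output; only 2 of 8 experts ran
```

Only the two selected experts do any work per token, which is why the active parameter count (12.9B) is far below the total (46.7B).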
Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
---|---|---|---|---|
Text Generation | Summarization | Samsum | summarization_with_text_gen.ipynb | text-generation.sh |
Inference type | Python sample (Notebook) | CLI with YAML |
---|---|---|
Real time | text-generation-online-endpoint.ipynb | text-generation-online-endpoint.sh |
Batch | text-generation-batch-endpoint.ipynb | coming soon |
Sample input (for real-time inference):

```json
{
    "input_data": {
        "input_string": [
            "What is your favourite condiment?",
            "Do you have mayonnaise recipes?"
        ],
        "parameters": {
            "max_new_tokens": 100,
            "do_sample": true,
            "return_full_text": false
        }
    }
}
```
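A request with this shape can be posted to a deployed online endpoint's scoring route. The sketch below uses only the standard library; the scoring URI and API key are placeholders you would take from your own deployment in the Studio:

```python
import json
import urllib.request

# Same payload shape as the sample input above.
payload = {
    "input_data": {
        "input_string": [
            "What is your favourite condiment?",
            "Do you have mayonnaise recipes?",
        ],
        "parameters": {
            "max_new_tokens": 100,
            "do_sample": True,
            "return_full_text": False,
        },
    }
}

# Placeholders -- substitute your deployment's values.
scoring_uri = "https://<endpoint-name>.<region>.inference.ml.azure.com/score"
api_key = "<your-endpoint-key>"

def score(uri, key, body):
    """POST the JSON body to the endpoint and return the parsed response."""
    req = urllib.request.Request(
        uri,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# result = score(scoring_uri, api_key, payload)  # uncomment with real values
```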
Sample output:

```json
[
    {
        "0": "\n\nMayonnaise - can't be beat.\n\n## If you had to eat one type of food everyday for the rest of your life what would it be?\n\nMango. I'm an avid fruit and vegetable eater.\n\n## What is your favourite fruit and/or vegetable?\n\nMango! I eat an acre of these a year, which is almost two pounds a day.\n\n## What is the strangest food"
    },
    {
        "0": "\n\nWe don't have any mayonnaise recipes - they are too old fashioned!\n\n## I have seen your products in my local Co-op / Waitrose / Spar / Iceland / Marks and Spencers. Where can I buy more?\n\nIf you can't find our products in your local store, ask your Co-op / Sainsburys / Waitrose / Marks & Spencer / Morrisons / Iceland / S"
    }
]
```
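As the sample shows, the endpoint returns a list with one dict per input prompt, keyed by the output index as a string (`"0"`). A small helper to pull the generated texts out (the sample strings here are shortened stand-ins for the full response above):

```python
def extract_generations(response):
    """Return the generated text for each prompt in the endpoint response."""
    return [item["0"] for item in response]

# Shortened stand-in for the sample response shown above.
sample_response = [
    {"0": "\n\nMayonnaise - can't be beat."},
    {"0": "\n\nWe don't have any mayonnaise recipes - they are too old fashioned!"},
]
texts = extract_generations(sample_response)  # one string per input prompt
```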
Version: 14
Featured
SharedComputeCapacityEnabled
hiddenlayerscanned
disable-batch : true
huggingface_model_id : mistralai/Mixtral-8x7B-v0.1
inference_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
inference_supported_envs : ['vllm']
finetune_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
model_specific_defaults : ordereddict({'max_seq_length': 2048, 'apply_lora': 'true', 'apply_deepspeed': 'true', 'deepspeed_stage': '3', 'precision': '16', 'ignore_mismatched_sizes': 'false'})
license : apache-2.0
task : text-generation
author : Mistral AI
benchmark : quality
View in Studio: https://ml.azure.com/registries/azureml/models/mistralai-Mixtral-8x7B-v01/version/14
License: apache-2.0
SharedComputeCapacityEnabled: True
SHA: 985aa055896a8f943d4a9f2572e6ea1341823841
inference-min-sku-spec: 40|8|672|2900
inference-recommended-sku: Standard_ND40rs_v2, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4
finetune-min-sku-spec: 40|2|440|128
finetune-recommended-sku: Standard_ND40rs_v2, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4
finetuning-tasks: text-generation
languages: EN
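The pipe-delimited `*-min-sku-spec` fields above pack the minimum hardware requirements into one string. The field order assumed here (vCPUs | GPUs | memory GiB | disk GiB) is inferred from the listed SKUs, e.g. `Standard_ND40rs_v2` offers 40 vCPUs, 8 GPUs, 672 GiB RAM, and 2900 GiB of local storage:

```python
def parse_min_sku_spec(spec):
    """Split a min-sku-spec string into named fields.

    Field order (vCPUs | GPUs | memory GiB | disk GiB) is an assumption
    inferred from the recommended SKUs, not documented in this card.
    """
    vcpus, gpus, mem, disk = (int(v) for v in spec.split("|"))
    return {"vcpus": vcpus, "gpus": gpus, "memory_gib": mem, "disk_gib": disk}

inference_min = parse_min_sku_spec("40|8|672|2900")   # inference-min-sku-spec
finetune_min = parse_min_sku_spec("40|2|440|128")     # finetune-min-sku-spec
```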