
mistralai-Mixtral-8x7B-v01

Overview

Model Details

The Mixtral-8x7B-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts model. Mixtral-8x7B-v0.1 outperforms Llama 2 70B on most benchmarks with 6x faster inference.

For full details of this model, please read the release blog post.
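The same weights are published on Hugging Face under the `huggingface_model_id` listed in the tags below, `mistralai/Mixtral-8x7B-v0.1`. A minimal sketch of loading and sampling from them with the `transformers` library, assuming a machine with enough GPU memory (the recommended inference SKUs under Properties are sized for this):

```python
# Minimal sketch: load mistralai/Mixtral-8x7B-v0.1 with Hugging Face transformers.
# Assumes sufficient GPU memory; adjust dtype/device settings to your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce the memory footprint
    device_map="auto",          # shard the weights across available GPUs
)

inputs = tokenizer("What is your favourite condiment?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```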

Model Architecture

Mixtral-8x7B-v0.1 is a decoder-only model in which the feed-forward block picks from 8 distinct groups of parameters, the "experts". At every layer, for every token, a router network chooses two of these experts to process the token and combines their outputs additively. Mixtral has 46.7B total parameters but only uses 12.9B parameters per token with this technique, which lets it run at roughly the speed and cost of a 12.9B model.
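To make the routing concrete, here is a small, self-contained sketch of top-2 expert routing for a single token. It is purely illustrative (toy layer sizes, randomly initialised weights) and is not the actual Mixtral implementation:

```python
# Illustrative sketch of top-2 expert routing, not the real Mixtral code.
import torch
import torch.nn.functional as F

hidden_size, ffn_size, num_experts, top_k = 16, 32, 8, 2
torch.manual_seed(0)

# One small feed-forward "expert" per group of parameters.
experts = [
    torch.nn.Sequential(
        torch.nn.Linear(hidden_size, ffn_size),
        torch.nn.SiLU(),
        torch.nn.Linear(ffn_size, hidden_size),
    )
    for _ in range(num_experts)
]

router = torch.nn.Linear(hidden_size, num_experts)

def moe_layer(token: torch.Tensor) -> torch.Tensor:
    """Route one token through its top-2 experts and mix their outputs additively."""
    logits = router(token)                        # score every expert for this token
    weights, indices = torch.topk(logits, top_k)  # keep only the two best experts
    weights = F.softmax(weights, dim=-1)          # normalise their mixing weights
    # Only the selected experts run, so the active parameter count stays small.
    return sum(w * experts[i](token) for w, i in zip(weights.tolist(), indices.tolist()))

token = torch.randn(hidden_size)
print(moe_layer(token).shape)  # torch.Size([16])
```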

Finetuning samples

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Text Generation | Summarization | Samsum | summarization_with_text_gen.ipynb | text-generation.sh |

Inference samples

| Inference type | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- |
| Real time | text-generation-online-endpoint.ipynb | text-generation-online-endpoint.sh |
| Batch | text-generation-batch-endpoint.ipynb | coming soon |

Sample inputs and outputs

Sample input

```json
{
    "input_data": {
        "input_string": [
            "What is your favourite condiment?",
            "Do you have mayonnaise recipes?"
        ],
        "parameters": {
            "max_new_tokens": 100,
            "do_sample": true,
            "return_full_text": false
        }
    }
}
```

Sample output

```json
[
  {
    "0": "\n\nMayonnaise - can't be beat.\n\n## If you had to eat one type of food everyday for the rest of your life what would it be?\n\nMango. I'm an avid fruit and vegetable eater.\n\n## What is your favourite fruit and/or vegetable?\n\nMango! I eat an acre of these a year, which is almost two pounds a day.\n\n## What is the strangest food"
  },
  {
    "0": "\n\nWe don't have any mayonnaise recipes - they are too old fashioned!\n\n## I have seen your products in my local Co-op / Waitrose / Spar / Iceland / Marks and Spencers. Where can I buy more?\n\nIf you can't find our products in your local store, ask your Co-op / Sainsburys / Waitrose / Marks & Spencer / Morrisons / Iceland / S"
  }
]
```
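Once a real-time deployment of this model exists (see the online-endpoint samples above), a request shaped like the sample input can be sent with the Azure ML Python SDK. A minimal sketch, where the workspace details, endpoint name, and deployment name are placeholders you would substitute:

```python
# Minimal sketch: send the sample input above to a deployed online endpoint.
# Workspace details, endpoint name, and deployment name are placeholders.
import json
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)

payload = {
    "input_data": {
        "input_string": [
            "What is your favourite condiment?",
            "Do you have mayonnaise recipes?",
        ],
        "parameters": {"max_new_tokens": 100, "do_sample": True, "return_full_text": False},
    }
}

# The invoke API takes a request file, so write the payload to disk first.
with open("sample_input.json", "w") as f:
    json.dump(payload, f)

response = ml_client.online_endpoints.invoke(
    endpoint_name="<ENDPOINT_NAME>",
    deployment_name="<DEPLOYMENT_NAME>",
    request_file="sample_input.json",
)
print(response)
```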

Version: 14

Tags

Featured
SharedComputeCapacityEnabled
hiddenlayerscanned
disable-batch : true
huggingface_model_id : mistralai/Mixtral-8x7B-v0.1
inference_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_ND96amsr_A100_v4', 'Standard_ND96asr_v4']
inference_supported_envs : ['vllm']
finetune_compute_allow_list : ['Standard_ND40rs_v2', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
model_specific_defaults : ordereddict({'max_seq_length': 2048, 'apply_lora': 'true', 'apply_deepspeed': 'true', 'deepspeed_stage': '3', 'precision': '16', 'ignore_mismatched_sizes': 'false'})
license : apache-2.0
task : text-generation
author : Mistral AI
benchmark : quality

View in Studio: https://ml.azure.com/registries/azureml/models/mistralai-Mixtral-8x7B-v01/version/14

License: apache-2.0

Properties

SharedComputeCapacityEnabled: True

SHA: 985aa055896a8f943d4a9f2572e6ea1341823841

inference-min-sku-spec: 40|8|672|2900

inference-recommended-sku: Standard_ND40rs_v2, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4

finetune-min-sku-spec: 40|2|440|128

finetune-recommended-sku: Standard_ND40rs_v2, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4

finetuning-tasks: text-generation

languages: EN
