tiiuae-falcon-7b-instruct
Falcon-7B-Instruct is a 7-billion-parameter causal decoder-only language model built by TII and released under the Apache 2.0 license. It is optimized for inference, featuring FlashAttention and multiquery attention, and is primarily intended for chat and instruct applications in English and French. As an already-instruction-tuned model, it may not be ideal for further fine-tuning.
- Model Type: Causal decoder-only
- Languages: English and French
- License: Apache 2.0
- Training Data: Fine-tuned from the Falcon-7B base model
- Architecture: Adapted from GPT-3, with rotary positional embeddings, FlashAttention, and multiquery attention
- Hardware: Trained on AWS SageMaker, on 32 A100 40GB GPUs in P4d instances
- Software: Custom distributed training codebase called Gigatron
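For quick local experimentation outside Azure ML, the model can be loaded with the Hugging Face transformers library. The sketch below follows the usage pattern from the original model card; it assumes a bfloat16-capable GPU and that the transformers and accelerate packages are installed.

```python
# Minimal local-inference sketch (not the Azure ML deployment path).
import torch
from transformers import AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # assumption: a bf16-capable GPU is available
    device_map="auto",
    trust_remote_code=True,      # Falcon originally shipped custom modeling code
)

outputs = generator(
    "Write a short poem about the sea.",
    max_new_tokens=100,
    do_sample=True,
    top_k=10,
)
print(outputs[0]["generated_text"])
```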
Falcon-7B-Instruct may carry biases commonly found online, reflecting its training data. Users are advised to implement guardrails and take appropriate precautions for production use. The model is mostly suited to English and French and may not generalize well to other languages.
Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
Falcon-7B-Instruct was finetuned on a mixture of instruct/chat datasets, including 164M tokens from Baize, mixed with 5% of RefinedWeb data.
The data was tokenized with the Falcon-7B/40B tokenizer.
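The tokenizer can be inspected on its own, without downloading the model weights. A minimal sketch, assuming the transformers package; the reported vocabulary size should match the 65,024 figure in the hyperparameter table below:

```python
from transformers import AutoTokenizer

# Falcon-7B and Falcon-40B share the same tokenizer.
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b-instruct")
print(tokenizer.vocab_size)  # expected to match the 65024 listed below
print(tokenizer.tokenize("Falcon soars over the refined web."))
```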
Falcon-7B-Instruct was trained on AWS SageMaker, on 32 A100 40GB GPUs in P4d instances.
Paper coming soon.
See the OpenLLM Leaderboard for early results.
Falcon-7B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token).
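Concretely, the causal language modeling objective scores each position's prediction of the token that follows it. The generic sketch below (toy tensors, not Falcon's training code) shows the usual shift-by-one cross-entropy computation:

```python
import torch
import torch.nn.functional as F

# Generic causal-LM objective: at each position, predict the next token.
batch, seq_len, vocab = 2, 8, 65024
logits = torch.randn(batch, seq_len, vocab)          # stand-in for model output
input_ids = torch.randint(0, vocab, (batch, seq_len))

shift_logits = logits[:, :-1, :]   # predictions for positions 0..n-2
shift_labels = input_ids[:, 1:]    # targets are the *next* tokens
loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab),
    shift_labels.reshape(-1),
)
print(loss.item())
```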
The architecture is broadly adapted from the GPT-3 paper (Brown et al., 2020), with the following differences:
- Positional embeddings: rotary (Su et al., 2021);
- Attention: multiquery (Shazeer et al., 2019) and FlashAttention (Dao et al., 2022);
- Decoder-block: parallel attention/MLP with a single layer norm.
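As a reference for the first of these differences, the sketch below applies rotary positional embeddings to a query tensor shaped like Falcon-7B's (the 71 heads of dimension 64 follow from the hyperparameters tabulated below). This is a minimal illustration of the Su et al. scheme, not Falcon's optimized kernel:

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary positional embeddings (Su et al., 2021) to q or k.

    x: (batch, seq_len, n_heads, head_dim); head_dim must be even.
    """
    b, s, h, d = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2, dtype=torch.float32) / d))
    pos = torch.arange(s, dtype=torch.float32)
    angles = torch.outer(pos, inv_freq)       # (seq_len, d/2)
    cos = angles.cos()[None, :, None, :]      # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]       # split channels into 2D pairs
    # Rotate each channel pair by its position-dependent angle.
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(1, 2048, 71, 64)  # Falcon-7B: 71 heads of dim 64 (see table)
print(rotary_embed(q).shape)
```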
Hyperparameter | Value | Comment |
---|---|---|
Layers | 32 | |
d_model | 4544 | Increased to compensate for multiquery |
head_dim | 64 | Reduced to optimise for FlashAttention |
Vocabulary | 65024 | |
Sequence length | 2048 | |
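These values can be cross-checked against the published configuration without downloading the weights. The attribute names below assume the FalconConfig class integrated into recent transformers releases; older revisions that relied on custom remote code exposed different names:

```python
from transformers import AutoConfig

# Fetches only the small config file, not the model weights.
config = AutoConfig.from_pretrained("tiiuae/falcon-7b-instruct")
print(config.num_hidden_layers)   # expected: 32
print(config.hidden_size)         # expected: 4544 (d_model)
print(config.vocab_size)          # expected: 65024
# head_dim = hidden_size / num_attention_heads = 4544 / 71 = 64
print(config.hidden_size // config.num_attention_heads)
```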
Falcon-7B-Instruct was trained using a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.).
Falcon-7B-Instruct is made available under the Apache 2.0 license.
{
  "input_data": {
    "input_string": ["Develop a Python function to sort a list of integers in ascending order"]
  }
}
[
  {
    "0": "You can use the sorted() function in Python to sort a list of integers in ascending order. Here's an example: my_list = [3,1,6,4,1,5] sorted_list = sorted(my_list) print(sorted_list) This will output: [1,1,3,4,5,6]"
  }
]
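A deployed online endpoint can be called with the request body shown above. The sketch below uses only the Python standard library; the scoring URI and key are placeholders to be copied from the endpoint's Consume tab in Azure ML studio:

```python
import json
import urllib.request

# Placeholders: take these from your endpoint's Consume tab.
scoring_uri = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
api_key = "<your-endpoint-key>"

body = {
    "input_data": {
        "input_string": ["Develop a Python function to sort a list of integers in ascending order"]
    }
}

req = urllib.request.Request(
    scoring_uri,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```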
Version: 7
Featured
license : apache-2.0
SharedComputeCapacityEnabled
task : text-generation
author : tiiuae
huggingface_model_id : tiiuae/falcon-7b-instruct
inference_compute_allow_list : ['Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
finetune_compute_allow_list : ['Standard_NC24s_v3', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
evaluation_compute_allow_list : ['Standard_NC24s_v3', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4']
model_specific_defaults : ordereddict({'apply_lora': 'true', 'precision': '4'})
inference_supported_envs : ['vllm']
View in Studio: https://ml.azure.com/registries/azureml/models/tiiuae-falcon-7b-instruct/version/7
License: apache-2.0
SharedComputeCapacityEnabled: True
SHA: cf4b3c42ce2fdfe24f753f0f0d179202fea59c99
datasets: tiiuae/falcon-refinedweb
languages: en
inference-min-sku-spec: 6|1|112|736 (vCPUs | GPUs | memory in GB | disk in GB)
inference-recommended-sku: Standard_NC6s_v3, Standard_NC12s_v3, Standard_NC24s_v3, Standard_ND40rs_v2, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4
evaluation-min-sku-spec: 24|4|448|2900 (vCPUs | GPUs | memory in GB | disk in GB)
evaluation-recommended-sku: Standard_NC24s_v3, Standard_ND40rs_v2, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4
finetune-min-sku-spec: 24|4|448|2900 (vCPUs | GPUs | memory in GB | disk in GB)
finetune-recommended-sku: Standard_NC24s_v3, Standard_ND40rs_v2, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4
finetuning-tasks: text-classification
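As a sketch of how the registry model and the compute allow-lists above fit together, the following deploys version 7 to a managed online endpoint with the azure-ai-ml SDK (v2). The subscription, resource group, workspace, and endpoint names are placeholders, and the instance type must come from inference_compute_allow_list:

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

# Placeholders for your workspace.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

endpoint = ManagedOnlineEndpoint(name="falcon-7b-instruct-ep")
ml_client.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="default",
    endpoint_name=endpoint.name,
    # Registry reference: the azureml registry, model name and version from this page.
    model="azureml://registries/azureml/models/tiiuae-falcon-7b-instruct/versions/7",
    instance_type="Standard_NC24s_v3",  # must be in inference_compute_allow_list
    instance_count=1,
)
ml_client.begin_create_or_update(deployment).result()
```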