distilbert-base-uncased-distilled-squad

Overview

The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.

This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.
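To make the idea of knowledge distillation concrete, here is a minimal sketch of a soft-target loss: the student is trained to match the teacher's temperature-softened output distribution. This is only an illustration of the general technique, not the training code used for this checkpoint. The function names and the temperature value of 2.0 are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened distribution.

    The temperature default is an illustrative assumption, not a value
    taken from the model card.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
```

For a fixed teacher, this loss is smallest when the student's logits reproduce the teacher's distribution, which is what pushes the smaller model toward the larger model's behavior.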

Training Details

Training Data

The distilbert-base-uncased model card describes its training data as:

DistilBERT pretrained on the same data as BERT, which is BookCorpus, a dataset consisting of 11,038 unpublished books and English Wikipedia (excluding lists, tables and headers).

To learn more about the SQuAD v1.1 dataset, see the SQuAD v1.1 data card.

Training Procedure

Preprocessing

See the distilbert-base-uncased model card for further details.

Pretraining

See the distilbert-base-uncased model card for further details.

Evaluation Results

As discussed in the model repository:

This model reaches an F1 score of 86.9 on the SQuAD v1.1 dev set (for comparison, the BERT bert-base-uncased version reaches an F1 score of 88.5).
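The F1 score quoted above is a token-overlap measure between a predicted answer span and a reference answer. The sketch below is a simplified version of that per-answer computation, following the normalization steps used in SQuAD v1.1 evaluation (lowercasing, dropping punctuation and English articles, collapsing whitespace). The function names are chosen for this example, and the block does not reproduce the full evaluation script, which also takes the best score over several reference answers and averages across the dev set.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction, ground_truth):
    """Token-level F1 between one predicted answer and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

An exact match scores 1.0, a prediction with no shared tokens scores 0.0, and a prediction that contains the reference plus extra words lands in between because its precision drops.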

Limitations and Biases

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

Model Evaluation samples

Task	Use case	Dataset	Python sample (Notebook)	CLI with YAML
Question Answering	Extractive Q&A	Squad v2	evaluate-model-question-answering.ipynb	evaluate-model-question-answering.yml

Inference samples

Inference type	Python sample (Notebook)
Real time	sdk-example.ipynb
Real time	question-answering-online-endpoint.ipynb

Sample inputs and outputs

Sample input

{
    "input_data": {
        "question": "What's my name?",
        "context": "My name is John and I live in Seattle"
    }
}

Sample output

[
  "John"
]
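As a rough illustration of how a client might work with the sample input and output shown above, the sketch below builds a request body in the same layout and reads the first answer from a response. It uses only the standard json module. The helper names are hypothetical, and sending the request to a deployed endpoint is deliberately left out because the endpoint URL and authentication details are not given here.

```python
import json


def build_request(question, context):
    """Serialize a question/context pair in the sample input layout."""
    return json.dumps({"input_data": {"question": question, "context": context}})


def first_answer(response_body):
    """Return the first answer from a response shaped like the sample output."""
    answers = json.loads(response_body)
    if not isinstance(answers, list) or not answers:
        raise ValueError("expected a non-empty JSON list of answers")
    return answers[0]
```

With the sample values, build_request("What's my name?", "My name is John and I live in Seattle") yields the JSON shown under Sample input, and first_answer applied to the text under Sample output returns the string John.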

Version: 13

View in Studio: https://ml.azure.com/registries/azureml/models/distilbert-base-uncased-distilled-squad/version/13

Properties

SHA: bb133e834d7dab8aa8eb3f04e0435db7a3a1ddc8

models distilbert base uncased distilled squad - Azure/azureml-assets GitHub Wiki