Kaggle Datasets Download - IUHEPandML/DataAnalysis GitHub Wiki

How to Download Data from Kaggle Using the Command Line

Note: If you haven't installed the Kaggle CLI yet, start from Step 4. After completing that step, return to Step 1.

1. Create a Kaggle Account

If you don’t already have a Kaggle account, you’ll need to create one. Sign up at Kaggle’s website.

2. Generate a Kaggle API Token

  • Log in to your Kaggle account.
  • Go to the 'Account' tab in your profile settings (labelled 'Settings' in more recent versions of the Kaggle site).
  • Under the 'API' section, click 'Create New API Token'.
  • This will download a file named kaggle.json. This file contains your API credentials.


3. Place the kaggle.json File

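  • Move the kaggle.json file to the location where the Kaggle CLI looks for it: ~/.kaggle/kaggle.json on Linux and macOS, or C:\Users\<Windows-username>\.kaggle\kaggle.json on Windows. The CLI reads credentials from this location regardless of the directory you download datasets into.
  • On Linux and macOS, restrict access to the file with chmod 600 ~/.kaggle/kaggle.json so that other users on the machine cannot read your API key.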
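
The Kaggle CLI reads credentials from ~/.kaggle/kaggle.json by default, or from the directory named by the KAGGLE_CONFIG_DIR environment variable. As a minimal sketch of this step, the helper below moves a downloaded token into place and restricts its permissions. The name install_kaggle_token is hypothetical (it is not part of the Kaggle CLI), and the ~/Downloads path in the example line is an assumption about where your browser saved the file.

```shell
# Hypothetical helper, not part of the Kaggle CLI.
# Usage: install_kaggle_token <path-to-kaggle.json> [config-dir]
install_kaggle_token() {
    src="$1"
    # Default to KAGGLE_CONFIG_DIR if set, otherwise ~/.kaggle
    dest_dir="${2:-${KAGGLE_CONFIG_DIR:-$HOME/.kaggle}}"

    if [ ! -f "$src" ]; then
        echo "Token file not found: $src" >&2
        return 1
    fi

    mkdir -p "$dest_dir" || return 1
    mv "$src" "$dest_dir/kaggle.json" || return 1
    # Owner read/write only, so other users cannot read the API key
    chmod 600 "$dest_dir/kaggle.json"
}

# Example (assumed download location, adjust to your system):
# install_kaggle_token "$HOME/Downloads/kaggle.json"
```

Passing the destination as an optional second argument makes it easy to try the helper on a scratch directory before touching your real configuration.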

4. Install the Kaggle CLI

Open your terminal and run the following command to install the Kaggle CLI:

pip install kaggle

This command installs the Kaggle CLI, allowing you to use the kaggle command.
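
After installing, it can help to confirm that the shell can actually find the kaggle executable, since a "command not found" error sometimes appears even though pip reported success. The check below is a sketch using only standard shell built-ins; check_cli is a hypothetical helper name.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_cli() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found: $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# If the check fails, print a hint instead of aborting.
check_cli kaggle || echo "Install it with 'pip install kaggle', then open a new terminal."
```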

5. Find the Dataset API Link

  • Go to Kaggle and navigate to the dataset you want to download.
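  • Find the API link for the dataset, usually available from the menu on the dataset’s page (for example, a 'Copy API command' option). It contains the dataset identifier in the form <dataset-owner/dataset-name>, which you need to download the dataset.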
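
If you only have the dataset's page URL, the identifier can usually be read straight from it, because dataset pages follow the pattern https://www.kaggle.com/datasets/<owner>/<name>. The sketch below extracts that identifier with plain shell string operations. dataset_slug_from_url is a hypothetical helper, and some-owner/some-dataset is a placeholder rather than a real dataset.

```shell
# Hypothetical helper: turn a Kaggle dataset URL into <owner>/<name>.
dataset_slug_from_url() {
    url="${1%%\?*}"   # drop any query string
    url="${url%/}"    # drop a trailing slash

    case "$url" in
        *kaggle.com/datasets/*/*) ;;
        *) echo "Not a Kaggle dataset URL: $1" >&2; return 1 ;;
    esac

    rest="${url#*kaggle.com/datasets/}"   # owner/name[/extra...]
    owner="${rest%%/*}"
    name="${rest#*/}"
    name="${name%%/*}"                    # ignore anything after the name
    printf '%s/%s\n' "$owner" "$name"
}

# Placeholder URL for illustration only.
dataset_slug_from_url "https://www.kaggle.com/datasets/some-owner/some-dataset"
```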


6. Download the Dataset

Use the kaggle command to download the dataset. In your terminal, run:

kaggle datasets download -d <dataset-owner/dataset-name>

Replace <dataset-owner/dataset-name> with the dataset identifier you copied from the Kaggle API link. By default, the dataset is saved as a .zip archive in the current directory.
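
The download command also accepts a target directory (-p) and can extract the archive after downloading (--unzip). The wrapper below is a sketch that combines these options. download_dataset is a hypothetical helper name, some-owner/some-dataset is a placeholder identifier, and the data default is an assumption about your project layout.

```shell
# Hypothetical wrapper around the Kaggle CLI download command.
# Usage: download_dataset <dataset-owner/dataset-name> [target-dir]
download_dataset() {
    dataset="$1"
    target="${2:-data}"   # assumed default folder, adjust as needed

    # Reject anything that is not in owner/name form
    case "$dataset" in
        */*) ;;
        *) echo "Expected <dataset-owner/dataset-name>, got: $dataset" >&2; return 1 ;;
    esac

    mkdir -p "$target" || return 1
    # -p sets the download directory; --unzip extracts the archive there
    kaggle datasets download -d "$dataset" -p "$target" --unzip
}

# Example (placeholder identifier, replace with a real one):
# download_dataset some-owner/some-dataset data
```

Validating the identifier first gives a clearer error than letting the CLI fail on a malformed argument.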

Summary

To download a dataset from Kaggle using the command line:

  1. Create a Kaggle account and generate an API token.
  2. Place the kaggle.json file in ~/.kaggle/ (or the equivalent folder on Windows).
  3. Install the Kaggle CLI using pip.
  4. Find the dataset’s API link on Kaggle.
  5. Download the dataset using the kaggle command.

For better organization, consider creating a new directory using the mkdir command and performing all operations within this directory to keep things tidy.
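
As a sketch of that layout, the commands below create a project directory with a data subfolder and switch into it. The name kaggle-project is a placeholder; substitute your own.

```shell
# Placeholder project name; override by setting PROJECT_DIR beforehand.
PROJECT_DIR="${PROJECT_DIR:-kaggle-project}"

# -p creates parent directories as needed and does not fail if they exist.
mkdir -p "$PROJECT_DIR/data"
cd "$PROJECT_DIR" || echo "Could not enter $PROJECT_DIR" >&2

pwd   # confirm the current working directory
```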