Get and Export Data - setiamanlhc/python-snippet-code GitHub Wiki

Read Data from CSV

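To use the pandas library, first import it into your Jupyter notebook. pandas is built on top of NumPy, so the two are usually imported together; the import order does not matter. seaborn is imported here as well because it is used later to load a sample dataset.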

import numpy as np
import pandas as pd
import seaborn as sns

Next, import a dataset (for example, the Superstore dataset) with the pandas read_csv function. read_csv has only one required parameter: the path to your CSV file. If your CSV uses a separator other than a comma, add the 'sep' parameter so that pandas parses the file correctly.

df = pd.read_csv('..\\raw_data\\hotel_booking_dataset.csv')

df = pd.read_csv('Superstore.csv', sep=',', quotechar='"')

# import and apply null value to 'no info' and '.' values during import
df = pd.read_csv("data/cereal.csv", skiprows = 1, na_values = ['no info', '.'])

# import subset of columns
pd.read_csv('file_name.csv', usecols= ['column_name1','column_name2'])

# import with Index column
pd.read_csv('file_name.csv',index_col='Name') # Use 'Name' column as index
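The options above can be tried without any file on disk. The following is a minimal sketch that feeds a small, made-up semicolon-separated sample through io.StringIO; the column names and values are invented purely for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-separated, with 'no info' and '.'
# standing in for missing values
raw = io.StringIO(
    "Name;Calories;Rating\n"
    "Bran;70;68.4\n"
    "Oats;no info;33.9\n"
    "Corn;110;.\n"
)

df = pd.read_csv(
    raw,
    sep=";",                     # separator other than a comma
    na_values=["no info", "."],  # treat these whole-cell strings as NaN
    index_col="Name",            # use the 'Name' column as the index
)

print(df)
print(df.isna().sum())  # number of missing values per column
```

Because na_values is matched against whole cells, a value such as 68.4 is not affected by the '.' entry.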

Read Excel

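# SOURCE is the path to the workbook (see the next example).
# dtype maps each column name to one type, e.g. np.float64, np.int32 or object.
df = pd.read_excel(SOURCE, sheet_name='Sheet1', dtype={'col name': np.float64})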

Read Excel with a limited set of columns. In a regular Python string, each backslash in a Windows folder path must be written as a double backslash.

SOURCE = "C:\\01WorkingFiles\\datafile.xlsx"
columns = ["Main Question", "Reference to Activity List  (Y/N)","Variable", "Label","Shortened Label","Topic"]
df = pd.read_excel(SOURCE, sheet_name='Variable', usecols=columns)
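Doubling the backslashes is one way to write a Windows path. As a small sketch of the alternatives, the snippet below compares the escaped form with a raw string and with forward slashes, using only the standard library. It assumes nothing about the file itself and only compares the path strings.

```python
from pathlib import PureWindowsPath

escaped = "C:\\01WorkingFiles\\datafile.xlsx"  # backslashes escaped by doubling
raw = r"C:\01WorkingFiles\datafile.xlsx"       # raw string: backslashes kept literally
forward = "C:/01WorkingFiles/datafile.xlsx"    # forward slashes instead of backslashes

# The escaped and raw forms are the same string
print(escaped == raw)

# As Windows paths, the forward-slash form refers to the same location
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

Any of the three forms can be assigned to SOURCE. A single unescaped backslash in a regular string is risky because sequences such as \0 or \n are interpreted as escape characters.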

Loading sample data from Seaborn library

iris = sns.load_dataset('iris')

Export to CSV with separator and header option

df.to_csv('Question_bank.csv', index=False, header=True, quotechar='"')

df.to_csv('mydata.csv')
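To see what the separator, header and index options produce, the sketch below writes a small made-up DataFrame to a CSV string and reads it back. The data is hypothetical; the only behaviour relied on is that to_csv returns the CSV text when no file path is given.

```python
import io

import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"question": ["Q1", "Q2"], "topic": ["Health", "Work"]})

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(index=False, header=True, sep=";")
print(csv_text)

# Read the text back using the same separator
df_back = pd.read_csv(io.StringIO(csv_text), sep=";")
print(df_back.equals(df))
```

With index=False the row index is left out of the output, which is why the DataFrame read back matches the original without an extra unnamed column.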

Copy data to the clipboard so you can paste it into Excel

df_samples.to_clipboard(excel=True)

Import data from the clipboard

df_samples = pd.read_clipboard()