4.5.4. Working with different file formats - sj50179/IBM-Data-Science-Professional-Certificate GitHub Wiki

Objectives

  • Define different file formats such as CSV, XML, and JSON
  • Write simple programs to read and output data
  • List which Python libraries are needed to extract data

Python Pandas Library

Pandas Library

import pandas as pd

Reading CSV files

import pandas as pd
file = 'FileExample.csv'
df = pd.read_csv(file)
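
By default, read_csv treats the first line of the file as the header row. The sketch below illustrates this with made-up records, using io.StringIO as a stand-in for a file on disk. The sample data and the variable names are assumptions for illustration, not part of FileExample.csv.

```python
import io
import pandas as pd

# Hypothetical file contents; StringIO stands in for a file on disk
csv_text = "John,555-0100,1990-01-15\nMary,555-0101,1985-06-30\n"

# Default behaviour: the first line is consumed as the header row
df_default = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data and numbers the columns 0, 1, 2
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(df_default.shape)     # (1, 3): one data row, first line became the header
print(df_no_header.shape)   # (2, 3): both lines kept as data
```

If the file has no header row, passing header=None avoids losing the first record to the column labels.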

Using Dataframes

df.columns = ['Name', 'Phone Number', 'Birthday']
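
A minimal sketch of how the column assignment might be used, assuming the data has no header row. The records are made up for illustration, and to_csv is shown as one possible way to output the result.

```python
import io
import pandas as pd

# Hypothetical records with no header row
csv_text = "John,555-0100,1990-01-15\nMary,555-0101,1985-06-30\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Replace the default integer labels (0, 1, 2) with descriptive names
df.columns = ['Name', 'Phone Number', 'Birthday']

# Columns can now be selected by label
names = df['Name'].tolist()
print(names)  # ['John', 'Mary']

# With no path argument, to_csv returns the CSV text instead of writing a file
out = df.to_csv(index=False)
print(out)
```

The list assigned to df.columns must have exactly one label per column, otherwise pandas raises a ValueError.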

Reading JSON files

import json
with open('filesample.json', 'r') as openfile:
	json_object = json.load(openfile)
print(json_object)
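
To cover both reading and output, the following sketch writes a small JSON file with json.dump and reads it back with json.load. The record and the temporary path are hypothetical and serve only to make the example self-contained.

```python
import json
import os
import tempfile

# Hypothetical record used only for illustration
person = {"name": "John", "phonenumber": "555-0100", "birthday": "1990-01-15"}

# Write into a temporary directory so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), "filesample.json")

# Serialize the dictionary to JSON text on disk
with open(path, "w") as outfile:
    json.dump(person, outfile, indent=4)

# Read it back; a JSON object is returned as a Python dict
with open(path, "r") as openfile:
    json_object = json.load(openfile)

print(json_object["name"])          # John
print(type(json_object).__name__)   # dict
```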

Reading XML files

import pandas as pd
import xml.etree.ElementTree as etree
tree = etree.parse('fileExample.xml')
root = tree.getroot()
columns = ['Name', 'Phone Number', 'Birthday']
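rows = []

for node in root:
	# Read the text of each child element of the current record
	name = node.find("name").text
	phonenumber = node.find("phonenumber").text
	birthday = node.find("birthday").text
	# Collect the row inside the loop so every record is kept, not only the last one
	rows.append([name, phonenumber, birthday])

# DataFrame.append was removed in pandas 2.0, so build the DataFrame once from the collected rows
df = pd.DataFrame(rows, columns = columns)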
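
The loop over the XML tree can be tried without a file on disk. The sketch below parses a made-up XML string with etree.fromstring. The element names mirror the ones used above, but the document structure and the records are assumptions. It uses findtext, which returns None for a missing child, whereas node.find(...).text would raise an AttributeError.

```python
import xml.etree.ElementTree as etree

# Hypothetical XML content; the second record has no <birthday> element
xml_text = """
<people>
    <person>
        <name>John</name>
        <phonenumber>555-0100</phonenumber>
        <birthday>1990-01-15</birthday>
    </person>
    <person>
        <name>Mary</name>
        <phonenumber>555-0101</phonenumber>
    </person>
</people>
""".strip()

root = etree.fromstring(xml_text)

rows = []
for node in root:
    # findtext returns the child's text, or None when the child is absent
    rows.append([
        node.findtext("name"),
        node.findtext("phonenumber"),
        node.findtext("birthday"),
    ])

print(rows)
# [['John', '555-0100', '1990-01-15'], ['Mary', '555-0101', None]]
```

The resulting list of rows can be passed to pd.DataFrame(rows, columns = columns) to obtain a DataFrame with one line per record.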