File and Data Serialization - CameronAuler/python-devops GitHub Wiki

Data serialization converts structured data into a format that can be stored or transferred, such as CSV, JSON, XML, or Python's pickle format. This makes it possible to save data, share it, and load it again in other applications.

CSV Files

CSV (Comma-Separated Values) is a plain-text tabular format in which each line is a row and the values in a row are separated by commas. It is commonly used to exchange data between spreadsheets, databases, and other tools.

Reading CSV Files

Use case: Extracting data from spreadsheets, logs, and databases.

import csv

with open("data.csv", mode="r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is a list of values

Writing CSV Files

Use case: Saving structured data for easy storage and sharing.

import csv

data = [["Name", "Age"], ["Alice", 25], ["Bob", 30]]

with open("output.csv", mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)  # Write multiple rows
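One detail worth seeing in action is that CSV stores everything as text, so the integer ages written above come back as strings when the file is read. The following is a minimal sketch of that round trip. It uses an in-memory `io.StringIO` buffer instead of `output.csv` purely as an assumption to keep the example self-contained, and the names `buffer`, `read_back`, and `typed_rows` are illustrative.

```python
import csv
import io

rows = [["Name", "Age"], ["Alice", 25], ["Bob", 30]]

# Write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

# Rewind and read the same text back
buffer.seek(0)
read_back = list(csv.reader(buffer))
print(read_back)  # [['Name', 'Age'], ['Alice', '25'], ['Bob', '30']]

# Convert the Age column back to int, skipping the header row
typed_rows = [[name, int(age)] for name, age in read_back[1:]]
print(typed_rows)  # [['Alice', 25], ['Bob', 30]]
```

If the numeric types matter after loading, the conversion has to be done explicitly, as in the last step.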

Using CSV with Dictionaries

Use case: Handling CSV files with named columns.

import csv

with open("data.csv", mode="r", newline="") as file:
    reader = csv.DictReader(file)  # Uses the first row as the dictionary keys
    for row in reader:
        print(row["Name"], row["Age"])  # Assumes the header row contains Name and Age
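The writing counterpart to `csv.DictReader` is `csv.DictWriter`, which takes the column names up front and writes each dictionary as one row. This is a small sketch rather than a required pattern. The `people` list and the in-memory `buffer` are assumptions chosen so the example runs without an existing file.

```python
import csv
import io

people = [{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]

buffer = io.StringIO()  # stands in for a file opened with newline=""
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age"])
writer.writeheader()      # Write the header row from fieldnames
writer.writerows(people)  # Write one row per dictionary

# Read the rows back as dictionaries keyed by the header
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
for row in loaded:
    print(row["Name"], row["Age"])
```

To write to disk instead, the buffer can be replaced with `open("output.csv", mode="w", newline="")`, as in the earlier writing example.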

JSON Files

JavaScript Object Notation (JSON) is a human-readable format used for APIs and data storage. It supports nested structures (dictionaries & lists).

Reading JSON Files

Use cases: Processing API responses and structured configuration files.

import json

with open("data.json", "r", encoding="utf-8") as file:
    data = json.load(file)  # Parse JSON into Python objects (a dict for a JSON object, a list for an array)

print(data)

Writing JSON Files

Use cases: Saving structured data for configuration, APIs, and inter-service communication.

import json

data = {"name": "Alice", "age": 25}

with open("output.json", "w") as file:
    json.dump(data, file, indent=4)  # Save JSON with indentation

Converting Between JSON and Python Objects

Use cases: Converting between JSON strings and Python objects for storage, APIs, and web development.

import json

json_str = '{"name": "Bob", "age": 30}'
python_dict = json.loads(json_str)  # Convert JSON string to Python dictionary
print(python_dict["name"])

python_to_json = json.dumps(python_dict, indent=4)  # Convert Python dictionary to JSON string
print(python_to_json)
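The conversion is not always lossless, because JSON has fewer types than Python. The sketch below round-trips a hypothetical `record` to show two common surprises: tuples become lists, and non-string dictionary keys become strings.

```python
import json

record = {
    "name": "Alice",
    "tags": ("admin", "dev"),    # tuple
    "scores": {1: 9.5, 2: 7.0},  # integer keys
    "manager": None,             # becomes JSON null
}

encoded = json.dumps(record)
decoded = json.loads(encoded)

print(encoded)
print(decoded["tags"])    # ['admin', 'dev'] (a list, not a tuple)
print(decoded["scores"])  # {'1': 9.5, '2': 7.0} (keys are now strings)
print(decoded == record)  # False
```

When the exact Python types need to survive a round trip, they have to be restored manually after loading, or a Python-specific format such as pickle can be used instead.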

XML Files

XML (Extensible Markup Language) stores hierarchical data using tags. It is commonly used for web services, configurations, and structured data storage.

Parsing XML Data

Use case: Parsing data from XML-based web services and configurations.

import xml.etree.ElementTree as ET

xml_data = """<data>
    <person>
        <name>Alice</name>
        <age>25</age>
    </person>
</data>"""

root = ET.fromstring(xml_data)  # Parse XML string

for person in root.findall("person"):
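    name = person.find("name").text  # find() returns the first matching child element
    age = person.find("age").text  # .text is always a string, so this is "25", not 25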
    print(name, age)
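In the loop above, `find()` returns `None` when a child element is missing, so accessing `.text` on the result would raise an `AttributeError`. One way to handle optional elements is `findtext()`, which returns the text directly and accepts a default. The sketch below assumes a second, hypothetical `<person>` entry with no `<age>` element.

```python
import xml.etree.ElementTree as ET

xml_data = """<data>
    <person><name>Alice</name><age>25</age></person>
    <person><name>Bob</name></person>
</data>"""

root = ET.fromstring(xml_data)

people = []
for person in root.findall("person"):
    name = person.findtext("name", default="unknown")
    age_text = person.findtext("age")  # None when <age> is absent
    age = int(age_text) if age_text is not None else None
    people.append((name, age))

print(people)  # [('Alice', 25), ('Bob', None)]
```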

Writing XML Data

Use case: Saving structured hierarchical data in XML format.

import xml.etree.ElementTree as ET

root = ET.Element("data")  # Root element
person = ET.SubElement(root, "person")
ET.SubElement(person, "name").text = "Alice"
ET.SubElement(person, "age").text = "25"

tree = ET.ElementTree(root)
tree.write("output.xml")
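By default the tree is written on a single line with no indentation. As a sketch of how to check the result without opening the file, the same tree can be pretty-printed and serialized to a string, then parsed back. This assumes Python 3.9 or later, where `ET.indent()` is available.

```python
import xml.etree.ElementTree as ET

root = ET.Element("data")
person = ET.SubElement(root, "person")
ET.SubElement(person, "name").text = "Alice"
ET.SubElement(person, "age").text = "25"

ET.indent(root)  # Add newlines and indentation in place (Python 3.9+)
xml_text = ET.tostring(root, encoding="unicode")  # Serialize to a str
print(xml_text)

# Parse the string back to confirm the data survived the round trip
parsed = ET.fromstring(xml_text)
print(parsed.findtext("person/name"), parsed.findtext("person/age"))
```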

pickle (Object Serialization)

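The pickle module serializes Python objects into a Python-specific binary format. It can save complex objects such as dictionaries, lists, and instances of custom classes. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.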

Saving (Pickling) an Object

Use case: Saving Python objects for later use.

import pickle

data = {"name": "Alice", "age": 25}

with open("data.pkl", "wb") as file:  # "wb" mode for writing in binary
    pickle.dump(data, file)

Loading (Unpickling) an Object

Use case: Restoring objects from a saved state.

import pickle

with open("data.pkl", "rb") as file:  # "rb" mode for reading binary
    data = pickle.load(file)

print(data)  # Output: {'name': 'Alice', 'age': 25}

Pickling a Custom Class Object

Use case: Storing Python class instances for later use.

import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("Alice", 25)

# Save object
with open("person.pkl", "wb") as file:
    pickle.dump(person, file)

# Load object
with open("person.pkl", "rb") as file:
    loaded_person = pickle.load(file)

print(loaded_person.name, loaded_person.age)  # Alice 25
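Pickling does not require a file. `pickle.dumps()` and `pickle.loads()` work on `bytes` in memory, which is handy for caches or for sending objects between Python processes. The sketch below uses a hypothetical `record` that contains a tuple and a set, two types that JSON cannot represent directly, to show that pickle preserves them.

```python
import pickle

record = {"name": "Alice", "tags": ("admin", "dev"), "ids": {1, 2, 3}}

# Serialize to bytes using the newest protocol this Python version supports
blob = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

print(type(blob))          # <class 'bytes'>
print(restored == record)  # True
print(type(restored["tags"]), type(restored["ids"]))  # tuple and set are preserved
```

The trade-off is that the result is readable only by Python, and a pickle written with a newer protocol may not load on an older Python version.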

Data Format Use Cases

| Format | Best Use Case |
| --- | --- |
| CSV | Tabular data (spreadsheets, reports, logs) |
| JSON | API communication, config files, lightweight storage |
| XML | Hierarchical data (configurations, web services) |
| Pickle | Saving Python objects (dictionaries, classes) |