DateTime versus TIMESTAMP - jbrucker/home-log GitHub Wiki

When defining columns for a database table using either SqlAlchemy Core or SqlAlchemy ORM, a field that contains a date+time value can be stored as either DateTime or TIMESTAMP. Further, a TIMESTAMP can include a timezone or be timezone-unaware.

What effect does this have on the database table schema? What effect on validation and type mapping using Pydantic?

DateTime (SQLAlchemy)

  • Database Mapping:
    • PostgreSQL: TIMESTAMP WITHOUT TIME ZONE
    • MySQL: DATETIME
    • SQLite: DATETIME (stored as TEXT/ISO8601 string)
  • Behavior:
    • Stores date + time without timezone awareness.
    • Relies on application logic to handle timezones.
  • Example:
    from sqlalchemy import DateTime
    from datetime import datetime
    
    class Measurement(Base):
        timestamp = Column(DateTime, default=datetime.utcnow)  # Application-managed time
    

TIMESTAMP (SQLAlchemy)

  • Database Mapping:
    • PostgreSQL: TIMESTAMP WITH TIME ZONE aka TIMESTAMPTZ if timezone=True, else TIMESTAMP WITHOUT TIME ZONE
    • MySQL: TIMESTAMP (timezone-naive, converts to server timezone)
    • SQLite: DATETIME (no timezone support)
  • Behavior:
    • PostgreSQL: With timezone=True, stores timezone-aware timestamps (recommended).
    • MySQL: Automatically converts to server timezone on write/read.
  • Example:
    from sqlalchemy import TIMESTAMP, func
    
    class Measurement(Base):
        # PostgreSQL: TIMESTAMP WITH TIME ZONE
        timestamp = Column(TIMESTAMP(timezone=True), default=func.now())
    

Key Differences

Feature DateTime TIMESTAMP(timezone=True) (PostgreSQL)
Timezone Handling None (application must manage) Database-managed
Storage Format Raw date/time (no TZ) UTC (converted on write/read)
Portability Consistent across dialects PostgreSQL-specific behavior
Recommended Use Case Simple apps with controlled TZ logic Distributed systems (e.g., cloud apps)

Impact on Database Schema

  1. PostgreSQL:

    • DateTimetimestamp column (no timezone).
    • TIMESTAMP(timezone=True)timestamptz column (with timezone).
    -- SQLAlchemy generates:
    CREATE TABLE measurements (
        id SERIAL PRIMARY KEY,
        timestamp TIMESTAMP,           -- DateTime
        timestamp_tz TIMESTAMPTZ       -- TIMESTAMP(timezone=True)
    );
    
  2. MySQL:

    • DateTimeDATETIME (no auto-conversion).
    • TIMESTAMP → Converts to server timezone on write/read.
  3. SQLite:

    • Both stored as strings (no native time handling).

Example Migration Using Alembic

# For TIMESTAMP with timezone (PostgreSQL)
op.add_column('measurements', sa.Column('timestamp', sa.TIMESTAMP(timezone=True)))

Effect on Validation and Type Mapping using Pydantic

1. Timezone Handling in Validation

SQLAlchemy Type Pydantic Behavior Validation Consideration
DateTime Treats input as timezone-naive by default. Must manually validate/convert timezones in Pydantic (datetime vs datetime_naive).
TIMESTAMP(timezone=True) Expects timezone-aware input (e.g., 2024-01-01T12:00:00+00:00). Pydantic's datetime field automatically validates timezone awareness.

Example Pydantic Model:

from pydantic import BaseModel, field_validator
from datetime import datetime

class MeasurementBase(BaseModel):
    # For SQLAlchemy DateTime (timezone-naive)
    timestamp_naive: datetime  # Will raise error if timezone-aware input is provided

    # For SQLAlchemy TIMESTAMP(timezone=True)
    timestamp_tz: datetime     # Accepts only timezone-aware inputs

    @field_validator('timestamp_naive')
    def validate_naive(cls, v):
        if v.tzinfo is not None:
            raise ValueError("Timestamp must be timezone-naive")
        return v

2. Database vs. Pydantic Type Mapping

Database Type Pydantic Field Behavior
TIMESTAMP WITHOUT TIME ZONE datetime Accepts naive or timezone-aware but strips TZ (may cause silent data loss).
TIMESTAMP WITH TIME ZONE datetime Requires timezone-aware input; converts to UTC on validation.

Example:

# SQLAlchemy Model
class Measurement(Base):
    timestamp_naive = Column(DateTime)          # PostgreSQL: TIMESTAMP WITHOUT TIME ZONE
    timestamp_tz = Column(TIMESTAMP(timezone=True))  # PostgreSQL: TIMESTAMPTZ

# Pydantic Model
class MeasurementCreate(BaseModel):
    timestamp_naive: datetime       # Will strip timezone if provided
    timestamp_tz: datetime          # Requires timezone (e.g., "2024-01-01T12:00:00Z")

3. Default Value Handling

SQLAlchemy Pydantic Behavior
DateTime(default=func.now()) Pydantic ignores DB defaults; must set default_factory=datetime.utcnow in schema.
TIMESTAMP(timezone=True) Pydantic requires explicit timezone in defaults (e.g., datetime.now(timezone.utc)).

Example:


class MeasurementCreate(BaseModel):
    # For DateTime (naive)
    timestamp_naive: datetime = Field(default_factory=datetime.utcnow)

    # For TIMESTAMP(timezone=True)
    timestamp_tz: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

Recommended Approach

  1. Use TIMESTAMP(timezone=True) + Pydantic datetime

    • Best for consistency in distributed systems.
    • Pydantic will enforce timezone awareness:
      class MeasurementCreate(BaseModel):
          timestamp: datetime  # Must include timezone (e.g., "2024-01-01T12:00:00+00:00")
      
  2. If using DateTime (naive)

    • Add explicit Pydantic validation:
      @field_validator('timestamp')
      def ensure_naive(cls, v):
          if v.tzinfo is not None:
              raise ValueError("Database stores naive timestamps; remove timezone")
          return v.replace(tzinfo=None)
      
  3. For Flexibility
    Use a Union type with conversion:

    from typing import Union
    from datetime import datetime, timezone
    
    class MeasurementCreate(BaseModel):
        timestamp: Union[datetime, str]
    
        @field_validator('timestamp')
        def normalize_timestamp(cls, v):
            if isinstance(v, str):
                v = datetime.fromisoformat(v)
            if v.tzinfo:  # Convert to UTC then naive
                return v.astimezone(timezone.utc).replace(tzinfo=None)
            return v
    

Effect on OpenAPI Docs

Type OpenAPI Schema
DateTime (naive) type: string, format: date-time
TIMESTAMP(timezone) Same, but validation requires TZ info.

Summary

  • TIMESTAMP(timezone=True) + Pydantic datetime provides strongest timezone safety.
  • DateTime requires manual TZ stripping in Pydantic to match SQLAlchemy's naive storage.

If your app serves multiple timezones, prioritize TIMESTAMP(timezone=True). For simplicity (e.g., single-timezone IoT devices), DateTime may suffice with proper validation.

Recommendation

  • Use TIMESTAMP(timezone=True) if:

    • Your app serves users in multiple timezones.
    • You want the database to handle timezone conversions.
  • Use DateTime if:

    • All timestamps are in a known timezone (e.g., UTC-only).
    • You need database portability.
    • You perform timezone conversion (i.e. convert to UTC time) in your application.
  • Always document timezone expectations in your API schema.

Best Practices

  1. Consistency: Stick to one type per project.
  2. UTC Storage: Always store timestamps in UTC (default=func.now() or datetime.utcnow).
  3. Indexing: (Depending on application need) Add indexes for time-range queries:
    __table_args__ = (Index('idx_measurement_time', 'timestamp'),)