ORM Models and Schema Models - jbrucker/home-log GitHub Wiki
ORM Models describe the structure of entity objects that are persisted to database or other persistent storage. SqlAlchemy defines the syntax for ORM Models.
Schema Models describe how entity objects are represented. Schema models also provide validation rules and serialization/deserialization. Pydantic defines the syntax for Schema models.
You can Populate a ORM Model from a Schema Model and vice versa.
Referenced Objects and Reverse Relationships, how to define an attribute for an object referenced by a foreign key, and how to create a corresponding reverse relationship. By default, SqlAlchemy uses lazy instantiation but you can require eager instantiation either in the schema or as an option in a query.
Validation can be done by Pydantic or SqlAlchemy.
ORM Table Models
The SqlAlchemy 2.0 official style adds annotated type hints to database-mapped attributes:
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy import Integer, String
from sqlalchemy.ext.async import AsyncAttrs
class Base(AsyncAttrs, DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
username: Mapped[str] = mapped_column(String(50), nullable=False)
class UserPassword(Base):
__tablename__ = "user_passwords"
hashed_password: Mapped[str] = mapped_column(String, nullable=False)
user_id: Mapped[int] = mapped_column(ForeignKey("users.id", ondelete="CASCADE"), nullable=False)
Inferred values can be omitted, such as Integer for primary key(in code above).
You can completely omit mapped_column and accepted inferred or default values, as in:
username: Mapped[str] # inferred column type is `String`
Explanation
Mapped[int]is a generic marker meaning "this is an ORM-mapped attribute"mapped_column(...)creates a Column object under the hood, but designed to work with Python typing.- in
mapped_columnit is not necessary to specify the datatype if it can be inferred from the type hint. - the entire
mapped_columncan be omitted if you want the default datatype and inferred properties, i.e.username: Mapped[str]defaults tomapped_column('String', nullable=True) - specify the data type if you need to add detail (
String(50)orDateTime(timezone=True)), disambiguate, or add options:created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False)
Old-Style ORM Models
Prior to SqlAlchemy 2.0, ORM table models were written as:
Base = declarative_base()
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True, nullable=False)
username = Column(String(50), unique=True, nullable=False)
Schema Models
You can have multiple Pydantic schema models for one persistence model. Each schema provides a subset of attributes for a particular purpose.
import pydantic
from pydantic import EmailStr
from datetime import datetime, timezone
class UserCreate(pydantic.BaseModel):
"""User attributes that are given to a service endpoint to create a new User entity."""
email: EmailStr
username: Optional[str] = None
class User(UserCreate):
"""The complete User schema."""
id: int
# In model classes, these default to current time
created_at: datetime = datetime.now(timezone.utc)
updated_at: datetime = datetime.now(timezone.utc)
# model_config replaces the Config inner-class in Pydantic 2.0
model_config = ConfigDict(from_attributes=True)
or:
model_config = ConfigDict(from_attributes=True, model_class="User")
Populate a ORM Model from a Schema Model
Any of these techniques can be used.
1. Use **user_data.model_dump() (Pydantic v2) or **user_data.dict() (v1 or v2)
def save_user(user_data: schemas.UserCreate):
user = models.User(**user_data.model_dump())
- This assumes attribute names in
UserCreateschema match those inUsermodel.
2. Explicit Assignment of Attributes
def save_user(user_data: schemas.UserCreate):
user = models.User(
username=user_data.username,
email=user_data.email
)
- Requires manual updating if new attributes are added to model.
3. Factory method in User model or UserCreate schema
Define your own method to perform the conversion. Models should not depend on schema, so put the method in UserCreate schema or a separate factory class:
def save_user(user_data: schemas.UserCreate):
user = user_data.as_model()
# schemas class
class UserCreate(BaseModel):
def as_model(self) -> models.User:
return models.User(username=self.username, email=self.email)
Validation
Validation is done by Pydantic.
-
Schema classes automatically apply validation rules when you create a new schema object, but not if you assign a new value to an object.
import schemas user = schema.UserCreate(username="Santa", email="[email protected]") # but doesn't validate email here: user.email = "santa@" -
**
model_validate(obj)** class method validates the parameter.obj` can be a model, a dict, or another schema object and returns a new Pydantic model instance.import schemas user_in = schemas.UserCreate(username="Santa", email="[email protected]") # validate & create a different schema object (User) user = schemas.User.model_validate(user_in) data = {'username': 'harry', 'email': 'hackers@com'} user = schemas.User.model_validate(data) # raises ValidationError because 'email' is malformed -
model_validate_json(json_data)validate JSON data (against a schema class) and returns an instance of the schema class.
Update an Existing Model Instance from a Pydantic Schema instance
Suppose user is an existing SqlAlchemy model (models.User) and user_data is a schema instance (schemas.User).
To update only explicitly set fields:
update_data = user_data.model_dump(exclude_unset=True)
for field, value in update_data.items():
setattr(user, field, value)
Referenced Objects and Reverse Relationships
A DataSource model contains a owner_id foreign key referencing the primary key of a User:
class DataSource(Base):
"""A source of data values, such as a meter or sensor."""
__tablename__ = "data_sources"
id: Mapped[int] = mapped_column(primary_key=True, nullable=False)
name: Mapped[str] = mapped_column(String(MAX_NAME), nullable=False)
owner_id: Mapped[int] = mapped_column(
Integer,
# Poor Abstraction: "users.id" instead of User.id
ForeignKey("users.id", ondelete="SET NULL"),
nullable=True
)
You can define an attribute for the object referenced by owner_id:
class DataSource(Base):
owner: Mapped[User] = relationship("User")
By default, SqlAlchemy uses lazy instantiation so that this query:
source_id = 1
source = await session.get(DataSource, source_id)
does not instantiate the source.owner reference. owner is lazily instantiated when referenced in code:
owner_name = source.owner.name # owner (User) is instantiated
this can cause an async error ("Missing Greenlet") or session expired error if done outside the scope of the session.
Eager Instantiation in Query
To request that owner be populated in the query for DataSource, use:
source = await session.get(DataSource, source_id,
options=[joinedload(DataSource.owner)])
joinedload uses a JOIN so that both objects are fetched in a single query.
To do the same thing is a select statement:
stmt = select(DataSource).options(joinedload(DataSource.owner)).where(DataSource.id == source_id)
result = await session.execute(stmt)
source = result.scalar_one_or_none()
An alternative to joinedload is selectinload, which uses 2 queries but may be better for large 1-to-many collections.
Reverse Relationships
To define a synchronized, 2-way association use one of these options in the relationship:
backpopulates- an explicit, 2-sided definitionbackref- 1-side defined both ends of the relationship
Both of these can be used with other options for lazy/eager instantiation, cascading (deletion), 1-to-many, and whether each end should be a list or single value.
Reverse Relationship using back_populates
Explicitly define each end of the association. Each end may have different options.
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
...
data_sources: Mapped[list[DataSource]] = relationship(
"DataSource", # can omit class name if it can be inferred.
backpopulates="user",
cascade="all, delete-orphan",
lazy="select" # lazily load objects when accessed
)
class DataSource(Base):
__tablename__ = "data_sources"
id: Mapped[int] = mapped_column(primary_key=True)
owner_id: Mapped[int] = mapped_column(
Integer,
ForeignKey("users.id", ondelete="SET NULL"),
nullable=True
)
owner = relationship("User",
back_populates="data_sources",
lazy="joined" # use JOIN to eagerly fetch related object
)
Reverse Relationship using backref
One end declares and controls both ends of the relationship.
from sqlalchemy.orm import backref
class User(Base):
__tablename__ = 'users'
id = mapped_column(Integer, primary_key=True)
data_sources = relationship("DataSource", backref="user")
This automatically creates the user attribute on UserPassword.
If you want to customize the other side, use backref(...) with parameters:
data_sources = relationship(
"Datasource",
backref=backref("user", lazy="joined", cascade="all, delete")
)
How back_populates and backref differ:
| Characteristic | back_populates |
backref |
|---|---|---|
| Explicit declaration | Declare relationships on both sides | Declare both sides in one place |
| Control | More control; each side has its own config | Both sides share config |
| Clarity | Explicit; easier to read in large schemas | More concise for simple relationships |
| Customization | Each side can have independent cascade, lazy, etc. |
Must use backref() to customize other side |
Options for back_populates and backref
-
lazy=- loading strategy affects how related objects are loadedOption Behavior "select"Default. Loads related objects on access (lazy load). "joined"Uses JOINto eagerly load related object in same query."selectin"Loads related object(s) with a separate SELECT ... INquery."immediate"Loads the related objects as soon as the parent is loaded (rarely used). "noload"Does not load the relationship at all (returns empty list or None)."raise"Raises an error if the relationship is accessed (used to enforce manual loading). -
cascade=- how operations on parent affect the related childrenCascade Rule Meaning "save-update"Propagate session.add()and updates."merge"Merge operations will cascade. "expunge"Removing parent from session removes children too. "delete"Deleting parent will also delete the children (if not orphaned). "delete-orphan"If a child is removed from the collection, it will be deleted. "all"Includes all except "delete-orphan"."all, delete-orphan"Typical for one-to-many where orphaned children should be deleted. -
uselist=controls where relationship returns a list or scalar. Normallyuselist=Falseneeded only for 1-to-1 relationships. It is inferred on the "many" end of many-to-one relationships.Setting Behavior uselist=True(default)Returns a list (for one-to-many) uselist=FalseReturns a scalar (for one-to-one or many-to-one)