Email - Noloquideus/fastapi-template GitHub Wiki
Email Validation Best Practices
A comprehensive guide to proper email validation that respects standards and user experience.
Table of Contents
- The Problem
- Understanding Email Sub-addressing
- Why Most Validations Fail
- Common Mistakes
- Proper Implementation
- Business Impact
- Best Practices
- Testing Checklist
The Problem
Picture this: You find an amazing service, register an account, enter your email [email protected] (because you love organized inbox management), and... get an error "Invalid email format". Sound familiar?
This isn't just a minor bug. It's a symptom of a deeper problem: development practices built on a misunderstanding of basic standards. Every time your service rejects a valid email containing a "+" symbol, you're slamming the door in the face of technically savvy users.
Understanding Email Sub-addressing
RFC 5322 Standard
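RFC 5322 explicitly allows the + symbol in the local part of an email address (the part before the @ symbol). Treating everything after the + as a tag is a convention implemented by mail providers, not something RFC 5322 itself defines. That convention is called sub-addressing or "plus addressing".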
How It Works
The magic is simple: when a mail server that supports sub-addressing receives an email sent to [email protected], it delivers it to the mailbox [email protected], ignoring the +tag part.
Example: All emails sent to:
...will be delivered to one mailbox: [email protected]
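The delivery rule above can be sketched in a few lines of Python. This is only an illustration of the convention, not how any particular mail server implements it; the function name `split_subaddress` and the example.com addresses are hypothetical.

```python
from typing import Optional, Tuple


def split_subaddress(address: str, separator: str = "+") -> Tuple[str, Optional[str]]:
    """Split an address into (base mailbox, tag); tag is None when there is no separator."""
    local_part, _, domain = address.rpartition("@")
    base_local, found, tag = local_part.partition(separator)
    return f"{base_local}@{domain}", (tag if found else None)


# Hypothetical addresses, used only for illustration.
for addr in ("alice+shopping@example.com", "alice+news@example.com", "alice@example.com"):
    print(addr, "->", split_subaddress(addr))
```

All three addresses map to the same base mailbox; only the tag differs, which is what makes filtering and leak tracking possible.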
Why Users Love It
- Automatic Filtering: Set up email rules to automatically sort emails with specific tags
- Spam Tracking: Register with [email protected] and know exactly who leaked your data
- Organization: Use different tags for different services while maintaining one primary email
Wide Support
This feature is supported by most major email providers, including:
- Gmail (Google)
- Outlook (Microsoft)
- iCloud (Apple)
- ProtonMail
- Yahoo Mail
- And many others
Millions of users worldwide actively use this feature. When your service says their email is "invalid", you're turning away potential customers.
Why Most Validations Fail
1. Frontend Validation with Lazy RegEx (the most common case)
The most common issue: developers get a task to "validate email", go to Google, copy the first regex pattern they find, and paste it into the code.
❌ Bad RegEx:
// Somewhere in registration form code
const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;
if (!emailRegex.test(emailInput.value)) {
  showError("Enter a valid email!");
}
Notice what's missing? The + symbol. Result: instant error in the UI and frustrated user.
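To see the failure without a browser, here is a small sketch that ports the same pattern to Python's `re` module and runs it against a few hypothetical addresses. Besides the missing +, it shows a second limitation of this pattern: the `{2,6}` quantifier rejects top-level domains longer than six letters.

```python
import re

# Python port of the restrictive pattern shown above (no '+' in the local part).
RESTRICTIVE = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$")


def rejected_by_restrictive(address: str) -> bool:
    """Return True when the restrictive pattern refuses the address."""
    return RESTRICTIVE.match(address) is None


# Hypothetical addresses, used only for illustration.
samples = [
    "alice@example.com",         # plain address
    "alice+news@example.com",    # plus addressing
    "carol@example.technology",  # TLD longer than six letters
]
for sample in samples:
    verdict = "rejected" if rejected_by_restrictive(sample) else "accepted"
    print(f"{sample}: {verdict}")
```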
2. Backend Duplication
Frontend validation can be bypassed through DevTools, so the backend has to validate as well, and it often reuses the same broken RegEx. Server-side validation matters for security, but incorrect validation simply blocks legitimate users.
3. Misguided Abuse Prevention
Some managers and developers consciously block + symbols, thinking it's a security measure.
Their logic: "If we allow plus signs, one person can create 100 accounts with one email and abuse our new user promos!"
Why this logic is flawed:
- Gmail dot trick: Gmail ignores dots in addresses. [email protected], [email protected], and [email protected] are the same mailbox. Will you block dots too?
- Temporary email services: Dozens of disposable email services exist that solve abuse much more effectively.
- You're punishing everyone: Trying to stop a handful of bad actors, you create problems for thousands of legitimate customers.
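The dot trick is worth quantifying. Assuming, as described above, that Gmail ignores dots in the local part, a local part of n characters has 2^(n-1) dotted spellings that all reach the same mailbox. The sketch below enumerates them; the helper name `gmail_dot_variants` is hypothetical.

```python
from itertools import product
from typing import List


def gmail_dot_variants(local_part: str) -> List[str]:
    """List every way to place single dots between the characters of local_part."""
    chars = local_part.replace(".", "")
    if not chars:
        return []
    variants = []
    # Each of the len(chars) - 1 gaps between characters either gets a dot or not.
    for gaps in product(("", "."), repeat=len(chars) - 1):
        variants.append("".join(c + g for c, g in zip(chars, gaps + ("",))))
    return variants


print(gmail_dot_variants("abc"))         # ['abc', 'ab.c', 'a.bc', 'a.b.c']
print(len(gmail_dot_variants("alice")))  # 16
```

Blocking + removes one aliasing mechanism while leaving this one untouched, which is why it does little against abuse.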
4. Database Uniqueness Issues
This is a more subtle issue. Imagine your service allows registration with [email protected]. What happens if the same user tries to register with [email protected]?
From the system's perspective, these are two different emails. From the user's perspective, it's one. If email is a unique key in your database, you'll allow creating two accounts tied to one real mailbox.
Solution: Email normalization before database storage.
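A minimal sketch of the idea, using only the standard library's `sqlite3`: keep the address the user typed, and put the uniqueness constraint on a normalized column. The `normalize` helper here is a deliberate simplification that strips the tag for every domain, which is an assumption made for the demo; the table layout and function names are hypothetical.

```python
import sqlite3


def normalize(address: str) -> str:
    """Simplified normalizer (assumption): lowercase and drop the +tag for any domain."""
    local_part, _, domain = address.strip().lower().rpartition("@")
    return f"{local_part.split('+', 1)[0]}@{domain}"


def register(conn: sqlite3.Connection, address: str) -> bool:
    """Insert a user; return False when the normalized address is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (email, normalized_email) VALUES (?, ?)",
                (address, normalize(address)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "email TEXT NOT NULL, "
    "normalized_email TEXT NOT NULL UNIQUE)"
)

print(register(conn, "alice+shop@example.com"))  # True
print(register(conn, "alice+news@example.com"))  # False: same mailbox after normalization
```

Letting the database enforce uniqueness also covers the case where two registration requests arrive at the same time, which a check-then-insert in application code cannot guarantee on its own.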
5. "I Didn't Know"
Many developers simply don't know about sub-addressing. They write code based on their experience, and in their world, email is just [email protected].
Common Mistakes
❌ Broken RegEx Patterns
// TOO RESTRICTIVE - blocks valid emails
/^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/
// STILL BAD - doesn't handle all valid characters
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
❌ Inconsistent Validation
# Frontend allows +, backend blocks it
# Or vice versa - creates confusing user experience
❌ No Email Normalization
# Storing raw emails without normalization
# Leads to duplicate accounts for same user
[email protected] -> "[email protected]"
[email protected] -> "[email protected]"
Proper Implementation
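✅ Better RegEx (But Not Perfect)
// Better, but still not perfect: it accepts the characters RFC 5322 allows in an
// unquoted local part, but it also accepts dotless domains such as user@example
const betterEmailRegex = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;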
✅ Use Validated Libraries
JavaScript:
// Using validator.js
import validator from 'validator';

function isValidEmail(email) {
  return validator.isEmail(email);
}
Python (Pydantic):
from pydantic import BaseModel, EmailStr  # EmailStr requires the email-validator package

class UserRegistration(BaseModel):
    email: EmailStr  # Syntax is checked by email-validator; plus addressing is accepted
    username: str
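Python (email-validator):
from email_validator import validate_email, EmailNotValidError

def validate_user_email(email: str) -> bool:
    try:
        # Syntax check only; by default the library also performs a DNS deliverability lookup
        result = validate_email(email, check_deliverability=False)
        # The normalized address is available as result.normalized
        # (email-validator 2.x; older releases exposed it as result.email)
        return True
    except EmailNotValidError:
        return False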
✅ Email Normalization
def normalize_email(email: str) -> str:
    """
    Normalize email to canonical form for database storage.
    Example: normalize_email('[email protected]') -> '[email protected]'
    """
    email = email.strip().lower()
    local_part, _, domain = email.rpartition('@')
    # Compare the whole domain: a substring test would also match
    # addresses such as user@gmail.com.example.org
    if domain not in ('gmail.com', 'googlemail.com'):
        # For non-Gmail, just return lowercase
        return email
    # Remove everything after '+'
    local_part = local_part.split('+', 1)[0]
    # Remove all dots for Gmail
    local_part = local_part.replace('.', '')
    return f"{local_part}@{domain}"

# Usage example
raw_email = "[email protected]"
canonical_email = normalize_email(raw_email)  # "[email protected]"
✅ Complete FastAPI Implementation
from fastapi import HTTPException
from pydantic import BaseModel, EmailStr
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Database model (defined first so the service below can reference it)
class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String, nullable=False)  # Original email for sending
    normalized_email = Column(String, unique=True, nullable=False, index=True)  # For uniqueness
    username = Column(String, nullable=False)
    password_hash = Column(String, nullable=False)

class UserRegistrationRequest(BaseModel):
    email: EmailStr
    username: str
    password: str

class UserService:
    def __init__(self, db: Session):
        self.db = db

    def normalize_email(self, email: str) -> str:
        """Normalize email for uniqueness checks."""
        email = email.strip().lower()
        local_part, _, domain = email.rpartition('@')
        # Handle Gmail/Google Mail special cases (match the whole domain, not a substring)
        if domain in ('gmail.com', 'googlemail.com'):
            # Remove everything after '+', then remove all dots
            local_part = local_part.split('+', 1)[0].replace('.', '')
            email = f"{local_part}@{domain}"
        return email

    def register_user(self, user_data: UserRegistrationRequest) -> User:
        """Register new user with proper email handling."""
        # Synchronous on purpose: a sync Session inside `async def` would block the event loop
        # Normalize email for uniqueness check
        normalized_email = self.normalize_email(user_data.email)
        # Check if normalized email already exists
        existing_user = self.db.query(User).filter(
            User.normalized_email == normalized_email
        ).first()
        if existing_user:
            raise HTTPException(
                status_code=400,
                detail="An account with this email already exists"
            )
        # Create user with both original and normalized email
        user = User(
            email=user_data.email,  # Store original for communication
            normalized_email=normalized_email,  # For uniqueness
            username=user_data.username,
            # hash_password is assumed to be defined elsewhere in the application
            password_hash=hash_password(user_data.password)
        )
        self.db.add(user)
        self.db.commit()
        return user
Business Impact
Lost Customers
Every user who can't register is:
- Lost revenue
- Missed lead
- Potential negative word-of-mouth
- Bad user experience impression
Technical Debt
Poor email validation creates:
- Support tickets ("I can't register!")
- Workarounds in code
- Inconsistent data
- Security vulnerabilities
Reputation Damage
Technical users notice these issues and may perceive your product as:
- Poorly designed
- Not following standards
- Unprofessional
- Not worth their time
Best Practices
For Developers
- Use Established Libraries
  # DON'T write your own email validation
  # DO use proven libraries like Pydantic's EmailStr
- Implement Email Normalization
  # Always normalize before uniqueness checks
  normalized = normalize_email(user_input)
- Store Both Versions
  # Original for communication, normalized for uniqueness
  user.email = original_email
  user.normalized_email = normalize_email(original_email)
- Test with Real Examples
  test_emails = [
      "[email protected]",
      "[email protected]",
      "[email protected]",
      "[email protected]"
  ]
For QA Engineers
- Always Test Plus Addressing
  Add to your registration/authentication checklist:
  - Email with + symbol
  - Multiple tags for same base email
  - Mixed case with + symbols
- Write Proper Bug Reports
  ❌ Bad: "Email with plus doesn't work"
  ✅ Good:
  Title: Registration form rejects valid email addresses with sub-addressing (containing '+'), violating RFC 5322 standard
  Description:
  - System incorrectly rejects emails like [email protected]
  - This blocks registration for Gmail/Outlook users using email organization
  - Violates RFC 5322 standard for email format
  - Affects user experience and potential conversions
  Expected: System should accept and process these addresses correctly
For Product Managers
- Don't Dismiss This as Minor
  Each blocked user is lost revenue. Investment in fixing this is investment in:
  - User experience
  - Audience growth
  - Technical credibility
- Prioritize Properly
  This affects:
  - Registration conversion rates
  - Support ticket volume
  - Brand perception among technical users
Testing Checklist
Test Cases for Email Validation
# Valid emails that MUST work
valid_emails = [
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]"
]
# Edge cases
edge_cases = [
    "[email protected]",  # Plus at end
    "[email protected]",  # Double plus
    "[email protected]",  # Multiple tags
]
# Invalid emails that SHOULD be rejected
invalid_emails = [
    "@example.com",  # No local part
    "user@",  # No domain
    "user [email protected]",  # Space in local part
    "user@example",  # No TLD
]
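Lists like these are most useful when they can be pointed at whichever validator a project actually uses. The sketch below is one possible harness: `find_mismatches` takes any callable that returns a boolean and reports every address whose result differs from the expectation. Both function names are hypothetical, and `toy_validator` exists only to make the demo runnable; it is not a recommended way to validate email.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    is_valid: Callable[[str], bool],
    should_pass: Iterable[str],
    should_fail: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (address, problem) pairs where the validator disagrees with the expectation."""
    mismatches = []
    for address in should_pass:
        if not is_valid(address):
            mismatches.append((address, "rejected but should be accepted"))
    for address in should_fail:
        if is_valid(address):
            mismatches.append((address, "accepted but should be rejected"))
    return mismatches


def toy_validator(address: str) -> bool:
    """Demo-only structural check: non-empty local part, dotted domain, no spaces."""
    local_part, at, domain = address.rpartition("@")
    return bool(at and local_part and "." in domain and " " not in address)


# Hypothetical addresses, used only for illustration.
problems = find_mismatches(
    toy_validator,
    should_pass=["user+tag@example.com", "user.name+tag@example.com"],
    should_fail=["@example.com", "user@", "user name@example.com", "user@example"],
)
print(problems)  # []
```

In a real project, `is_valid` would be the wrapper around the chosen library, and a non-empty result would fail the build.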
Automated Tests
import pytest
from your_app.validation import validate_email, normalize_email

class TestEmailValidation:
    @pytest.mark.parametrize("email", [
        "[email protected]",
        "[email protected]",
        "[email protected]"
    ])
    def test_plus_addressing_accepted(self, email):
        """Plus addressing emails should be accepted."""
        assert validate_email(email)

    def test_email_normalization(self):
        """Email normalization should work correctly."""
        test_cases = [
            ("[email protected]", "[email protected]"),
            ("[email protected]", "[email protected]"),
            ("[email protected]", "[email protected]"),
        ]
        for input_email, expected in test_cases:
            assert normalize_email(input_email) == expected

    def test_duplicate_prevention(self):
        """Normalized emails should prevent duplicates."""
        email1 = "[email protected]"
        email2 = "[email protected]"
        normalized1 = normalize_email(email1)
        normalized2 = normalize_email(email2)
        assert normalized1 == normalized2  # Should be same for uniqueness
Implementation in FastAPI Template
Updated Models
# src/infrastructure/database/models/user.py
from sqlalchemy import Column, Integer, String, Boolean, DateTime
from sqlalchemy.sql import func
from .base import Base

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String, nullable=False)  # Original email
    normalized_email = Column(String, unique=True, nullable=False, index=True)
    username = Column(String, nullable=False)
    password_hash = Column(String, nullable=False)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, server_default=func.now())
Updated Schemas
# src/presentation/schemas/user.py
from pydantic import BaseModel, ConfigDict, EmailStr

class UserCreateRequest(BaseModel):
    email: EmailStr  # Pydantic delegates validation to the email-validator package
    username: str
    password: str

class UserResponse(BaseModel):
    # Pydantic v2: allow building the response from ORM objects
    model_config = ConfigDict(from_attributes=True)
    id: int
    email: str  # Return original email to user
    username: str
    is_active: bool
Updated Service
# src/application/services/user_service.py
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from src.infrastructure.database.models.user import User
from src.presentation.schemas.user import UserCreateRequest

class UserService:
    def __init__(self, session: AsyncSession):
        self.session = session

    def normalize_email(self, email: str) -> str:
        """Normalize email for uniqueness checks."""
        email = email.strip().lower()
        local_part, _, domain = email.rpartition('@')
        # Match the whole domain; a substring test would also match look-alike domains
        if domain in ('gmail.com', 'googlemail.com'):
            local_part = local_part.split('+', 1)[0]
            local_part = local_part.replace('.', '')
            email = f"{local_part}@{domain}"
        return email

    async def create_user(self, user_data: UserCreateRequest) -> User:
        """Create new user with proper email handling."""
        normalized_email = self.normalize_email(user_data.email)
        # Check uniqueness using normalized email
        result = await self.session.execute(
            select(User).where(User.normalized_email == normalized_email)
        )
        if result.scalar_one_or_none():
            raise ValueError("User with this email already exists")
        user = User(
            email=user_data.email,
            normalized_email=normalized_email,
            username=user_data.username,
            # hash_password is assumed to be implemented elsewhere on this service
            password_hash=self.hash_password(user_data.password)
        )
        self.session.add(user)
        await self.session.commit()
        await self.session.refresh(user)
        return user
Conclusion
Email validation with plus addressing isn't just a technical detail; it's a fundamental aspect of user experience and respect for web standards.
Key takeaways:
- Respect RFC 5322: The standard exists for a reason
- Use proven libraries: Don't roll your own email validation
- Implement normalization: Handle uniqueness properly
- Test thoroughly: Include plus addressing in your test cases
- Think about users: Technical users notice and appreciate proper implementation
Every time you properly handle email validation, you're showing respect for your users and web standards. It's a small detail that makes a big difference in user experience and technical credibility.
See Also:
- API Development - Creating robust API endpoints
- Validation - Request/response validation patterns
- Testing Guide - Comprehensive testing strategies
- User Management - Complete user lifecycle management