Email - Noloquideus/fastapi-template GitHub Wiki

Email Validation Best Practices

A comprehensive guide to proper email validation that respects standards and user experience.

The Problem

Picture this: You find an amazing service, register an account, enter your email [email protected] (because you love organized inbox management), and... get an error "Invalid email format". Sound familiar?

This isn't just a minor bug. It's a symptom of a deeper problem: development practices built on a misunderstanding of basic standards. Every time your service rejects a valid email containing a "+" symbol, you're shutting the door on technically savvy users who rely on it.

Understanding Email Sub-addressing

RFC 5322 Standard

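RFC 5322 explicitly allows the + symbol in the local part of an email address (the part before the @ symbol). Treating the text after the + as a tag is a convention called sub-addressing or "plus addressing"; it is implemented by mail providers rather than required by RFC 5322 itself.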

How It Works

The magic is simple: when a mail server receives an email sent to [email protected], it should deliver it to the mailbox [email protected], ignoring the +tag part.

Example: All emails sent to:

...will be delivered to one mailbox: [email protected]
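The delivery rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular mail server implements it; the helper name `split_subaddress` and the `example.com` addresses are hypothetical.

```python
from typing import Optional, Tuple


def split_subaddress(address: str) -> Tuple[str, Optional[str]]:
    """Split an address into (delivery mailbox, tag).

    Hypothetical helper: mimics a server that ignores everything
    after the first '+' in the local part when choosing a mailbox.
    """
    local_part, _, domain = address.rpartition("@")
    base, plus, tag = local_part.partition("+")
    # tag is None when the address contains no '+' at all
    return f"{base}@{domain}", (tag if plus else None)


# Illustrative addresses: every variant resolves to the same mailbox.
for addr in ("alice@example.com", "alice+shop@example.com", "alice+news@example.com"):
    mailbox, tag = split_subaddress(addr)
    print(f"{addr} -> {mailbox} (tag: {tag})")
```

The tag is returned separately because it is what mail filters and leak tracking rely on.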

Why Users Love It

  1. Automatic Filtering: Set up email rules to automatically sort emails with specific tags
  2. Spam Tracking: Register with [email protected] and know exactly who leaked your data
  3. Organization: Use different tags for different services while maintaining one primary email

Wide Support

This feature is supported by many major email providers, including:

  • Gmail (Google)
  • Outlook (Microsoft)
  • iCloud (Apple)
  • ProtonMail
  • Yahoo Mail
  • And many others

Millions of users worldwide actively use this feature. When your service says their email is "invalid", you're turning away potential customers.

Why Most Validations Fail

1. Frontend Validation with Lazy RegEx (90% of cases)

The most common issue: developers get a task to "validate email", go to Google, copy the first regex pattern they find, and paste it into the code.

❌ Bad RegEx:

// Somewhere in registration form code
const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;

if (!emailRegex.test(emailInput.value)) {
  showError("Enter a valid email!");
}

Notice what's missing? The + symbol. Result: instant error in the UI and frustrated user.
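To see the failure without opening a browser, the same pattern can be ported to Python's `re` module. This is a small sketch; the sample addresses are hypothetical and chosen only for illustration.

```python
import re

# Python port of the restrictive pattern shown above (same character classes).
RESTRICTIVE = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$")

# Hypothetical sample addresses used only for illustration.
samples = [
    "alice@example.com",         # plain address
    "alice+shop@example.com",    # plus addressing
    "alice@example.technology",  # TLD longer than six letters
]

for address in samples:
    accepted = RESTRICTIVE.fullmatch(address) is not None
    print(f"{address}: {'accepted' if accepted else 'rejected'}")
```

Only the first address is accepted. The `{2,6}` limit on the top-level domain is a second flaw in the same pattern: it rejects longer TLDs as well.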

2. Backend Duplication

Even when frontend validation is bypassed through DevTools, the backend often applies the same broken RegEx, usually in the name of security. Server-side validation is important, but incorrect validation simply blocks legitimate users.

3. Misguided Abuse Prevention

Some managers and developers consciously block + symbols, thinking it's a security measure.

Their logic: "If we allow plus signs, one person can create 100 accounts with one email and abuse our new user promos!"

Why this logic is flawed:

  1. Gmail dot trick: Gmail ignores dots in addresses. [email protected], [email protected], and [email protected] are the same mailbox. Will you block dots too?

  2. Temporary email services: Dozens of disposable email services exist, and they make multi-account abuse far easier than plus addressing does.

  3. You're punishing everyone: Trying to stop a handful of bad actors, you create problems for thousands of legitimate customers.

4. Database Uniqueness Issues

This is a more subtle issue. Imagine your service allows registration with [email protected]. What happens if the same user tries to register with [email protected]?

From the system's perspective, these are two different emails. From the user's perspective, it's one. If email is a unique key in your database, you'll allow creating two accounts tied to one real mailbox.

Solution: Email normalization before database storage.
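A minimal sketch of that idea, using only the standard library's `sqlite3`: the original address is stored for communication, and a UNIQUE constraint sits on the normalized form. The table layout, the `register` helper, the simplified `normalize_email`, and the sample addresses are all assumptions made for this illustration.

```python
import sqlite3

GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def normalize_email(email: str) -> str:
    """Simplified normalization: lowercase; for Gmail, drop the tag and the dots."""
    local_part, _, domain = email.strip().lower().rpartition("@")
    if domain in GMAIL_DOMAINS:
        local_part = local_part.split("+")[0].replace(".", "")
    return f"{local_part}@{domain}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (email TEXT NOT NULL, normalized_email TEXT NOT NULL UNIQUE)"
)


def register(email: str) -> bool:
    """Insert a user; return False if the normalized address is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (email, normalized_email) VALUES (?, ?)",
                (email, normalize_email(email)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(register("john.doe@gmail.com"))       # first registration
print(register("johndoe+promo@gmail.com"))  # same mailbox, different spelling
```

The second call is refused because both spellings normalize to the same value. Letting the database constraint make the final decision also covers the case where two requests arrive at the same time.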

5. "I Didn't Know"

Many developers simply don't know about sub-addressing. They write code based on their experience, and in their world, email is just [email protected].

Common Mistakes

❌ Broken RegEx Patterns

// TOO RESTRICTIVE - blocks valid emails
/^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/

// STILL BAD - doesn't handle all valid characters
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

❌ Inconsistent Validation

# Frontend allows +, backend blocks it
# Or vice versa - creates confusing user experience

❌ No Email Normalization

# Storing raw emails without normalization
# Leads to duplicate accounts for same user
[email protected] -> "[email protected]"
[email protected] -> "[email protected]"

Proper Implementation

✅ Better RegEx (But Not Perfect)

// Better, but still not perfect
const betterEmailRegex = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;
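Why "not perfect"? A quick check with a Python port of the pattern makes the remaining gaps visible. The sample addresses below are hypothetical.

```python
import re

# Python port of the "better" pattern shown above.
BETTER = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
)

# Hypothetical sample addresses used only for illustration.
samples = [
    "alice+shop@example.com",  # plus addressing
    "alice@example",           # no dot in the domain
    ".alice@example.com",      # leading dot in the local part
]

for address in samples:
    print(address, "->", bool(BETTER.match(address)))
```

All three samples match. The pattern now accepts plus addressing, but it also lets through a domain without a TLD and a local part that starts with a dot, which is one reason a maintained library is usually the safer choice.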

✅ Use Validated Libraries

JavaScript:

// Using validator.js
import validator from 'validator';

function isValidEmail(email) {
  return validator.isEmail(email);
}

Python (Pydantic):

from pydantic import BaseModel, EmailStr

class UserRegistration(BaseModel):
    email: EmailStr  # Validated via the email-validator package (install with: pip install "pydantic[email]")
    username: str

Python (email-validator):

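from email_validator import validate_email, EmailNotValidError

def validate_user_email(email: str) -> bool:
    """Return True if email-validator accepts the address."""
    try:
        # Raises EmailNotValidError if the address is invalid.
        # By default this also checks that the domain can receive mail (DNS lookup);
        # pass check_deliverability=False to validate syntax only.
        validate_email(email)
        # The return value carries the normalized address as .normalized
        # (.email in email-validator releases before 2.0)
        return True
    except EmailNotValidError:
        return False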

✅ Email Normalization

def normalize_email(email: str) -> str:
    """
    Normalize email to canonical form for database storage.
    Example: normalize_email('[email protected]') -> '[email protected]'
    """
    email = email.strip().lower()
    local_part, _, domain = email.rpartition('@')
    
    # Compare the exact domain: a substring test such as
    # "'@gmail.com' in email" would also match "user@gmail.com.example.org"
    if domain not in ('gmail.com', 'googlemail.com'):
        # For non-Gmail, just return lowercase
        return email
    
    # Remove everything after '+'
    local_part = local_part.split('+')[0]
    
    # Remove all dots for Gmail
    local_part = local_part.replace('.', '')
    
    return f"{local_part}@{domain}"

# Usage example
raw_email = "[email protected]"
canonical_email = normalize_email(raw_email)  # "[email protected]"
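Normalization rules differ between providers, so hard-coding Gmail in an `if` statement gets awkward as soon as a second provider is added. One possible alternative is a small rule table. This is a sketch under assumptions: `ProviderRule`, `PROVIDER_RULES`, and `normalize_with_rules` are hypothetical names, and the domains listed are illustrative rather than a verified list of provider behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderRule:
    strip_plus_tag: bool = False  # drop everything from the first '+'
    strip_dots: bool = False      # drop dots in the local part


# Assumption: illustrative rule table, to be checked against each provider's documentation.
PROVIDER_RULES = {
    "gmail.com": ProviderRule(strip_plus_tag=True, strip_dots=True),
    "googlemail.com": ProviderRule(strip_plus_tag=True, strip_dots=True),
    "outlook.com": ProviderRule(strip_plus_tag=True),
}


def normalize_with_rules(email: str) -> str:
    """Normalize an address using the rule registered for its exact domain."""
    local_part, _, domain = email.strip().lower().rpartition("@")
    rule = PROVIDER_RULES.get(domain, ProviderRule())  # unknown domains: lowercase only
    if rule.strip_plus_tag:
        local_part = local_part.split("+", 1)[0]
    if rule.strip_dots:
        local_part = local_part.replace(".", "")
    return f"{local_part}@{domain}"


print(normalize_with_rules("John.Doe+news@Gmail.com"))
print(normalize_with_rules("john.doe+news@example.org"))
```

Unknown domains are left untouched apart from lowercasing. That is the conservative default, because on a server without sub-addressing the tagged and untagged addresses may be two different mailboxes.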

✅ Complete FastAPI Implementation

from fastapi import HTTPException
from pydantic import BaseModel, EmailStr
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}

# Database model (defined before the service so the service can reference it)
class User(Base):
    __tablename__ = "users"
    
    id = Column(Integer, primary_key=True)
    email = Column(String, nullable=False)  # Original email for sending
    normalized_email = Column(String, unique=True, nullable=False, index=True)  # For uniqueness
    username = Column(String, nullable=False)
    password_hash = Column(String, nullable=False)

class UserRegistrationRequest(BaseModel):
    email: EmailStr
    username: str
    password: str

class UserService:
    def __init__(self, db: Session):
        self.db = db
    
    def normalize_email(self, email: str) -> str:
        """Normalize email for uniqueness checks."""
        email = email.strip().lower()
        local_part, _, domain = email.rpartition('@')
        
        # Handle Gmail/Google Mail special cases.
        # Compare the exact domain: a substring test would also match
        # lookalike domains such as "gmail.com.example.org".
        if domain in GMAIL_DOMAINS:
            # Remove everything after '+', then remove all dots
            local_part = local_part.split('+')[0].replace('.', '')
            email = f"{local_part}@{domain}"
        
        return email
    
    def register_user(self, user_data: UserRegistrationRequest) -> User:
        """Register new user with proper email handling."""
        # Synchronous on purpose: the SQLAlchemy Session used here is blocking.
        
        # Normalize email for uniqueness check
        normalized_email = self.normalize_email(user_data.email)
        
        # Check if normalized email already exists
        existing_user = self.db.query(User).filter(
            User.normalized_email == normalized_email
        ).first()
        
        if existing_user:
            raise HTTPException(
                status_code=400,
                detail="An account with this email already exists"
            )
        
        # Create user with both original and normalized email
        user = User(
            email=user_data.email,  # Store original for communication
            normalized_email=normalized_email,  # For uniqueness
            username=user_data.username,
            # hash_password is assumed to be defined elsewhere in the application
            password_hash=hash_password(user_data.password)
        )
        
        self.db.add(user)
        self.db.commit()
        return user

Business Impact

Lost Customers

Every user who can't register is:

  • Lost revenue
  • Missed lead
  • Potential negative word-of-mouth
  • Bad user experience impression

Technical Debt

Poor email validation creates:

  • Support tickets ("I can't register!")
  • Workarounds in code
  • Inconsistent data
  • Security vulnerabilities

Reputation Damage

Technical users notice these issues and may perceive your product as:

  • Poorly designed
  • Not following standards
  • Unprofessional
  • Not worth their time

Best Practices

For Developers

  1. Use Established Libraries

    # DON'T write your own email validation
    # DO use proven libraries like Pydantic's EmailStr
    
  2. Implement Email Normalization

    # Always normalize before uniqueness checks
    normalized = normalize_email(user_input)
    
  3. Store Both Versions

    # Original for communication, normalized for uniqueness
    user.email = original_email
    user.normalized_email = normalize_email(original_email)
    
  4. Test with Real Examples

    test_emails = [
        "[email protected]",
        "[email protected]", 
        "[email protected]",
        "[email protected]"
    ]
    

For QA Engineers

  1. Always Test Plus Addressing. Add the following to your registration/authentication checklist:

    • Email with + symbol
    • Multiple tags for same base email
    • Mixed case with + symbols
  2. Write Proper Bug Reports

    ❌ Bad: "Email with plus doesn't work"

    ✅ Good:

    Title: Registration form rejects valid email addresses with sub-addressing 
          (containing '+'), violating RFC 5322 standard
    
    Description: 
    - System incorrectly rejects emails like [email protected]
    - This blocks registration for Gmail/Outlook users using email organization
    - Violates RFC 5322 standard for email format
    - Affects user experience and potential conversions
    
    Expected: System should accept and process these addresses correctly
    

For Product Managers

  1. Don't Dismiss This as Minor. Each blocked user is lost revenue. Investment in fixing this is an investment in:

    • User experience
    • Audience growth
    • Technical credibility
  2. Prioritize Properly. This affects:

    • Registration conversion rates
    • Support ticket volume
    • Brand perception among technical users

Testing Checklist

Test Cases for Email Validation

# Valid emails that MUST work
valid_emails = [
    "[email protected]",
    "[email protected]",
    "[email protected]", 
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]"
]

# Edge cases
edge_cases = [
    "[email protected]",  # Plus at end
    "[email protected]",  # Double plus
    "[email protected]",  # Multiple tags
]

# Invalid emails that SHOULD be rejected
invalid_emails = [
    "@example.com",  # No local part
    "user@",  # No domain
    "user [email protected]",  # Space in local part
    "user@example",  # No TLD
]
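One way to put lists like these to work against any validator is a small harness that reports every disagreement instead of stopping at the first one. This is a sketch: `find_mismatches` and `naive_validator` are hypothetical names, and the sample addresses are illustrative.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    is_valid: Callable[[str], bool],
    should_pass: Iterable[str],
    should_fail: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (address, problem) pairs where the validator disagrees with the lists."""
    problems = []
    for address in should_pass:
        if not is_valid(address):
            problems.append((address, "rejected but should be accepted"))
    for address in should_fail:
        if is_valid(address):
            problems.append((address, "accepted but should be rejected"))
    return problems


# Deliberately naive validator (hypothetical) that forbids '+', to show a failure report.
def naive_validator(address: str) -> bool:
    return "@" in address and "+" not in address


report = find_mismatches(
    naive_validator,
    should_pass=["alice@example.com", "alice+shop@example.com"],
    should_fail=["alice.example.com"],
)
print(report)
```

Passing the validator in as a callable means the same lists can be run against a regex, a library wrapper, or the backend's own validation function.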

Automated Tests

import pytest
from your_app.validation import validate_email, normalize_email

class TestEmailValidation:
    
    @pytest.mark.parametrize("email", [
        "[email protected]",
        "[email protected]", 
        "[email protected]"
    ])
    def test_plus_addressing_accepted(self, email):
        """Plus addressing emails should be accepted."""
        assert validate_email(email)
    
    def test_email_normalization(self):
        """Email normalization should work correctly."""
        test_cases = [
            ("[email protected]", "[email protected]"),
            ("[email protected]", "[email protected]"),
            ("[email protected]", "[email protected]"),
        ]
        
        for input_email, expected in test_cases:
            assert normalize_email(input_email) == expected
    
    def test_duplicate_prevention(self):
        """Normalized emails should prevent duplicates."""
        email1 = "[email protected]"
        email2 = "[email protected]"
        
        normalized1 = normalize_email(email1)
        normalized2 = normalize_email(email2)
        
        assert normalized1 == normalized2  # Should be same for uniqueness

Implementation in FastAPI Template

Updated Models

# src/infrastructure/database/models/user.py
from sqlalchemy import Column, Integer, String, Boolean, DateTime
from sqlalchemy.sql import func
from .base import Base

class User(Base):
    __tablename__ = "users"
    
    id = Column(Integer, primary_key=True)
    email = Column(String, nullable=False)  # Original email
    normalized_email = Column(String, unique=True, nullable=False, index=True)
    username = Column(String, nullable=False)
    password_hash = Column(String, nullable=False)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, server_default=func.now())

Updated Schemas

# src/presentation/schemas/user.py
from pydantic import BaseModel, ConfigDict, EmailStr

class UserCreateRequest(BaseModel):
    email: EmailStr  # Pydantic handles RFC-compliant validation
    username: str
    password: str

class UserResponse(BaseModel):
    # Pydantic v2 style: from_attributes lets the model be built from ORM objects
    model_config = ConfigDict(from_attributes=True)
    
    id: int
    email: str  # Return original email to user
    username: str
    is_active: bool

Updated Service

# src/application/services/user_service.py
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from src.infrastructure.database.models.user import User
from src.presentation.schemas.user import UserCreateRequest

class UserService:
    def __init__(self, session: AsyncSession):
        self.session = session
    
    def normalize_email(self, email: str) -> str:
        """Normalize email for uniqueness checks."""
        email = email.strip().lower()
        local_part, _, domain = email.rpartition('@')
        
        # Compare the exact domain: a substring test would also match
        # lookalike domains such as "gmail.com.example.org"
        if domain in ('gmail.com', 'googlemail.com'):
            # Drop the '+tag' and the dots, which Gmail ignores
            local_part = local_part.split('+')[0].replace('.', '')
            email = f"{local_part}@{domain}"
        
        return email
    
    async def create_user(self, user_data: UserCreateRequest) -> User:
        """Create new user with proper email handling."""
        normalized_email = self.normalize_email(user_data.email)
        
        # Check uniqueness using normalized email
        existing_user = await self.session.execute(
            select(User).where(User.normalized_email == normalized_email)
        )
        
        if existing_user.scalar_one_or_none():
            raise ValueError("User with this email already exists")
        
        user = User(
            email=user_data.email,
            normalized_email=normalized_email,
            username=user_data.username,
            # hash_password is assumed to be implemented on this service (not shown here)
            password_hash=self.hash_password(user_data.password)
        )
        
        self.session.add(user)
        await self.session.commit()
        await self.session.refresh(user)
        
        return user

Conclusion

Email validation with plus addressing isn't just a technical detail; it's a fundamental aspect of user experience and respect for web standards.

Key takeaways:

  1. Respect RFC 5322: The standard exists for a reason
  2. Use proven libraries: Don't roll your own email validation
  3. Implement normalization: Handle uniqueness properly
  4. Test thoroughly: Include plus addressing in your test cases
  5. Think about users: Technical users notice and appreciate proper implementation

Every time you properly handle email validation, you're showing respect for your users and web standards. It's a small detail that makes a big difference in user experience and technical credibility.

