Regular Expressions - potatoscript/php GitHub Wiki

Regular Expressions in PHP

Overview

Regular expressions (regex) are powerful tools used to match patterns in strings. In PHP, regular expressions are used for searching, replacing, and validating text. The PHP preg_* functions are based on Perl-compatible regular expressions (PCRE).

In this section, we will cover:

  • Basics of Regular Expressions
  • Using Regular Expressions in PHP
  • Pattern Modifiers
  • Common Regular Expression Functions
  • Examples of Regular Expressions
  • Best Practices

Basics of Regular Expressions

A regular expression is a string pattern that describes a set of strings. Regular expressions are used for:

  • Pattern matching: Finding a match within a string.
  • Pattern replacement: Replacing parts of a string that match a specific pattern.
  • Pattern splitting: Breaking a string into parts based on a pattern.

Regular Expression Syntax

  • . - Matches any single character except newline.
  • ^ - Matches the beginning of a string.
  • $ - Matches the end of a string (by default also just before a final trailing newline).
  • [] - Matches any single character within the brackets (e.g., [a-z]).
  • | - Acts as a logical OR (e.g., abc|def).
  • * - Matches zero or more occurrences of the preceding element.
  • + - Matches one or more occurrences of the preceding element.
  • ? - Matches zero or one occurrence of the preceding element.
  • () - Groups patterns together (e.g., (abc)+).
  • \d - Matches any digit (equivalent to [0-9]).
  • \w - Matches any word character (letters, digits, and underscores).
  • \s - Matches any whitespace character (spaces, tabs, line breaks).
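
The syntax elements above can be tried out with a small table of test cases. This is a minimal sketch: the patterns and subject strings are illustrative choices, not taken from a real application, and the third column records whether a match is expected (1) or not (0).

```php
<?php
// Each entry: [pattern, subject, expected preg_match() result].
$cases = [
    ['/^c.t$/',      'cat',     1], // . matches any one character; ^ and $ anchor both ends
    ['/^c.t$/',      'cart',    0], // two characters sit between c and t, so no match
    ['/gr[ae]y/',    'grey',    1], // [ae] matches either a or e
    ['/cat|dog/',    'hotdog',  1], // | matches either alternative
    ['/^(ab)+$/',    'ababab',  1], // () groups "ab" so + repeats the whole group
    ['/colou?r/',    'color',   1], // ? makes the preceding u optional
    ['/^\w+\s\d*$/', 'item 42', 1], // word characters, one whitespace, zero or more digits
];

foreach ($cases as [$pattern, $subject, $expected]) {
    $result = preg_match($pattern, $subject);
    printf("%-14s %-8s => %d\n", $pattern, $subject, $result);
}
```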

Using Regular Expressions in PHP

PHP provides several functions to work with regular expressions, such as preg_match(), preg_replace(), preg_match_all(), and preg_split(). These functions operate on strings and are designed to work with patterns that follow the regular expression syntax.

Example: preg_match()

The preg_match() function searches a string for a pattern. It returns 1 if the pattern is found, 0 if it is not, and false if an error occurs.

<?php
$pattern = '/\d+/'; // Matches one or more digits (single quotes keep the backslash literal)
$string = "The year is 2025";

if (preg_match($pattern, $string)) {
    echo "Match found!";
} else {
    echo "Match not found.";
}
?>

Explanation:

  • /\d+/ is the regular expression pattern to match one or more digits.
  • preg_match() returns 1 if a match is found and 0 if not, so the result can be used directly in an if condition. A return value of false signals an error.
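
preg_match() can also report what it matched through an optional third argument. The sketch below assumes an illustrative pattern with one capture group; the variable names $found and $matches are arbitrary.

```php
<?php
$string = 'The year is 2025';

// The third argument receives the match details:
// index 0 is the full match, index 1 and up are the capture groups.
$found = preg_match('/year is (\d+)/', $string, $matches);

if ($found === 1) {
    echo $matches[0], "\n"; // prints "year is 2025"
    echo $matches[1], "\n"; // prints "2025"
} elseif ($found === false) {
    echo "The pattern could not be evaluated\n";
}
```

Comparing against 1 and false explicitly keeps "no match" and "error" apart, which a plain truthiness check would merge.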

Example: preg_replace()

The preg_replace() function replaces occurrences of a pattern in a string with a given replacement.

<?php
$pattern = '/\d+/'; // Matches one or more digits
$string = "The year is 2025";
$replacement = "XXXX";

$result = preg_replace($pattern, $replacement, $string);
echo $result; // Output: "The year is XXXX"
?>

Explanation:

  • /\d+/ matches any sequence of digits in the string.
  • preg_replace() replaces the matched digits with XXXX.
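
The replacement string can also refer back to capture groups with $1, $2, and so on. A minimal sketch, assuming an illustrative date string that should be reordered from year-month-day to day/month/year:

```php
<?php
$string = 'Released on 2025-02-14';

// Three capture groups: year, month, day.
// In the replacement, $3/$2/$1 puts them back in reverse order.
$result = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $string);

echo $result, "\n"; // prints "Released on 14/02/2025"
```

The replacement is written in single quotes so that PHP does not try to interpret $3 and friends itself.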

Example: preg_match_all()

The preg_match_all() function searches for all occurrences of a pattern in a string.

<?php
$pattern = '/\d+/'; // Matches one or more digits
$string = "The year is 2025 and the month is 02";

preg_match_all($pattern, $string, $matches);
print_r($matches); // Output: Array ( [0] => Array ( [0] => 2025 [1] => 02 ) )
?>

Explanation:

  • preg_match_all() finds all matches of the pattern and stores them in the array passed as the third argument. Its return value is the number of full matches found.
  • $matches[0] will contain every sequence of digits found in the string.
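
When a pattern has capture groups, the PREG_SET_ORDER flag groups the results by match rather than by group, which is often easier to loop over. A minimal sketch, assuming an illustrative key=value string:

```php
<?php
$string = 'width=300 height=200';

// The return value is the number of full matches, not the matches themselves.
$count = preg_match_all('/(\w+)=(\d+)/', $string, $matches, PREG_SET_ORDER);

echo "Found $count pairs\n"; // prints "Found 2 pairs"

// With PREG_SET_ORDER, each element of $matches is one complete match:
// [0] full match, [1] first group, [2] second group.
foreach ($matches as $match) {
    echo $match[1], ' is ', $match[2], "\n";
}
```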

Example: preg_split()

The preg_split() function splits a string into an array using a regular expression as the delimiter.

<?php
$pattern = '/\s+/'; // Matches one or more whitespace characters
$string = "PHP is great for web development";

$result = preg_split($pattern, $string);
print_r($result); // Output: Array ( [0] => PHP [1] => is [2] => great [3] => for [4] => web [5] => development )
?>

Explanation:

  • preg_split() splits the string at each run of one or more whitespace characters (spaces, tabs, line breaks) and returns an array of words.
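
Real input often has leading, trailing, or repeated separators, which produce empty array elements. The sketch below, using an illustrative comma-separated string, shows the PREG_SPLIT_NO_EMPTY flag removing them:

```php
<?php
$string = 'apple, banana,,  cherry ';

// Split on any run of commas and whitespace. -1 means "no limit";
// PREG_SPLIT_NO_EMPTY drops empty pieces, such as the one after the trailing space.
$parts = preg_split('/[\s,]+/', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts); // three elements: apple, banana, cherry
```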

Pattern Modifiers

Modifiers are used to change how the regular expression is processed. They are added after the closing delimiter of the pattern.

Common Modifiers:

  • i - Case-insensitive matching (e.g., /pattern/i).
  • m - Multiline matching. Changes the behavior of ^ and $ to match the beginning and end of lines, not just the string (e.g., /pattern/m).
  • s - Dot matches all, including newlines (e.g., /pattern/s).
  • x - Extended mode, allows for comments and whitespace in the pattern (e.g., /pattern/x).

Example with Modifier:

<?php
$pattern = "/hello/i"; // Case-insensitive search for "hello"
$string = "Hello World";

if (preg_match($pattern, $string)) {
    echo "Match found!";
} else {
    echo "Match not found.";
}
?>
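
The m, s, and x modifiers can be compared the same way. This is a minimal sketch with illustrative strings; each pair runs the same pattern with and without the modifier.

```php
<?php
$text = "first line\nsecond line";

// Without m, ^ only matches at the start of the whole string.
$withoutM = preg_match('/^second/', $text);
// With m, ^ also matches right after each newline.
$withM = preg_match('/^second/m', $text);

// Without s, the dot stops at the newline; with s it crosses it.
$withoutS = preg_match('/first.*second/', $text);
$withS = preg_match('/first.*second/s', $text);

// With x, whitespace and # comments inside the pattern are ignored.
$withX = preg_match('/
    ^\d{3}   # first three digits
    -
    \d{4}$   # last four digits
/x', '555-1234');

printf("m: %d vs %d, s: %d vs %d, x: %d\n", $withoutM, $withM, $withoutS, $withS, $withX);
```

Modifiers can be combined, for example /pattern/im for a case-insensitive, multiline match.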

Common Regular Expression Functions

Here are some commonly used PHP regex functions:

preg_match()

  • Searches a string for a pattern.
  • Returns 1 if the pattern is found, 0 if not, or false if an error occurs.

preg_match_all()

  • Finds all matches of a pattern in a string.
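  • Returns the number of full matches (possibly 0), or false on failure. The matches themselves are stored in the array passed as the third argument.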

preg_replace()

  • Replaces all occurrences of a pattern in a string and returns the resulting string, or null if an error occurs.

preg_split()

  • Splits a string into an array based on a pattern.

preg_quote()

  • Escapes regex special characters in a string so that it can be embedded in a pattern as literal text. Pass the pattern delimiter as the second argument to escape it as well.
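
A minimal sketch of preg_quote() in use. The search term and subject are hypothetical; the point is that the parentheses and plus sign should be matched literally, not treated as regex syntax.

```php
<?php
$userInput = 'price (USD) 1+1';            // hypothetical user-supplied search term
$subject   = 'Total price (USD) 1+1 deal';

// Passing '/' as the second argument also escapes the delimiter used below.
$pattern = '/' . preg_quote($userInput, '/') . '/';
echo $pattern, "\n"; // prints "/price \(USD\) 1\+1/"

// Escaped: the term is matched literally.
$escapedMatch = preg_match($pattern, $subject);

// Unescaped: ( ) and + are read as a group and a quantifier, so the literal text is not found.
$rawMatch = preg_match('/' . $userInput . '/', $subject);

printf("escaped: %d, raw: %d\n", $escapedMatch, $rawMatch);
```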

preg_last_error()

  • Returns the last error code from a preg_* function.
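
A minimal sketch of checking preg_last_error() after a failed call. It assumes a subject that is deliberately not valid UTF-8, which makes a pattern with the u modifier fail in a predictable way.

```php
<?php
// "\xff" is not valid UTF-8, so matching it in UTF-8 mode (u modifier) fails.
$result = preg_match('/./u', "\xff");

if ($result === false) {
    $code = preg_last_error();
    if ($code === PREG_BAD_UTF8_ERROR) {
        echo "Subject is not valid UTF-8\n";
    } else {
        echo "Regex failed with error code $code\n";
    }
}
```

Because preg_match() returns 0 for "no match" and false for an error, the strict comparison with false is what separates the two cases.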

Examples of Regular Expressions

Example 1: Validate an Email Address

<?php
$email = "user@example.com"; // Placeholder address for illustration
$pattern = '/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/';

if (preg_match($pattern, $email)) {
    echo "Valid email address!";
} else {
    echo "Invalid email address!";
}
?>

Explanation:

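  • The pattern ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ checks for the common shape of an email address: one or more allowed characters, an @, a domain name, a dot, and a top-level domain of at least two letters. It is a practical approximation, not a complete implementation of the email address specification.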
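
To see what this pattern accepts and rejects, it helps to run it over a few inputs. The addresses below are illustrative placeholders. The last one shows a detail of $ that is easy to miss: by default it also matches just before a final newline, and the D modifier turns that off.

```php
<?php
$pattern = '/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/';

$candidates = [
    'user@example.com',            // typical address
    'first.last@mail.example.org', // dots in the local part and a subdomain
    'no-at-sign.example.com',      // no @ at all
    'user@localhost',              // no dot followed by a top-level domain
    "user@example.com\n",          // trailing newline
];

$results = [];
foreach ($candidates as $candidate) {
    $results[] = preg_match($pattern, $candidate);
}
print_r($results); // 1, 1, 0, 0, 1

// With the D modifier, $ matches only at the very end of the string,
// so the trailing newline is rejected.
$strict = preg_match($pattern . 'D', "user@example.com\n");
echo $strict, "\n"; // prints 0
```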

Example 2: Validate a Phone Number

<?php
$phone = "123-456-7890";
$pattern = '/^\d{3}-\d{3}-\d{4}$/';

if (preg_match($pattern, $phone)) {
    echo "Valid phone number!";
} else {
    echo "Invalid phone number!";
}
?>

Explanation:

  • The pattern ^\d{3}-\d{3}-\d{4}$ validates a phone number in the format 123-456-7890.
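
The same pattern can also extract the parts of the number if each part is wrapped in a named capture group. A minimal sketch; the group names area, prefix, and line are arbitrary labels chosen for this example.

```php
<?php
$phone = '123-456-7890';

// (?<name>...) is a named capture group; the name becomes a key in $matches.
$pattern = '/^(?<area>\d{3})-(?<prefix>\d{3})-(?<line>\d{4})$/';

if (preg_match($pattern, $phone, $matches) === 1) {
    echo 'Area code: ', $matches['area'], "\n"; // prints "Area code: 123"

    $digitsOnly = $matches['area'] . $matches['prefix'] . $matches['line'];
    echo 'Digits only: ', $digitsOnly, "\n";    // prints "Digits only: 1234567890"
}
```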

Best Practices

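  • Sanitize user input: Always validate user input before using it in a regex, especially for forms and searches. When user-supplied text must be matched literally inside a pattern, escape it with preg_quote() first.
  • Choose greedy or non-greedy quantifiers deliberately: * and + are greedy and match as much as possible, while *? and +? match as little as possible. Use the non-greedy forms when you want the shortest match, such as the text between two delimiters.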
  • Avoid overly complex patterns: Keep regular expressions simple and readable. Complex patterns can be difficult to debug.
  • Test your regular expressions: Use online tools like regex101 to test and debug your regular expressions.
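
The difference between greedy and non-greedy quantifiers is easiest to see side by side. A minimal sketch, assuming an illustrative snippet of markup:

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* takes as much as possible, so the match runs to the last </b>.
preg_match('/<b>.*<\/b>/', $html, $greedy);

// Non-greedy: .*? takes as little as possible, so the match stops at the first </b>.
preg_match('/<b>.*?<\/b>/', $html, $lazy);

echo $greedy[0], "\n"; // prints "<b>one</b> and <b>two</b>"
echo $lazy[0], "\n";   // prints "<b>one</b>"
```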

Conclusion

In this section, we covered:

  • Basics of regular expressions and their syntax.
  • Common regular expression functions such as preg_match(), preg_replace(), preg_match_all(), and preg_split().
  • Examples of using regular expressions to validate emails and phone numbers.
  • Best practices for using regular expressions in PHP.

Regular expressions are a versatile tool in PHP for pattern matching and manipulation of strings. By mastering regex, you can handle complex text processing tasks efficiently.