Tutorials Logic

Regular Expressions in Python — re Module Guide

What are Regular Expressions?

A regular expression (regex) is a pattern used to match, search, and manipulate text. Python's standard-library re module provides Perl-style regular expression matching for both str and bytes.

re Module Functions

Function                         Description
re.match(pattern, string)        Match only at the beginning of the string
re.search(pattern, string)       Search for the first match anywhere in the string
re.findall(pattern, string)      Return all matches as a list
re.finditer(pattern, string)     Return an iterator of match objects
re.sub(pattern, repl, string)    Replace matches with repl
re.split(pattern, string)        Split the string by the pattern
re.compile(pattern)              Compile the pattern for reuse
Basic Functions
import re

text = "The price is $25.99 and $10.50"

# search - find first match anywhere
match = re.search(r"\d+\.\d+", text)
if match:
    print(match.group())   # 25.99
    print(match.start())   # 14 (start index)
    print(match.end())     # 19 (end index)

# findall - find all matches
prices = re.findall(r"\$\d+\.\d+", text)
print(prices)   # ['$25.99', '$10.50']

# sub - replace matches
clean = re.sub(r"\$\d+\.\d+", "[PRICE]", text)
print(clean)    # The price is [PRICE] and [PRICE]

# split - split by pattern
sentence = "one,two;three four"
parts = re.split(r"[,; ]+", sentence)
print(parts)    # ['one', 'two', 'three', 'four']
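
The block above does not exercise re.match or re.finditer from the function table. The following is a minimal sketch of how they differ from re.search; the sample string is made up for illustration.

```python
import re

text = "Order 66 shipped, order 77 pending"

# re.match only succeeds if the pattern matches at index 0
at_start = re.match(r"\d+", text)
print(at_start)                          # None: the text starts with "Order"
print(re.match(r"Order", text).group())  # Order

# re.search scans forward and returns the first match it finds
first = re.search(r"\d+", text)
print(first.group(), first.span())       # 66 (6, 8)

# re.finditer yields one match object per hit, so positions stay available
hits = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]
print(hits)                              # [('66', 6), ('77', 24)]
```

Use re.findall when only the matched text is needed, and re.finditer when positions or groups of each match are needed as well.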

Regex Pattern Syntax

Pattern   Matches                                                           Example
.         Any character (except newline)                                    a.c -> "abc", "a1c"
^         Start of string                                                   ^Hello
$         End of string (or just before a trailing newline)                 world$
*         0 or more                                                         ab* -> "a", "ab", "abb"
+         1 or more                                                         ab+ -> "ab", "abb"
?         0 or 1 (optional)                                                 colou?r -> "color", "colour"
{n}       Exactly n times                                                   \d{4} -> "2024"
{n,m}     Between n and m times                                             \d{2,4}
[abc]     Any of a, b, c                                                    [aeiou]
[^abc]    Not a, b, or c                                                    [^0-9]
\d        Digit ([0-9], plus other Unicode digits in str patterns)          \d+ -> "123"
\D        Non-digit
\w        Word char ([a-zA-Z0-9_], plus Unicode letters in str patterns)    \w+
\W        Non-word char
\s        Whitespace                                                        \s+
\S        Non-whitespace
\b        Word boundary                                                     \bword\b
(abc)     Capture group                                                     (\d+)-(\d+)
a|b       a or b                                                            cat|dog
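
A few rows of the table can be checked directly in the interpreter. This is a small sketch with made-up sample strings, not an exhaustive tour of the syntax.

```python
import re

# ? makes the preceding character optional
spellings = re.findall(r"colou?r", "color and colour")
print(spellings)    # ['color', 'colour']

# {n,m} is greedy: it takes as many repetitions as allowed, up to m
digit_runs = re.findall(r"\d{2,4}", "7 42 2024 123456")
print(digit_runs)   # ['42', '2024', '1234', '56']

# [^0-9] is a negated class: delete everything that is not a digit
digits_only = re.sub(r"[^0-9]", "", "tel: 555-0199")
print(digits_only)  # 5550199

# \b matches "cat" only as a whole word
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
print(whole_words)  # ['cat', 'cat']

# | tries the alternatives at each position, left to right
animals = re.findall(r"cat|dog", "hotdog and catfish")
print(animals)      # ['dog', 'cat']

# ^ and $ anchor the pattern to the start and end of the string
starts = bool(re.search(r"^Hello", "Hello world"))
ends = bool(re.search(r"world$", "Hello world"))
print(starts, ends)  # True True
```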

Groups and Capturing

Groups
import re

# Capture groups with ()
date_str = "Today is 2024-06-15"
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", date_str)
if match:
    print(match.group(0))  # 2024-06-15 (full match)
    print(match.group(1))  # 2024 (year)
    print(match.group(2))  # 06   (month)
    print(match.group(3))  # 15   (day)

# Named groups
match = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", date_str)
if match:
    print(match.group("year"))   # 2024
    print(match.group("month"))  # 06
    print(match.groupdict())     # {'year': '2024', 'month': '06', 'day': '15'}

# findall with groups returns list of tuples
text = "John: 25, Alice: 30, Bob: 22"
results = re.findall(r"(\w+): (\d+)", text)
print(results)  # [('John', '25'), ('Alice', '30'), ('Bob', '22')]
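
Captured groups can also be referenced in the replacement string of re.sub. The sketch below reuses the date string from the example above; the output formats are arbitrary choices for illustration, and the non-capturing group (?:...) is an addition not covered in the original examples.

```python
import re

date_str = "Today is 2024-06-15"

# Numbered backreferences: \1, \2, \3 refer to the groups in order
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date_str)
print(swapped)     # Today is 15/06/2024

# Named groups are referenced with \g<name> in the replacement
named = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    r"\g<day>.\g<month>.\g<year>",
    date_str,
)
print(named)       # Today is 15.06.2024

# (?:...) groups without capturing, so findall returns only the one
# captured group as plain strings instead of tuples
ages = re.findall(r"(?:\w+): (\d+)", "John: 25, Alice: 30, Bob: 22")
print(ages)        # ['25', '30', '22']

# match.groups() returns every captured group as a tuple
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", date_str)
print(m.groups())  # ('2024', '06', '15')
```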

Practical Examples

Real-World Patterns
import re

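# Email validation (a simplified format check, not full RFC 5322 validation)
def is_valid_email(email: str) -> bool:
    pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    # fullmatch must consume the whole string, so a trailing newline is rejected
    return bool(re.fullmatch(pattern, email))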

print(is_valid_email("user@example.com"))   # True
print(is_valid_email("invalid-email"))      # False

# Phone number extraction (US-style: 555-123-4567 or (800) 555-0199)
text = "Call us at 555-123-4567 or (800) 555-0199"
phones = re.findall(r"\(?\d{3}\)?[ -]?\d{3}-\d{4}", text)
print(phones)   # ['555-123-4567', '(800) 555-0199']

# URL extraction
html = '<a href="https://example.com">Link</a> and <a href="http://test.org">Test</a>'
urls = re.findall(r'https?://[^\s"]+', html)
print(urls)   # ['https://example.com', 'http://test.org']

# Password strength check
def check_password(pwd: str) -> dict:
    return {
        "length": len(pwd) >= 8,
        "uppercase": bool(re.search(r"[A-Z]", pwd)),
        "lowercase": bool(re.search(r"[a-z]", pwd)),
        "digit": bool(re.search(r"\d", pwd)),
        "special": bool(re.search(r"[!@#$%^&*]", pwd)),
    }

result = check_password("MyPass123!")
print(result)   # all five checks are True for this password

# Compile once for reuse: avoids the pattern-cache lookup on every call
# and keeps the pattern in one named place
email_re = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
emails = ["a@b.com", "bad", "x@y.org"]
valid = [e for e in emails if email_re.fullmatch(e)]
print(valid)  # ['a@b.com', 'x@y.org']
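
One detail about anchored validation patterns such as the email one above: "$" also matches just before a trailing newline, so re.match with "^...$" can accept input that ends in "\n". The sketch below compares that with fullmatch on a compiled pattern and shows a flag passed at compile time. The names EMAIL, candidate, and loose are illustrative only.

```python
import re

EMAIL = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
email_re = re.compile(EMAIL)

candidate = "user@example.com\n"   # note the trailing newline

# With ^...$ and re.match, "$" is satisfied just before the final newline
anchored = bool(re.match(r"^" + EMAIL + r"$", candidate))
print(anchored)   # True

# fullmatch requires the pattern to consume the entire string
strict = bool(email_re.fullmatch(candidate))
print(strict)     # False
clean = bool(email_re.fullmatch("user@example.com"))
print(clean)      # True

# Flags are passed once at compile time and apply to every call
loose = re.compile(r"hello", re.IGNORECASE)
greetings = loose.findall("Hello, HELLO, hello")
print(greetings)  # ['Hello', 'HELLO', 'hello']
```

For validating a whole input value, fullmatch is usually the safer choice; match and search fit better when the pattern is expected to appear inside a longer string.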
