Skip to content

Vulnerable Patterns

This guide shows common vulnerable regex patterns, explains why they're dangerous, and provides safe alternatives.

Pattern Categories

1. Nested Quantifiers

The most common source of exponential complexity.

Vulnerable

r"^(a+)+$"      # Exponential O(2^n)
r"^(a*)*$"      # Exponential O(2^n)
r"^(a+)*$"      # Exponential O(2^n)
r"^([a-z]+)+$"  # Exponential O(2^n)

Safe Alternative

r"^a+$"         # Linear O(n)
r"^[a-z]+$"     # Linear O(n)

Why it's dangerous: The inner quantifier creates ambiguity about how to distribute characters among repetitions.

from redoctor import check

result = check(r"^(a+)+$")
print(result.complexity)  # O(2^n)
print(result.attack)      # 'aaaaaaaaaaaaaaaaaaa!'

2. Overlapping Alternatives

Alternatives that can match the same input.

Vulnerable

r"(a|a)+$"      # Same character
r"(a|ab)+$"     # Overlapping
r"(.*|a)+$"     # Wildcard overlap
r"(\w+|\d+)+$"  # \d is subset of \w

Safe Alternative

r"a+$"          # Remove redundancy
r"(ab?)+$"      # Combine alternatives

Why it's dangerous: Multiple paths can match the same input, causing backtracking.


3. Greedy Wildcards

Multiple .+ or .* patterns.

Vulnerable

r".*a.*a.*"     # O(n²)
r".*a.*a.*a.*"  # O(n³)
r".+x.+x.+$"    # O(n³)

Safe Alternative

r"[^a]*a[^a]*a.*"  # Use negated class
r".*?a.*?a.*"      # Lazy quantifiers (still risky)

Why it's dangerous: Each .* can consume varying amounts, creating combinatorial explosion.


4. Email Patterns

Email validation is a common source of ReDoS.

Vulnerable

r"^([a-zA-Z0-9]+)*@"
r"^[\w.]+@[\w.]+$"
r"^([a-zA-Z0-9_.+-]+)+@"

Safe Alternative

r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
# Or use a proper email validation library

5. URL Patterns

URL parsing with regex is tricky.

Vulnerable

r"^(https?://)?([a-z0-9]+\.)+[a-z]{2,}/?$"
r".*\.(com|org|net).*"

Safe Alternative

# Use urllib.parse instead
from urllib.parse import urlparse
parsed = urlparse(url)

6. HTML/XML Patterns

Don't parse HTML with regex!

Vulnerable

r"<(.+)>.*</\1>"
r"<[^>]+>[^<]*</[^>]+>"

Safe Alternative

# Use a proper HTML parser
from html.parser import HTMLParser
# Or use BeautifulSoup, lxml, etc.

7. Whitespace Handling

Trimming and normalizing whitespace.

Vulnerable

r"^\s*(.+?)\s*$"
r"(\s+)+"

Safe Alternative

# Use string methods
text.strip()
" ".join(text.split())

Complexity Reference

Pattern Complexity Risk
^(a+)+$ O(2^n) 🚨 Critical
(a\|a)+ O(2^n) 🚨 Critical
.*a.*a.* O(n²) āš ļø High
(a+)+b O(2^n) 🚨 Critical
^[a-z]+$ O(n) āœ… Safe
^\d{1,10}$ O(n) āœ… Safe

Test with ReDoctor

from redoctor import check

patterns = [
    r"^(a+)+$",
    r"(a|a)*$",
    r".*a.*a.*",
    r"^[a-z]+$",
]

for pattern in patterns:
    result = check(pattern)
    status = "🚨 VULN" if result.is_vulnerable else "āœ… SAFE"
    complexity = result.complexity.summary if result.complexity else "N/A"
    print(f"{status} {complexity:8} {pattern}")

Output:

🚨 VULN O(2^n)   ^(a+)+$
🚨 VULN O(2^n)   (a|a)*$
🚨 VULN O(n^2)   .*a.*a.*
āœ… SAFE O(n)     ^[a-z]+$

Quick Reference Card

Avoid These Patterns

Pattern Problem
(x+)+ Nested quantifiers
(x\|x)+ Overlapping alternatives
.*x.*x.* Multiple wildcards
(x*)* Star within star
(x+x+)+ Overlapping within group

Safe Alternatives

Instead of Use
(a+)+ a+
(a\|ab)+ (ab?)+
.*a.* [^a]*a.*
(\w+\s+)+ (\w+\s)+ or validate differently

Next Steps