Inefficient Regular Expression Complexity in re
Module
Patterns in Python's re module that are susceptible to catastrophic backtracking. Such patterns can lead to performance issues and may cause a Denial-of-Service (DoS) condition in applications by consuming an excessive amount of CPU time on certain inputs Vulnerability Explanation
Catastrophic backtracking occurs in regex evaluation when the engine tries to match complex patterns that contain nested quantifiers or ambiguous constructs. In certain cases, especially with maliciously crafted input, this can lead to an exponential number of combinations being checked, severely impacting application performance and potentially causing it to hang or crash.
Example
import re
pattern = re.compile('(a+)+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')
Remediation
When using Python's re module to compile or match regular expressions, ensure that patterns are designed to avoid ambiguous repetition and nested quantifiers that can cause catastrophic backtracking. Regular expressions should be reviewed and tested for efficiency and resistance to DoS attacks.
import re
pattern = re.compile('a+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')
False Positives
In the case of a false positive the rule can be suppressed. Simply add a
trailing or preceding comment line with either the rule ID (PY033
) or
rule category name (regex_denial_of_service
).
- Using rule ID
- Using category name
import re
# suppress: PY033
pattern = re.compile('(a+)+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')
import re
# suppress: regex_denial_of_service
pattern = re.compile('(a+)+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')