How to Match Up Until First Occurrence of Regex Pattern
Let’s see how we can match up until the first occurrence of a pattern in a regular expression.
Suppose we’re working with the text below.
Problem: greedy dot-star regex (.*)
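Suppose we want to match abc!, so we naturally test the following regular expression:
But this matches the entire line, because the greedy .* consumes as many characters as it can before the succeeding pattern has to match.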
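As a rough illustration of the greedy behavior, here is a minimal Python sketch. The sample line and the pattern .*! are assumptions made for this example, since the article's original text and regex are not reproduced here.

```python
import re

# Hypothetical sample line (an assumption, not the article's original text).
text = "abc! def! ghi!"

# Greedy: .* grabs as much as it can, so the match only ends at the LAST "!".
greedy = re.search(r".*!", text)
print(greedy.group())  # abc! def! ghi!
```

The match covers the whole line even though abc! was the intended target.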
Solution 1: non-greedy dot-star regex (.*?)
In order to stop matching after the first occurrence of a pattern, we need to make our regular expression lazy, or non-greedy.
Inside the capture, we can include a question mark ? after the star, so that .* becomes .*?
This will match until the first occurrence of the succeeding pattern.
Adding ? to any quantifier (for example * or +) will make it non-greedy. Keep in mind that this ? is only available in regex engines that implement Perl 5 extensions (e.g. Java, Python, Ruby). In traditional engines, such as grep without -P, we’ll have to resort to the next method.
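The lazy quantifier can be sketched in Python, one of the engines listed above. The sample line is again a hypothetical stand-in for the article's text.

```python
import re

# Hypothetical sample line (an assumption, not the article's original text).
text = "abc! def! ghi!"

# Lazy: .*? expands one character at a time and stops at the FIRST "!".
lazy = re.search(r".*?!", text)
print(lazy.group())  # abc!

# With a capture group around the lazy part, only the text before "!" is captured.
captured = re.search(r"(.*?)!", text).group(1)
print(captured)  # abc
```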
Solution 2: match all but exclude ([^abc])
Another way to avoid matching after the first occurrence is to exclude characters in the capture.
We can do this using the caret ^ inside a set of brackets [ ]. For example, [^abc] will match any character except for a, b, or c, and [^abc]* will match any number of characters excluding a, b, and c.
Inside the capture, we can exclude the succeeding pattern, which is the exclamation mark ! in this case.
This will match only the first occurrence of the pattern.
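A minimal Python sketch of the exclusion approach follows, using the same hypothetical sample line as an assumption. The pattern relies only on a negated character class, so it does not depend on lazy quantifiers.

```python
import re

# Hypothetical sample line (an assumption, not the article's original text).
text = "abc! def! ghi!"

# [^!]* can never run past a "!", so the match must end at the first one.
excluded = re.search(r"[^!]*!", text)
print(excluded.group())  # abc!
```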
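With most regex engines, [^!]* is likely faster than .*? since it does not need to check, after every character, whether the rest of the pattern matches at that point. That being said, .*? is the more generic pattern: it works in front of any succeeding pattern, whereas the exclusion trick only works cleanly when that pattern is a single character (or a small set of characters).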
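To illustrate why .*? is the more general tool, here is a small sketch with a multi-character terminator. The input string and the --> delimiter are hypothetical, chosen only for this example.

```python
import re

# Hypothetical input whose terminator "-->" is longer than one character.
text = "<!-- a-b --> tail -->"

# Lazy dot-star stops at the first full "-->", even though "-" appears earlier.
lazy = re.match(r".*?-->", text)
print(lazy.group())  # <!-- a-b -->

# Excluding "-" is too strict: the class cannot cross the dashes in "<!--",
# so a match anchored at the start of the string fails.
negated = re.match(r"[^-]*-->", text)
print(negated)  # None
```

A negated class describes single characters, not sequences, so it cannot express "anything up to -->" on its own.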