Negation in regular expressions

Searching for lines that don’t contain a particular pattern is fairly easy in some programs, obscure in others, and almost impossible in yet others. That is, assuming your program of choice supports regular expressions at all. I review how to achieve this in Perl, Vim, grep, and vi.

Regexp negation in Perl

Perl supports many non-standard features in its regular expression matching. One of these features is the lookahead, (?=pattern). Furthermore, the lookahead supports negation: (?!pattern). To search in Perl for lines that contain a foo that is not followed by bar anywhere later on the line:

foo(?!.*bar)
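
Python’s re module accepts the same (?!pattern) syntax, so the behaviour of the expression above can be sketched outside Perl. The sample lines and the helper name has_foo_without_bar are made up for illustration; this is a minimal sketch of what the pattern accepts, not a statement about Perl internals.

```python
import re

# Same syntax as the Perl expression: a "foo" with no "bar" later on the line.
PATTERN = re.compile(r"foo(?!.*bar)")

def has_foo_without_bar(line):
    """Return True if some foo on the line has no bar after it."""
    return PATTERN.search(line) is not None

# Hypothetical sample lines, chosen only to exercise the pattern.
samples = ["foo", "foo then bar", "bar then foo", "foo bar foo", "nothing here"]
for line in samples:
    print(f"{line!r}: {has_foo_without_bar(line)}")
```

Note that "foo bar foo" is accepted: the lookahead is evaluated for each occurrence of foo, and the last foo has no bar after it.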

Regexp negation in Vim

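Vim’s regular expression matching is more flexible than vi’s. In particular, Vim supports the syntax \@!, which matches only where the preceding atom does not match. You can find a full description of this feature in Vim by typing :help /\@!. With Vim’s default settings, the grouping parentheses need backslashes as well. To search in Vim for lines that contain a foo that is not followed by bar anywhere later on the line:

foo\(.*bar\)\@!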

Regexp negation in grep

Grep doesn’t support negation of patterns in a regular expression, but it does support two other classes of negation: negation of character classes and negation of matches. Negation of character classes is standard in regular expression matching and the syntax is typically [^abc], which means match any character other than a, b, and c. Negation of matches is accomplished using the command line option -v. To search in grep for lines that contain a foo that is not followed by bar anywhere later on the line:

grep -o 'foo.*' file | grep -v 'bar'

Note that this relies on the -o option, which is a GNU grep extension and is not part of the standard UNIX (POSIX) grep.
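
To make the two stages of that pipeline concrete, here is a rough Python stand-in. The functions grep_o, grep_v, and foo_without_bar are hypothetical names that only imitate the relevant behaviour of grep -o and grep -v; they are not a full emulation of grep.

```python
import re

def grep_o(pattern, lines):
    """Rough stand-in for grep -o: yield each matched part on its own."""
    for line in lines:
        for match in re.finditer(pattern, line):
            yield match.group(0)

def grep_v(pattern, lines):
    """Rough stand-in for grep -v: yield the lines that do NOT match."""
    for line in lines:
        if re.search(pattern, line) is None:
            yield line

def foo_without_bar(lines):
    # In spirit: grep -o 'foo.*' file | grep -v 'bar'
    return list(grep_v(r"bar", grep_o(r"foo.*", lines)))

# Hypothetical input lines.
print(foo_without_bar(["bar then foo", "foo then bar", "xx foo yy"]))
```

Two things stand out in this sketch. The result holds only the part of each line from the first foo onward, not the whole line. And because the first stage always starts at the first foo, a line such as "foo bar foo" is dropped, even though its last foo has no bar after it.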

Regexp negation in vi, ed, and UNIX grep

Basic regular expressions (BRE) and extended regular expressions (ERE) contain only one form of negation: character class negation. This makes it a lot more difficult to construct patterns that include pattern negation, but it is not impossible. One way to achieve this is to first construct a DFA for your pattern, write the formulas describing the DFA, and then consolidate them into a regular expression. For simplicity, the example below is written in ERE. To search in ERE for lines that contain a foo that is not followed by bar anywhere later on the line:

foo([^b]|(b(b|(ab))*([^ba]|(a[^br]))))*((b(b|(ab))*a)|(b(b|(ab))*)|$)$
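
One way to sanity-check an expression like this is to simulate the DFA directly and compare it with the regular expression on many short inputs. The sketch below assumes a three-state DFA for "no bar seen yet" (state 0: no progress, state 1: just read b, state 2: just read ba) and assumes that Python’s re module interprets this particular ERE the same way an ERE engine would. The names tail_avoids_bar and find_disagreements, the test alphabet, and the length limit are all illustrative choices.

```python
import re
from itertools import product

ERE = r"foo([^b]|(b(b|(ab))*([^ba]|(a[^br]))))*((b(b|(ab))*a)|(b(b|(ab))*)|$)$"

def tail_avoids_bar(tail):
    """Simulate the assumed DFA: True if the text never contains 'bar'."""
    state = 0  # 0: no progress, 1: just read "b", 2: just read "ba"
    for ch in tail:
        if state == 2 and ch == "r":
            return False  # "bar" completed: the dead state
        if ch == "b":
            state = 1
        elif ch == "a" and state == 1:
            state = 2
        else:
            state = 0
    return True

def find_disagreements(alphabet="abrx", max_len=6):
    """Return every tail on which the ERE and the DFA simulation disagree."""
    ere = re.compile(ERE)
    bad = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            tail = "".join(chars)
            ere_matches = ere.search("foo" + tail) is not None
            if ere_matches != tail_avoids_bar(tail):
                bad.append(tail)
    return bad

print(find_disagreements())
```

An empty list means the expression and the DFA agree on every tail that was tried. That is evidence rather than proof, since only short strings over a small alphabet are checked.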

As demonstrated, although negation of a pattern can be hard in plain regular expressions, it is not impossible, as some might think.

2 Responses to “Negation in regular expressions”

  1. magairlin says:

    Thanks, the idea with DFA construction helped me solve my problem.

  2. johanrex says:

    The help for the vim syntax is not
    :help /@!
    but
    :help \@!
