
PHP preg Pattern Pitfall

If you’re a PHP developer you probably use the regular expression functions a lot, but many miss a minor detail that means their code isn’t quite doing what they expected.

For example, let’s say you wanted to use a regexp to check that a variable contains ‘abc’ and nothing more:

if (preg_match('/^abc$/', $someString)) …

If that seems obvious and bullet-proof to you, you might need to revisit your code. By default the $ metacharacter matches not only at the very end of the string but also immediately before a final newline. So in this example the message is displayed even though the string contains an extra, unwanted character:

$someString = "abc\n";
if (preg_match('/^abc$/', $someString)) echo 'Valid';
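To see exactly which inputs slip through, here is a minimal sketch. The helper name looksLikeAbc is made up for illustration; only preg_match itself is a real PHP function.

```php
<?php
// Hypothetical helper, named for illustration only.
// Uses the unmodified pattern from the example above.
function looksLikeAbc(string $subject): bool
{
    return preg_match('/^abc$/', $subject) === 1;
}

var_dump(looksLikeAbc("abc"));      // bool(true)
var_dump(looksLikeAbc("abc\n"));    // bool(true): one trailing newline slips through
var_dump(looksLikeAbc("abc\n\n"));  // bool(false): $ only tolerates a single final newline
```

So the hole is narrow (a single newline at the very end), but that is exactly the character that matters for header-style injection.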

Even in cases where this quirk won’t have direct security implications (such as email header injection or HTTP response splitting), it could still allow invalid data to sneak in and cause puzzling glitches. The solution is to always use the D pattern modifier (PCRE_DOLLAR_ENDONLY), which forces $ to match only at the very end of the string. Note that D is ignored if the m (multiline) modifier is also set. With D in place, the message is no longer displayed:

$someString = "abc\n";
if (preg_match('/^abc$/D', $someString)) echo 'Valid';
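As a rough comparison, the sketch below runs the same input through several variants of the pattern. The function name compareAnchors and the array keys are invented for this example. The \z assertion is not mentioned above; it is a standard PCRE alternative that always means "very end of the subject", regardless of modifiers.

```php
<?php
// Hypothetical helper for illustration: reports which pattern
// variants accept a given subject string.
function compareAnchors(string $subject): array
{
    return [
        // Plain $: also matches just before a final newline.
        'dollar'        => preg_match('/^abc$/', $subject) === 1,
        // D modifier: $ matches only at the very end.
        'dollar_D'      => preg_match('/^abc$/D', $subject) === 1,
        // m modifier set as well: D is ignored, $ matches before any newline.
        'dollar_mD'     => preg_match('/^abc$/mD', $subject) === 1,
        // \z: end of subject, unaffected by the m modifier.
        'backslash_z'   => preg_match('/^abc\z/', $subject) === 1,
        'backslash_z_m' => preg_match('/^abc\z/m', $subject) === 1,
    ];
}

var_dump(compareAnchors("abc\n"));
// dollar => true, dollar_D => false, dollar_mD => true,
// backslash_z => false, backslash_z_m => false
```

If a pattern might later gain the m modifier, \z is arguably the safer habit, since D silently stops having any effect in that case.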

Filed under: Hints and Tips, Security and Privacy, Server-side Coding, Web



Malevolent Design Weblog

Matt Round’s company blog, covering web development, media, technology and pretty much anything else.
