“Simple” string matching fun with regular expressions

I've yet to get my head around the power of regular expressions.

I had a problem where I was trying to match a particular image name, only there were multiple variations.

  • _side.jpg
  • _side_1.jpg
  • _side_2_1.jpg

My regular expression looked like this: '/_side*\.jpg$/'. Clearly there was a problem with it, so I called on a friend to help me understand it. (Thanks, Geoff.)

The '*' operator means match the preceding character zero or more times, so your regex pattern will match things like _sid.jpg, _side.jpg and _sideee.jpg. It won't match _side_1.jpg, because only the 'e' is allowed to repeat before the '.jpg'.
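To see that for myself, here's a minimal sketch that runs the pattern against a few names. It uses Python's re module, which is just my choice for illustration (the surrounding '/' delimiters are dropped because Python doesn't use them), and the file names are made-up samples.

```python
import re

# The original pattern: '*' applies only to the 'e' directly before it.
pattern = re.compile(r"_side*\.jpg$")

# Sample names, invented for this example.
names = ["_side.jpg", "_sid.jpg", "_sideee.jpg", "_side_1.jpg", "_side_2_1.jpg"]

# Map each name to whether the pattern is found somewhere in it.
results = {name: bool(pattern.search(name)) for name in names}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

The first three names match and the last two don't, which is exactly the problem I was running into.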

If all you're ever expecting to see after _side are underscores followed by numbers and then an extension, a regex pattern like '/_side(_[0-9]+)+\.jpg$/' should work. It matches _side, then an underscore followed by at least one digit (that's the '+' after '[0-9]'), which could be any combination of numeric characters. The '+' after the ()'s means match that group at least once, and as many times as it appears. Finally, it enforces that the name ends with '.jpg' by following the extension with a '$'.
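Here's a rough sketch of that stricter pattern in Python (again my choice of language, with made-up sample names). One thing worth noticing is that it requires at least one "_number" group, so a bare _side.jpg won't match. The second pattern, with '*' in place of the outer '+', is my own variation rather than part of the advice above, and it makes the group optional.

```python
import re

# The pattern from the explanation above: one or more "_<digits>" groups.
strict = re.compile(r"_side(_[0-9]+)+\.jpg$")

# My own variation (an assumption, not from the original advice): '*' instead
# of the outer '+' makes the "_<digits>" group optional.
optional = re.compile(r"_side(_[0-9]+)*\.jpg$")

# Sample names, invented for this example.
names = ["_side.jpg", "_side_1.jpg", "photo_side_2_1.jpg", "_side_.jpg", "_side_1.jpeg"]

strict_hits = [n for n in names if strict.search(n)]
optional_hits = [n for n in names if optional.search(n)]

print(strict_hits)    # ['_side_1.jpg', 'photo_side_2_1.jpg']
print(optional_hits)  # ['_side.jpg', '_side_1.jpg', 'photo_side_2_1.jpg']
```

Which one to use depends on whether the plain _side.jpg name should count as a match.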

Alternatively, if you want it to be more flexible (i.e. '_side' followed by anything, followed by the extension), you could use something like '/_side.*\.jpg$/'. The '.' character matches any character (which is why you need to escape it with a '\' when you want to match an actual '.' character), so following a '.' with a '*' means "match any character any number of times, even none". That regex pattern will match things like _side.jpg, _side_01_99.jpg, _sidefoobar.jpg, etc.
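A small Python sketch of the flexible pattern, under the same assumptions as before (Python is my pick, the names are invented). I've also included a version with the '.' left unescaped, purely to show why the '\' matters.

```python
import re

# Flexible pattern: "_side", then anything, then a literal ".jpg" at the end.
flexible = re.compile(r"_side.*\.jpg$")

# Same pattern with the dot left unescaped, for comparison only.
unescaped = re.compile(r"_side.*.jpg$")

# Sample names, invented for this example.
names = ["_side.jpg", "_side_01_99.jpg", "_sidefoobar.jpg", "_side.png", "_side_1_jpg"]

flexible_hits = [n for n in names if flexible.search(n)]
unescaped_hits = [n for n in names if unescaped.search(n)]

print(flexible_hits)   # ['_side.jpg', '_side_01_99.jpg', '_sidefoobar.jpg']
print(unescaped_hits)  # same three, plus '_side_1_jpg'
```

The unescaped version also accepts _side_1_jpg, because its bare '.' is happy to match the underscore.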

Ahhhh...now I understand.

While on the topic, here's a cool regex cheat sheet I found: http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/