learn regular expressions in php

love them or hate them, regular expressions are here to stay.

when it comes to quickly dealing with large blocks of data, batch processing operations or screen scraping, regular expressions are often the most effective solution. there’s just one problem, though – learning them can be as hard as learning a new language altogether. here’s how to get off to a flying start.

first, you need to know how regular expressions work. stick with pcre – the preg_* functions – they’re faster and more reliable than the older posix ereg_* functions (which were removed in php 7), and expressions written for them are more common. the different styles of regular expressions each define a mini language for matching patterns within a block of text. screen scraping is probably the most popular example. consider this:

not 1.
yet another link.

what if you wanted to programmatically extract the link urls and text from each of these? at this level of simplicity, some basic explode()s might be enough, and when you need to be more tolerant of messy data, xml parsing is probably the way to go. but for a snippet like this, regular expressions are the easiest option: we build a pattern that defines what is to be matched, and the regular expression engine (in this case, pcre) finds all the matches and hands them to us.
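here’s a minimal sketch of that idea using preg_match_all(). the html string, its urls and the pattern itself are assumptions made up for illustration; the pattern only copes with simple, well-behaved anchor tags, and is not a general html parser.

```php
<?php
// hypothetical input: the urls and markup here are invented for illustration
$html = '<a href="http://example.com/one">not 1</a>. '
      . '<a href="http://example.com/two">yet another link</a>.';

// capture group 1: the href value; capture group 2: the link text
// the # delimiters avoid having to escape the / in </a>
$pattern = '#<a\s+href="([^"]*)"[^>]*>(.*?)</a>#i';

// PREG_SET_ORDER groups the results per match rather than per capture group
$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $match) {
    $links[] = ['url' => $match[1], 'text' => $match[2]];
}

print_r($links);
```

each entry in $links holds one url and its link text. the lazy (.*?) stops at the first closing tag, which is what keeps the two links from being swallowed as a single match.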

one of the best ways to understand regular expressions in practice, i find, is to consider mod_rewrite. you’ve seen those fantastic urls on websites – e.g. http://example.com/users/myusername – these don’t actually exist as files, but we can use apache’s mod_rewrite engine to “rewrite” these urls to other urls. we might make /users/(some username) go to /users.php?username=(the username). the best tutorial for this, by far, is the corz.org htaccess tips and tricks guide. have a very thorough read through this and experiment with it extensively before you go on.
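to see the same capture-and-substitute idea without touching apache, here’s a rough php imitation using preg_replace(). the function name rewrite_user_url() and the set of characters allowed in a username are assumptions for illustration; this is not how mod_rewrite itself is configured.

```php
<?php
// hypothetical helper that mimics the rewrite described above; not apache's own code
function rewrite_user_url($path)
{
    // ^ and $ anchor the pattern to the whole path
    // ([A-Za-z0-9_-]+) captures the username; the allowed characters are an assumption
    return preg_replace('#^/users/([A-Za-z0-9_-]+)$#', '/users.php?username=$1', $path);
}

echo rewrite_user_url('/users/myusername'), "\n"; // /users.php?username=myusername
echo rewrite_user_url('/about'), "\n";            // no match, so unchanged: /about
```

the $1 in the replacement string refers back to whatever the first pair of parentheses captured, which is the same back-reference mechanism a rewrite rule relies on.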

next, get equipped to test out regular expressions. you’re probably using firefox. if you aren’t, get it. then get the regular expressions tester firefox extension. this will be invaluable for experimenting with regular expressions – the best way to learn.

of course, you’ll need to be familiar with the php functions that will provide your regular expression workhorse. bookmark the preg_match function’s manual entry and refer back to it as you need to. the manual page has some invaluable examples which you should read through extensively.
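as a starting point before you dive into the manual, here’s a small sketch of a typical preg_match() call. the input string and the date pattern are assumptions chosen only to show how the $matches array gets filled in.

```php
<?php
// hypothetical input string, chosen only to show how $matches is filled in
$subject = 'order placed on 2009-03-14';

// three capture groups: year, month, day
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

// preg_match() returns 1 on a match, 0 on no match, and false on error
if (preg_match($pattern, $subject, $matches) === 1) {
    // $matches[0] is the whole match; $matches[1] to $matches[3] are the capture groups
    list($whole, $year, $month, $day) = $matches;
    echo "$whole: year=$year month=$month day=$day\n";
} else {
    echo "no date found\n";
}
```

note that preg_match() stops at the first match; when you want every match in the text, preg_match_all() is the function to reach for.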

finally, when in doubt, keep a cheatsheet at hand. ilovejackdaniels.com has a fantastic regular expression cheatsheet that you should also bookmark.

with all these ready, you’ll be off to a great start. remember, the best way to learn is to practise – read through real, working code, then write your own. once you’ve finished all this, you’ll be a regular expression whiz in no time.