Regex to find URLs in Web Pages

Finding URLs in a web page with a regex is a really difficult thing to do, but I’m surprised that the top few results I found googling for solutions weren’t better. In fact, they sucked. Here’s what I whipped up in a few minutes:


Now, the code I tested it on was the source of the page behind the top link from my first search. (A+ for SEO, D– for actual code.) To begin with, not capturing src URLs is boneheaded. But the big problems in dealing with this page are the sample code it contains (which is why the filters for unquoted text have some odd characters in them) and JavaScript URL builders. There’s no real way to deal with those, because a JavaScript expression can be built entirely out of legal URL characters.
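To make the idea concrete, here is a minimal Python sketch of the kind of pattern described: it picks up both href and src values, whether double-quoted, single-quoted, or unquoted, and the unquoted branch stops at a handful of code-like characters. The names `URL_ATTR` and `find_urls` and the exact set of excluded characters are assumptions made for illustration; this is not the regex referred to above.

```python
import re

# Hypothetical pattern for illustration only.
URL_ATTR = re.compile(
    r"""\b(?:href|src)\s*=\s*      # attribute name, then the equals sign
        (?:"([^"]*)"               # double-quoted value
          |'([^']*)'               # single-quoted value
          |([^\s"'<>`;(){}]+)      # unquoted value, cut off at code-like characters
        )""",
    re.IGNORECASE | re.VERBOSE,
)


def find_urls(html):
    """Return every href/src value found in html, in document order."""
    urls = []
    for match in URL_ATTR.finditer(html):
        # Exactly one of the three groups is set for each match.
        urls.append(next(g for g in match.groups() if g is not None))
    return urls


if __name__ == "__main__":
    page = '<a href="http://example.com/a">x</a> <img src=/img/b.png>'
    print(find_urls(page))
```

The JavaScript problem shows up right away with a sketch like this: fed `a.href = base + path;`, it reports `base` as a URL, and nothing in the characters themselves says otherwise.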