Corrections to Auto-print Regex

Several months ago I published a regular expression that would strip auto-print links from webpages. Unfortunately, the regex doesn’t actually catch everything, so here’s a corrected version:

replace(/((window|self)\.print(\(\))?|print\(\));?/ig, "");

This will catch all the following:

print()
print();
self.print
self.print()
self.print();
window.print
window.print()
window.print();
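
As a quick sanity check, here is a minimal sketch that runs the pattern over each of the strings listed above. The names AUTO_PRINT, stripAutoPrint and samples are only illustrative and are not part of the original rule.

```javascript
// The pattern from this post, stored once so it can be reused.
const AUTO_PRINT = /((window|self)\.print(\(\))?|print\(\));?/ig;

// Hypothetical helper: returns the source with every match removed.
function stripAutoPrint(source) {
  return source.replace(AUTO_PRINT, "");
}

// The eight forms listed above.
const samples = [
  "print()",
  "print();",
  "self.print",
  "self.print()",
  "self.print();",
  "window.print",
  "window.print()",
  "window.print();",
];

// Each sample should be removed completely, leaving an empty string.
const results = samples.map(stripAutoPrint);
samples.forEach((sample, i) => {
  console.log(JSON.stringify(sample), "->", JSON.stringify(results[i]));
});
```

The pattern is not anchored to word boundaries, so it would also remove the tail of a call such as sprint(). That may or may not matter for your pages.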

Further, the g flag after the closing / means every instance of the print command is replaced, rather than just the first, and the i flag makes the match case-insensitive, so it works regardless of how the command is capitalised.
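
To make the effect of the two flags concrete, here is a small sketch that applies the same pattern with no flags, with g only, and with both. The input string and variable names are made up for illustration.

```javascript
// Illustrative input: two lowercase calls and one oddly capitalised call.
const source = "window.print(); init(); self.print(); Window.Print();";

// No flags: only the first lowercase match is removed.
const firstOnly = source.replace(/((window|self)\.print(\(\))?|print\(\));?/, "");

// g only: every lowercase match is removed, the capitalised one survives.
const allLowercase = source.replace(/((window|self)\.print(\(\))?|print\(\));?/g, "");

// g and i together: every match is removed regardless of case.
const allAnyCase = source.replace(/((window|self)\.print(\(\))?|print\(\));?/ig, "");

console.log(JSON.stringify(firstOnly));
console.log(JSON.stringify(allLowercase));
console.log(JSON.stringify(allAnyCase));
```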

I’ve also noticed a separate issue that I’m still not sure how to resolve. If the rule is set to run on all content-types, it blindly strips every match, whether or not it is actually operational JavaScript (e.g. the ‘all the following’ list above). If it is set to run only on the JavaScript content-type, it misses any instance that doesn’t appear in a JavaScript file, such as a call embedded in an HTML page.

I suppose an even more complicated variation could be made that checks for things like javascript: or <script>…</script>, but it really begins to spiral out of control. If anybody has a solution for this problem, I’d love to know.
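
For what it’s worth, here is a rough sketch of what that more complicated variation might look like. It assumes the page is available as a single string and only handles two cases: the bodies of <script>…</script> blocks and double-quoted javascript: attribute values. The function name stripAutoPrintInHtml is hypothetical, and matching HTML with regular expressions is fragile, so treat this as an illustration of the idea rather than a solution.

```javascript
// The pattern from this post.
const AUTO_PRINT = /((window|self)\.print(\(\))?|print\(\));?/ig;

// Hypothetical helper: strips print calls only where they are likely to be
// executable JavaScript, and leaves ordinary page text alone.
function stripAutoPrintInHtml(html) {
  // Case 1: the body of each <script>...</script> block.
  const withoutScriptPrints = html.replace(
    /(<script\b[^>]*>)([\s\S]*?)(<\/script>)/ig,
    (match, open, body, close) => open + body.replace(AUTO_PRINT, "") + close
  );

  // Case 2: double-quoted attribute values that start with javascript:
  return withoutScriptPrints.replace(
    /("javascript:)([^"]*)(")/ig,
    (match, open, body, close) => open + body.replace(AUTO_PRINT, "") + close
  );
}

// Illustrative page: one mention in prose, one script, one javascript: link.
const page =
  '<p>window.print()</p>' +
  '<script>window.print();</script>' +
  '<a href="javascript:window.print()">x</a>';

const cleaned = stripAutoPrintInHtml(page);
console.log(cleaned);
```

This still misses event-handler attributes such as onload, single-quoted attributes, and scripts that contain the text </script> inside a string, which is exactly the spiralling I mean.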

written 27 November, 02009