2008-02-11

Sed: add a space into a regular expression

I am writing a script that compresses web files in order to reduce the size of a site and the load on the server. After stripping out leading and trailing spaces I remove newlines, and this causes a problem with PHP files, since I get something like this:


<?php...

where ... represents some PHP code. This is not valid. The following is needed - note the space before ...:


<?php ...
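A minimal sketch of how the problem arises, assuming the stripping is done with sed and the newline removal with tr (the real script is not shown here, so both calls and the function name squash are stand-ins):

```shell
# Hypothetical reproduction; sed and tr stand in for the real compression script.
squash() {
  # Strip leading and trailing whitespace on each line, then join all lines.
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | tr -d '\n'
}

printf '%s\n' '<?php' '  echo "hi";' '?>' | squash
# prints: <?phpecho "hi";?>
```

The opening tag and the first statement end up glued together, which is exactly the invalid form shown above.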

I am writing the script in Bash, using sed for string conversions. I am sure Perl, Python or another scripting language would be a better choice, but I am not familiar with them; even with sed I have only scratched the surface. I did not know how to insert a space into a regex with sed, so I came up with the following workaround, which first tags each match with a placeholder string and then replaces the placeholder together with the tag:



sed \
-e "s/<?php[^ ]/jabudabu&/g" \
-e "s/jabudabu<?php/<?php\ /g" \
-e "s/<?[^(php) ]/jabudabu&/g" \
-e "s/jabudabu<?/<?\ /g"
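The placeholder trick can probably be avoided with a capture group: \( ... \) remembers the character that follows the tag and \1 puts it back after the inserted space. The sketch below is one possible variant, not a drop-in replacement; the function name add_php_space is made up for illustration. Note that [^(php) ] in the command above is a bracket expression, so it excludes the single characters (, p, h, ) and space rather than the word php. The sketch excludes only p, = and space, on the assumption that the short echo tag <?= should stay untouched.

```shell
# Hypothetical helper: insert a space after a PHP opening tag when it is missing.
add_php_space() {
  # In basic regular expressions "?" is a literal character, so "<?php" matches as written.
  # 1st expression: "<?php" directly followed by a non-space character.
  # 2nd expression: short tag "<?" followed by anything except "p", "=" or a space.
  sed \
    -e 's/<?php\([^ ]\)/<?php \1/g' \
    -e 's/<?\([^p= ]\)/<? \1/g'
}

printf '%s\n' '<?phpecho "hi"; ?>' | add_php_space
# prints: <?php echo "hi"; ?>
```

Like the original, this leaves a short tag followed by a word starting with p (for example <?print) unchanged, and it would also add a space inside an <?xml declaration, so it is only suitable for files where those cases do not occur.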

3 comments:

arun kumar said...

Yes, it works, I think, because I used it in JavaScript and wrote an expression like ^[a-z ]+[a-z]$
It accepts lowercase letters only and allows spaces between them.

Asad Naeem said...

I want to check for spaces in a URL too.

Asad Naeem said...

Javascript:
var reg = /^([a-zA-Z0-9_$*!'\-\.\']+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4})(\]?)$/;

C#:
string reg = @"^([a-zA-Z0-9_$*!'\-\.\']+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4})(\]?)$";