History for Using Regular Expressions to Parse HTML Part 5 (history as of 04/25/2007 18:49:58)

In Part 1 we had a quick look at what Perl and regular expressions
are, and introduced the idea of using them to process HTML files. In
Part 2 we developed a Perl script to process a single HTML file. In
Part 3 we looked at one way of processing multiple files. In Part 4 we
looked at how to read in all the files in the current directory. In
this, the last part, we'll look at how to read in specific files in
specific directories.

In Part 4 we wrote a script that read in all the files in the current
directory. Sometimes, however, you need to process files that are
located in different directories. The listing script4.pl below shows a
script that does this.

Note:
Due to display considerations, in the example code shown in this
article, square brackets '[..]' are used in HTML/script tags instead of
angle brackets '<..>'.

script4.pl

1  @allfiles=glob("file1.htm directory1/subdirectory1/*.shtm directory2/*.htm");
2  foreach $file (@allfiles) {               # process each matching file in turn
3      rename $file, "$file.bak";            # keep the original as a backup
4      open (IN, "<$file.bak");              # read from the backup
5      open (OUT, ">$file");                 # write to the original name
6      while ($line = [IN]) {                # read one line at a time
7          $line =~ s/[h1]/[h1 class="big"]/;
8          print OUT $line;
9      }
10     close IN;
11     close OUT;
12 }

The only new line here is line 1, which uses the glob function to build
the list of files to process. glob splits its argument on whitespace
and expands each pattern in turn: first file1.htm in the current
directory, then all files ending in .shtm in directory1/subdirectory1,
and then all files ending in .htm in directory2. The asterisk (*) is a
wildcard that matches any sequence of characters in a filename.
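To see what glob returns before letting a script rename anything, a small check along the following lines may help. This is only a sketch: the helper name matching_files is hypothetical and is not part of script4.pl. It relies on the fact that a pattern without wildcards (such as file1.htm) is normally handed back unchanged even when no such file exists, so the names are filtered with the -f file test.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of script4.pl): expand a whitespace-separated
# list of glob patterns and keep only the names of existing plain files.
sub matching_files {
    my ($patterns) = @_;
    # glob splits its argument on whitespace and expands each pattern in turn.
    # A pattern with no wildcard is returned as-is even if the file is missing,
    # so -f is used to drop names that do not refer to a real file.
    return grep { -f $_ } glob($patterns);
}

# Same patterns as line 1 of script4.pl.
my @allfiles = matching_files(
    "file1.htm directory1/subdirectory1/*.shtm directory2/*.htm"
);
print "$_\n" for @allfiles;
```

If nothing is printed, none of the patterns matched an existing file when the script was run from the current directory.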

Running the script

c:>perl script4.pl
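script4.pl does not check whether rename or open succeeded, so a missing file or a read-only directory goes unnoticed. The following is one possible, more defensive variant, offered as a sketch rather than as the author's script. It uses real angle brackets instead of the square-bracket display convention, lexical filehandles, and die on failure. The sub names add_big_class and process_file are hypothetical, and taking the patterns from the command line is an assumption made for illustration.

```perl
use strict;
use warnings;

# Add class="big" to the first <h1> tag on a line. This is the substitution
# from line 7 of script4.pl, written with real angle brackets.
sub add_big_class {
    my ($line) = @_;
    $line =~ s/<h1>/<h1 class="big">/;
    return $line;
}

# Back up $file as "$file.bak", then rewrite $file from the backup.
sub process_file {
    my ($file) = @_;
    rename $file, "$file.bak"       or die "Cannot rename $file: $!";
    open my $in,  '<', "$file.bak"  or die "Cannot read $file.bak: $!";
    open my $out, '>', $file        or die "Cannot write $file: $!";
    while (my $line = <$in>) {
        print {$out} add_big_class($line);
    }
    close $in;
    close $out or die "Cannot close $file: $!";
}

# Assumption: patterns are given as command-line arguments instead of being
# hard-coded. Each argument is expanded with glob, and only existing plain
# files are processed.
for my $file (grep { -f $_ } map { glob($_) } @ARGV) {
    process_file($file);
}
```

Saved under a hypothetical name such as script5.pl, it could be run as:

c:>perl script5.pl file1.htm directory1/subdirectory1/*.shtm directory2/*.htm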




About the Author: John Dixon is a web developer and technical author.
These days, John spends most of his time developing dynamic
database-driven websites using PHP and MySQL.


Go to http://www.computernostalgia.net to view one of John's sites. This site contains articles and photos relating to the history of the computer.


To find out more about John's work, go to http://www.dixondevelopment.co.uk.

Article Source: http://EzineArticles.com/?expert=John_Dixon

  


Copyright 2007 by Perl Pages Forum   Terms Of Use  Privacy Statement