[Tfug] finding linked pages

Robert Hunter hunter at tfug.org
Sat May 3 12:17:39 MST 2008


On Thu, May 01, 2008 at 06:57:36PM -0700, christopher wrote:
> Well, I think I'll just have to go through and do it
> manually. I thought there might be something that did
> like a web crawler would, and just keep a list of all
> linked files. I'm mostly done now any way, but I just
> thought it would be a good thing to know the next time
> around. Thanks everyone ~ Chris

Chris, I can understand the need to "get-r-done", but IMO you'll
sleep better at night in the long run if you don't have to take such
drastic measures. ;-) I took Chris Hill's suggestion and started
reading about Apache's mod_rewrite.  It's not exactly intuitive at
first, but the possibilities are very cool.  Here's what I came up
with on short notice: an .htaccess file (per-directory rules, but you
can extend them to a per-server context) and a small PHP script.  The
rewrite rules hand every request for a file ending in "html" to the
script, which inserts a banner image at the top of the page body
before sending the page out.  I'm no Apache or PHP guru, BTW, so this
is sure to be buggy in some way or other.

Hope it helps...

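<file name=".htaccess">
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Don't rewrite requests for the script itself (avoids a loop)
RewriteCond %{REQUEST_FILENAME} !display\.php
# Only rewrite requests for files ending in "html"
RewriteCond %{REQUEST_FILENAME} html$
# Pass the requested file's path to the script as "r"
RewriteRule ^(.*) /~rhunter/display.php?r=%{REQUEST_FILENAME} [L]
</IfModule>
</file>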
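
To get a feel for which requests the two conditions let through, here
is a rough PHP sketch that mimics them outside Apache.  The function
name would_rewrite() and the sample paths are made up for
illustration, and this only approximates the matching; it is not how
mod_rewrite itself is implemented.

```php
<?php
// Hypothetical helper: mimics the two rewrite conditions from the
// .htaccess file. Illustration only, not part of mod_rewrite.
function would_rewrite( $filename ) {
    // Condition 1: skip anything that contains "display.php"
    // (the dot is escaped here so it matches a literal ".").
    if ( preg_match( '/display\.php/', $filename ) ) {
        return false;
    }
    // Condition 2: only paths ending in "html" qualify.
    return preg_match( '/html$/', $filename ) === 1;
}

// Made-up sample paths.
var_dump( would_rewrite( '/home/rhunter/public_html/index.html' ) );  // bool(true)
var_dump( would_rewrite( '/home/rhunter/public_html/display.php' ) ); // bool(false)
var_dump( would_rewrite( '/home/rhunter/public_html/logo.png' ) );    // bool(false)
```

Note that "html$" is not anchored to a dot, so a file named, say,
"nothtml" would also qualify; a pattern like "\.html$" would be
stricter.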

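<file name="display.php">
<?php
// Prepends a banner image to the <body> of the HTML file that the
// rewrite rule passes in as "r".

// "r" arrives in the query string, so check it before opening anything.
// Assumption: the pages live in or below display.php's own directory.
$req  = isset( $_GET["r"] ) ? $_GET["r"] : '';
$path = ( $req !== '' ) ? realpath( $req ) : false;
$base = dirname( __FILE__ ) . '/';
if ( $path === false || strpos( $path, $base ) !== 0 || !preg_match( '/html$/', $path ) ) {
    header( 'HTTP/1.0 404 Not Found' );
    exit;
}

$doc = new DOMDocument();
$doc->loadHTMLFile( $path );

$body  = $doc->getElementsByTagName('body')->item(0);
$first = $body->firstChild;
$img   = $doc->createElement('img');
$img->setAttribute( 'src', '/path/to/your/banner.png' );
$img->setAttribute( 'align', 'middle' );
$img->setAttribute( 'alt', 'banner.png' );
$body->insertBefore( $img, $first );

echo $doc->saveHTML();

?>
</file>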
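
If you want to try the DOM part without touching Apache, something
like the following should work from the command line.  It is only a
sketch: add_banner() is a made-up name, the sample page is invented,
and it works on an in-memory string via loadHTML() rather than on a
file, so it assumes PHP's DOM extension is available.

```php
<?php
// Hypothetical helper: the same DOM steps as display.php, but on an
// in-memory string so it can be run without a web server.
function add_banner( $html, $src ) {
    $doc = new DOMDocument();
    // Keep libxml's complaints about sloppy markup out of the output.
    libxml_use_internal_errors( true );
    $doc->loadHTML( $html );
    libxml_clear_errors();

    $body = $doc->getElementsByTagName( 'body' )->item( 0 );
    if ( $body === null ) {
        return $html; // no <body> to attach to; return the input unchanged
    }

    $img = $doc->createElement( 'img' );
    $img->setAttribute( 'src', $src );
    $img->setAttribute( 'alt', basename( $src ) );
    // With a null reference node, insertBefore() appends, so an
    // empty <body> is handled as well.
    $body->insertBefore( $img, $body->firstChild );

    return $doc->saveHTML();
}

// Made-up sample page.
$page = '<html><head><title>t</title></head><body><p>Hello</p></body></html>';
echo add_banner( $page, '/path/to/your/banner.png' );
```

The printed page should have the img element as the first child of
body, ahead of the paragraph.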

--Rob



