WAY2WEB: Web Design & business...
Link Extractor
When building certain web applications, you need to search through a page's entire source code to find all of its outgoing links. This can be a tricky task to wrap your head around: where do you even start?
Well, it isn't so hard, but to help you along, I've written the following PHP script. It collects whatever sits between href=" and the next double quote, so it only picks up links whose href value is wrapped in double quotes. Take a look and see what you think:
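<?php
// Return every substring of $s that sits between the delimiters $s1 and $s2.
// Delimiters are matched case-insensitively; matches keep their original case.
function hyperlinkextract($s1,$s2,$s){
    $myarray=array();
    $s1=strtolower($s1);
    $s2=strtolower($s2);
    $L1=strlen($s1);
    $L2=strlen($s2);
    // An empty delimiter can match at position 0 forever, so bail out early
    if($L1===0 || $L2===0){
        return $myarray;
    }
    $scheck=strtolower($s); // lowercase copy, used only for searching
    $pos2=false;
    do{
        $pos1 = strpos($scheck,$s1);
        if($pos1!==false){
            // Look for the closing delimiter after the opening one
            $pos2 = strpos(substr($scheck,$pos1+$L1),$s2);
            if($pos2!==false){
                $myarray[]=substr($s,$pos1+$L1,$pos2);
                // Drop everything up to and including this match, then search again
                $s=substr($s,$pos1+$L1+$pos2+$L2);
                $scheck=strtolower($s);
            }
        }
    } while (($pos1!==false)and($pos2!==false));
    return $myarray;
}
// Fetching a URL this way needs allow_url_fopen to be enabled
$content = file_get_contents('http://www.way2web.net/');
if($content===false){
    die('Could not fetch the page.');
}
$myarray = hyperlinkextract("href=\"","\"",$content);
// Process all the links
foreach($myarray as $val) {
    // Escape each link so it is shown as text rather than parsed as HTML
    echo "<br />".htmlspecialchars($val)."\n";
}
?>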
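If you run into pages that use single quotes or no quotes around their href values, the string search above will miss those links. One possible alternative is to let PHP's bundled DOM extension parse the page for you. The sketch below is only that, a sketch: the function name extract_links_dom and the sample markup are made up for illustration, it assumes the DOM extension is enabled, and it only looks at <a> tags, whereas the script above picks up href=" on any tag.

```php
<?php
// Hypothetical helper, not part of the original script: collects href values
// from <a> tags using PHP's DOM extension instead of string searching.
function extract_links_dom(string $html): array
{
    $links = array();
    // loadHTML() rejects an empty string, so return early in that case
    if ($html === '') {
        return $links;
    }
    $doc = new DOMDocument();
    // Real-world markup is often invalid; keep parser warnings out of the output
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser lowercases tag and attribute names, so 'a' also matches <A HREF=...>
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Made-up sample markup: double quotes, single quotes, and an anchor without href
$sample = '<p><a href="http://example.com/">One</a> <A HREF=\'/two\'>Two</A> <a name="x">No link</a></p>';
print_r(extract_links_dom($sample));
```

For the sample string, this should list http://example.com/ and /two, and skip the anchor that has no href. The trade-off is that you lean on an extension rather than a handful of string functions, so the original function is still handy when you want something small with no dependencies.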
Feel free to use this for whatever you want. A link back to this site would be nice, but I won't make it a requirement. Just enjoy this little function.
