Unshortening URLs in PHP

I had cause to unshorten a ton of links from Twitter today. There are loads of services that do it for you (and the popular shorteners have APIs), but I couldn’t see any existing PHP libraries to do it. Below are some fairly straightforward PHP functions to unshorten URLs based on the Location: header that’s used in redirects. The “is_short” function is a bit of a hack because I was only interested in a few domains, but it should be straightforward to modify for other tasks.

Requires curl support in PHP.
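
If you aren't sure whether curl is enabled, a quick check along these lines should tell you. This is only a sketch: curlAvailable() is a made-up helper name, not part of the functions in this post.

```php
<?php
// Hypothetical helper: report whether the curl extension is usable.
function curlAvailable()
{
    return extension_loaded('curl') && function_exists('curl_init');
}

if (!curlAvailable()) {
    echo "The curl extension is not loaded; the functions below will not work.\n";
}
```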


/**
 * Get the value of the Location header obtained when dereferencing the given URL.
 * Returns false if there isn't one (or if the request fails).
 */
function getLocation($url)
{
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_HEADER, true);

    $data = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on failure (timeout, DNS error, ...)
    if ($data === false) {
        return false;
    }

    // Headers and body are separated by a blank line (CRLF CRLF)
    $parts = explode("\r\n\r\n", $data, 2);

    $headers = explode("\n", $parts[0]);
    foreach ($headers as $h) {
        // trim() drops the trailing CR; ^ avoids matching Content-Location
        if (preg_match('@^Location:\s*(.*)$@i', trim($h), $match)) {
            return $match[1];
        }
    }

    return false;
}
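
The header-parsing half of getLocation() can be tried out without touching the network. The sketch below assumes a hypothetical helper, extractLocation(), that takes a raw response string of the kind curl returns when CURLOPT_HEADER is on; the sample response is made up for illustration.

```php
<?php
/**
 * Hypothetical helper: pull the Location value out of a raw HTTP response
 * (headers, blank line, optional body). Returns false if there is none.
 */
function extractLocation($response)
{
    // Keep only the header block; the body starts after the first blank line.
    $parts = explode("\r\n\r\n", $response, 2);

    // Split on LF, then trim each line to drop the trailing CR.
    foreach (explode("\n", $parts[0]) as $line) {
        if (preg_match('@^Location:\s*(.+)$@i', trim($line), $match)) {
            return $match[1];
        }
    }

    return false;
}

// Made-up response, for illustration only.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: http://example.com/long/path\r\n"
        . "Content-Length: 0\r\n\r\n";

var_dump(extractLocation($sample));
```

Anchoring the pattern at the start of the line means a Content-Location header is not mistaken for Location, and a "Location:" string that only appears in the body is ignored.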

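// Returns 1 if the URL is on one of the known shortener domains, 0 otherwise.
// Dots are escaped, and the host must end there, so "t.co" won't match "t.company.com".
function is_short($url)
{
    return preg_match('@^https?://(www\.)?(bit\.ly|t\.co|goo\.gl|dlvr\.it|tl\.gd|is\.gd)([/?#]|$)@i', $url);
}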
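
If you want to use this with a different set of shorteners, one option is to pass the domain list in rather than editing the regex. The following is a rough sketch under that assumption; isShortHost() and its $shorteners parameter are hypothetical names, and unlike is_short() it does not check that the scheme is http or https.

```php
<?php
/**
 * Hypothetical variant of is_short() that takes the list of shortener
 * hosts as a parameter instead of hard-coding them in a regex.
 */
function isShortHost($url, array $shorteners)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed URL or no host component
    }

    // Compare case-insensitively and ignore a leading "www.".
    $host = preg_replace('@^www\.@', '', strtolower($host));

    return in_array($host, $shorteners, true);
}

$shorteners = array('bit.ly', 't.co', 'goo.gl', 'dlvr.it', 'tl.gd', 'is.gd');
var_dump(isShortHost('http://bit.ly/abc123', $shorteners));
var_dump(isShortHost('http://t.company.example/page', $shorteners));
```

Comparing the whole host name avoids the partial matches a loosely written regex can produce.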

/**
 * Unshorten a short URL until it isn't short anymore (copes with URLs that have been
 * shortened multiple times, up to $limit).
 *
 * Returns the URL unchanged if it isn't short, or false (by virtue of getLocation())
 * if a short URL doesn't redirect anywhere
 */
function unshorten($inurl, $limit = 5)
{
    $i = 0;
    $url = $inurl;
    while (is_short($url) && $i < $limit) {
        $url = getLocation($url);
        $i++;
    }

    return $url;
}
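
To see how the loop and $limit behave without making real requests, here is a small offline sketch. It assumes a hypothetical unshortenWith() that runs the same loop as unshorten() but takes the "is it short?" check and the lookup as callables; the redirect table is made up.

```php
<?php
/**
 * Hypothetical stand-in for unshorten(): the same loop, but the
 * short-URL check and the lookup are passed in as callables.
 */
function unshortenWith($inurl, $isShort, $resolve, $limit = 5)
{
    $i = 0;
    $url = $inurl;
    while ($url !== false && $isShort($url) && $i < $limit) {
        $url = $resolve($url);
        $i++;
    }

    return $url;
}

// Made-up redirect table standing in for real HTTP requests.
$redirects = array(
    'http://t.co/a'   => 'http://bit.ly/b',
    'http://bit.ly/b' => 'http://example.com/article',
);

$isShort = function ($url) use ($redirects) {
    return isset($redirects[$url]);
};
$resolve = function ($url) use ($redirects) {
    return isset($redirects[$url]) ? $redirects[$url] : false;
};

echo unshortenWith('http://t.co/a', $isShort, $resolve), "\n";
```

With the default limit the doubly shortened link is followed through both hops. Passing a limit of 1 stops after the first hop and returns the intermediate short URL.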