javascript - How to get links from the href property using regex
I have a regex expression that returns the links from an HTML file, but it has a problem: instead of returning just the link, http://link.com, it returns it with href=" in front (href="http://link.com). How can I get the links without the leading href="?
This is my regex:
/href="(http|https|ftp|ftps)\:\/\/[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(?:\/(?:[^"<=]|=)*)?/g
Full code:
var source = (body || '').toString();
var urlarray = [];
var url;
var matcharray;

// Regular expression to find FTP and HTTP(S) URLs.
var regextoken = /href="(http|https|ftp|ftps)\:\/\/[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(?:\/(?:[^"<=]|=)*)?/g;

// Iterate through the URLs in the text.
while ((matcharray = regextoken.exec(source)) !== null) {
    var token = matcharray[0];
    token = JSON.stringify(matcharray[0]);
    token = matcharray[0].toString();
    urlarray.push([ token ]);
}
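RegExp#exec stores the contents captured by the capturing groups defined in the pattern. Index [0] of the result is the whole match, and you can access group 1 with the [1] index.
Use
var token = matcharray[1];
Note that in the pattern as posted, group 1 is (http|https|ftp|ftps), so [1] gives only the scheme. For [1] to hold the whole link, the capturing parentheses have to enclose the entire URL part of the pattern.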
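A minimal sketch of that fix, assuming the pattern is adjusted so that group 1 wraps the whole URL and the scheme alternation becomes non-capturing. The function name `extractLinks` and the sample markup are made up for illustration and are not part of the original post:

```javascript
// Hypothetical helper: returns the URL captured by group 1 for every href match.
function extractLinks(body) {
  var source = (body || '').toString();
  var urlarray = [];
  var matcharray;

  // Same pattern as in the question, except that the outer parentheses now
  // capture the full URL and the scheme group is non-capturing (?:...).
  var regextoken = /href="((?:http|https|ftp|ftps)\:\/\/[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(?:\/(?:[^"<=]|=)*)?)/g;

  while ((matcharray = regextoken.exec(source)) !== null) {
    // matcharray[0] is the full match (including href="),
    // matcharray[1] is only the captured URL.
    urlarray.push(matcharray[1]);
  }
  return urlarray;
}

var sample = '<a href="http://link.com">a</a> <a href="ftp://files.example.org/pub/x.zip">b</a>';
console.log(extractLinks(sample));
// [ 'http://link.com', 'ftp://files.example.org/pub/x.zip' ]
```

Because the regex carries the `g` flag, each call to `exec` resumes from `lastIndex`, which is what lets the `while` loop walk through every match in turn.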
Also, I believe you can shorten the regex to
/\bhref="((?:http|ftp)[^"]+)"/g
if you are sure the values are always inside double quotes. See this demo.
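As a sketch of how the shortened pattern might be used, assuming an environment that supports `String.prototype.matchAll` (ES2020) and attribute values that are always double-quoted. The names `hrefPattern`, `html`, and `links` and the sample markup are illustrative only:

```javascript
// Shortened pattern from the answer: group 1 is everything between the quotes,
// as long as the value starts with "http" or "ftp".
const hrefPattern = /\bhref="((?:http|ftp)[^"]+)"/g;

const html =
  '<a href="http://link.com">one</a> ' +
  '<a href="/relative">two</a> ' +
  '<a href="ftps://host.example/file">three</a>';

// matchAll yields one match object per href; index 1 holds the captured URL.
const links = Array.from(html.matchAll(hrefPattern), (m) => m[1]);
console.log(links);
// [ 'http://link.com', 'ftps://host.example/file' ]
```

This version is looser than the pattern in the question: it accepts any double-quoted value that begins with `http` or `ftp` and does not check the host or top-level domain, so it only suits input where that is acceptable.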