
Js - Regex For Finding Urls In Body Text Not Working

I am trying to implement regex that I found here. I would like to find any http, https or web a tags and then just add target='blank' to them. So, the code looks like this: const u

Solution 1:

When dealing with HTML, try to avoid parsing it as a string. You can try something like this:

Logic:

  • Create a dummy element to work on. This is an in-memory element and will not be rendered.
  • Set the HTML string as its innerHTML.
  • Fetch every element that can carry a URL, such as a or img.
  • Loop over this list and test the relevant attribute (href or src) against the regex.
  • If it matches, add the target attribute.

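function getUpdatedHTMLString(htmlString) {
  // No g flag: a global regex keeps lastIndex between test() calls,
  // which makes repeated tests in a loop unreliable.
  var urlRegex = /(((https?:\/\/)|(www\.))[^\s]+)/;
  // In-memory container; it is never attached to the document.
  var dummy = document.createElement('div');
  dummy.innerHTML = htmlString;

  var list = dummy.querySelectorAll('a, img');
  for (var i = 0; i < list.length; i++) {
    // getAttribute returns null when the attribute is missing.
    var href = list[i].getAttribute('href') || '';
    var src = list[i].getAttribute('src') || '';
    if (urlRegex.test(src) || urlRegex.test(href)) {
      list[i].setAttribute('target', '_blank');
    }
  }

  return dummy.innerHTML;
}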

var str = "<p>" +
"<a href='www.norden.org'>Nordens</a>" +
"</p>" +
"<p>" +
"<figure>" +
"<img src='http://tornado-node.net/wp-content/uploads/2017/08/Ove-Hansen.jpg' alt=' Styreleder Ove Hansen. Foto: Arne Walderhaug' />" +
"<figcaption>Ove Hansen, styreleder i Norden</figcaption>" +
"</figure>" +
"</p>" +
"<p>" +
"<a href='http://norden.org/documents.html'>norden.org</a>" +
"</p>";

console.log(getUpdatedHTMLString(str));
<p><a href='www.norden.org'>Nordens</a></p><p><figure><img src='http://tornado-node.net/wp-content/uploads/2017/08/Ove-Hansen.jpg' alt=' Styreleder Ove Hansen. Foto: Arne Walderhaug' /><figcaption>Ove Hansen, styreleder i Norden</figcaption></figure></p><p><a href='http://norden.org/documents.html'>norden.org</a></p>
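
One thing to watch when a single regex is reused with test() inside a loop: if the pattern carries the g flag, the regex object remembers lastIndex from the previous successful match, so the next test() starts partway into the new string and may miss a URL at its start. The sketch below is only an illustration and needs no DOM. It reuses the pattern from the answer and two of the sample URLs; the variable names (globalRegex, plainRegex, withGlobal, withoutGlobal) are made up for the example.

```javascript
// Same pattern as in the answer, once with and once without the g flag.
const globalRegex = /(((https?:\/\/)|(www\.))[^\s]+)/g;
const plainRegex = /(((https?:\/\/)|(www\.))[^\s]+)/;

// Two of the URLs from the sample HTML string.
const urls = ['www.norden.org', 'http://norden.org/documents.html'];

// With g, test() resumes from the lastIndex left by the previous match,
// so the second URL is searched from the middle and no match is found.
const withGlobal = urls.map((u) => globalRegex.test(u));

// Without g, every test() starts at index 0.
const withoutGlobal = urls.map((u) => plainRegex.test(u));

console.log(withGlobal);    // [ true, false ]
console.log(withoutGlobal); // [ true, true ]
```

If the g flag has to stay for some other reason, resetting the regex's lastIndex to 0 before each test() call should avoid the same problem.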
