Check to see if a page links to your URL or mentions your brand.

This script takes a page to crawl, a URL to look for and, optionally, one or more keywords, then reports whether the link and the keywords appear on that page. Perfect for reviewing unlinked brand mentions or for checking whether pages on your site link to a target page. 🚀

Check out the GIF to see it in action 🎥


How to add the Script to Google Sheets

1. Copy the script below:

/**
 * Check to see if a domain, path, and/or mention exists on a page. 
 *
 * @param {"https://example.com/example-page"} host - Input the host URL.
 * @param {"https://test.com/url-to-search"} url - Input the URL you want to check.
 * @param {"brand|anotherMention|..."} mentions - [OPTIONAL] Input strings separated by '|' to check for mentions.
 * @param {"body"} area - [OPTIONAL] Input the area of HTML you want to search. Default is 'body'.
 * @customfunction
 */
function linkChecker(host, url, mentions, area) {
  // Check for required parameters (mentions and area are optional)
  if (!host || !url) {
    return ["Both host and url are required."];
  }

  // Set default area if not provided
  area = area || "body";

  // Split the mentions string, lowercase each mention, and drop empty entries.
  // String() guards against numeric or blank cells being passed in.
  mentions = String(mentions || "")
    .split('|')
    .map(mention => mention.trim().toLowerCase())
    .filter(mention => mention !== "");

  // Normalize the URL to ensure it starts with a protocol (default to HTTPS)
  const normalizeUrl = (url) => {
    if (!url.match(/^https?:\/\//i)) {
      url = 'https://' + url; 
    }
    return url;
  };

  // Remove the protocol from a URL
  const stripProtocol = (url) => {
    return url.replace(/^https?:\/\//i, '');
  };

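  // Construct an absolute URL from a base URL and an href
  const absoluteUrl = (base, relative) => {
    if (!relative) return "";
    // Already absolute (https:, mailto:, tel:, ...): return unchanged
    if (/^[a-z][a-z0-9+.-]*:/i.test(relative)) return relative;
    const origin = (base.match(/^https?:\/\/[^\/?#]+/i) || [base])[0];
    // Protocol-relative (//host/path): borrow the protocol from the base
    if (relative.startsWith("//")) return origin.split("//")[0] + relative;
    // Root-relative (/path): resolve against the origin
    if (relative.startsWith("/")) return origin + relative;
    // Path-relative: resolve against the directory of the base path
    const stack = base.slice(origin.length).split(/[?#]/)[0].split("/");
    stack.pop();
    for (const part of relative.split("/")) {
      if (part === ".") continue;
      if (part === "..") { if (stack.length > 1) stack.pop(); }
      else stack.push(part);
    }
    const path = stack.join("/");
    return origin + (path.startsWith("/") ? "" : "/") + path;
  };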

  host = normalizeUrl(host);
  const strippedTargetUrl = stripProtocol(url);

  try {
    const fetchOptions = {
      'followRedirects': true,
      'muteHttpExceptions': true
    };

    // Fetch the content from the provided host
    const doc = UrlFetchApp.fetch(host, fetchOptions).getContentText();
    
    // Load the fetched content into Cheerio for parsing
    const $ = Cheerio.load(doc);

    // Extract content from the specified area and convert it to lowercase
    const content = $(area).text().toLowerCase();

    let linkFound = false;
    let linkAttributes = "No link";

    // Check for presence of the target link in the selected area
    $(area).find('a').each(function (i, elem) {
      const href = $(this).attr('href');
      if (!href) return; // Skip anchors without an href and keep looping
      const absoluteHref = absoluteUrl(host, href);
      const strippedLinkUrl = stripProtocol(absoluteHref);

      if (strippedLinkUrl.includes(strippedTargetUrl)) {
        linkFound = true;
        const attrs = $(this).attr();
        linkAttributes = Object.keys(attrs).map(key => `${key}: ${attrs[key]}`).join(', ');
        return false; // Exit loop once link is found
      }
    });

    // Check for presence of each mention in the content
    const mentionStatuses = mentions.map(mention => {
      return content.includes(mention) ? "Found: " + mention : "Not found: " + mention;
    });

    return [[linkFound ? "Link found" : "Link not found", ...mentionStatuses, linkAttributes]];

  } catch (e) {
    return ["Website failed"];
  }
}

2. Head over to Google Sheets

Or if you’re really smart, create a new sheet by going to: https://sheets.new

Select Apps Script from the Extensions menu (older versions of Sheets list it as Script editor under the Tools menu).

Paste the script and save it.


3. Add the Cheerio Library in Apps Script

Search for the Cheerio Library using the ID below. The Cheerio Library makes it easier to parse the HTML from the requested page.

1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0

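If you’re not sure how to add a library in Apps Script: click the + next to Libraries in the editor’s left sidebar, paste the ID into the Script ID field, click Look up, then click Add.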
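Once the library is added, a quick check can confirm it loads before you rely on the formula. This is only a sketch: `cheerioSmokeTest` is a hypothetical helper, not part of the script above, and it assumes the library was added under its default identifier, `Cheerio`. The optional `cheerio` parameter exists purely so the function can be tried with a stand-in outside Apps Script.

```javascript
/**
 * Smoke test for the Cheerio library (hypothetical helper).
 * In the Apps Script editor, run cheerioSmokeTest() with no arguments.
 */
function cheerioSmokeTest(cheerio) {
  // Fall back to the library's global identifier, assumed to be "Cheerio"
  const lib = cheerio || Cheerio;

  // Parse a tiny document and read back the link's href and text
  const $ = lib.load('<body><a href="/page" rel="nofollow">Brand</a></body>');
  const link = $('body').find('a');
  return [link.attr('href'), link.text()];
}
```

If the function returns the href and the link text, the library is wired up. A "Cheerio is not defined" error suggests the library was not added or was given a different identifier.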


4. Add the formula to any cell in your sheet

=linkChecker(A1,A2,A3,"body")
  • Replace A1 with any cell that includes your host page (the page you want to crawl).
  • Replace A2 with any cell that includes the link you’d like to check.
  • Replace A3 with any cell that includes the keyword you’d like to check (your brand name, for example). This is optional. You can also include multiple keywords separated with ‘|’, for example keyword1|keyword2.
  • Replace “body”, the default search area, with any part of the page you want to search, for example “section”.
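To make the keyword argument concrete, here is a rough sketch of how a ‘|’-separated string can be turned into one status per keyword. `mentionStatuses` is a hypothetical standalone helper written for illustration; skipping blank entries and trimming spaces are assumptions of this sketch.

```javascript
// Return one "Found"/"Not found" status per keyword in a '|'-separated string
function mentionStatuses(pageText, mentions) {
  // Compare in lowercase so matching is case-insensitive
  const content = String(pageText).toLowerCase();
  return String(mentions || '')
    .split('|')
    .map(m => m.trim().toLowerCase())
    .filter(m => m !== '') // ignore blanks such as "a||b" or an empty cell
    .map(m => (content.includes(m) ? 'Found: ' + m : 'Not found: ' + m));
}

// Example: two keywords checked against a short piece of page text
const statuses = mentionStatuses('Read more on Keywords in Sheets', 'keywords in sheets|acme');
// statuses → ['Found: keywords in sheets', 'Not found: acme']
```

In the sheet, each status lands in its own cell, after the link status and before the link attributes.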

My link checker script will search for a relative path if you input a page rather than a domain. For example, if you input ‘https://example.com/page’, it will search for ‘/page’, since internal links are often relative.

If you input a domain, say ‘https://example.com/’, it will search for ‘example.com/’ so that the match works whichever protocol the referring page uses.
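The protocol-agnostic comparison can be sketched roughly as follows. `comparableUrl` and `linkMatches` are hypothetical names. Ignoring a trailing slash and requiring a path boundary after the target are assumptions of this sketch, and may differ from what the script itself does.

```javascript
// Strip the protocol and any trailing slashes so http/https variants compare equal
function comparableUrl(url) {
  return String(url).trim().replace(/^https?:\/\//i, '').replace(/\/+$/, '');
}

// True when the link points at the target itself or at something beneath it
function linkMatches(linkUrl, targetUrl) {
  const link = comparableUrl(linkUrl);
  const target = comparableUrl(targetUrl);
  if (link === target) return true;
  // Require a boundary character so "example.com" does not match "example.com.au"
  return link.startsWith(target) && /[\/?#]/.test(link.charAt(target.length));
}
```

With this approach, `linkMatches('http://example.com', 'https://example.com/')` is true, while a link to `https://example.com.au/` does not count as a link to `example.com`.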

*Fetching a page can fail when the website uses bot detection.
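One way to tell a blocked request from a genuinely missing link is to look at the HTTP status before parsing. A minimal sketch, assuming a hypothetical helper called `fetchPage`: `UrlFetchApp.fetch`, `getResponseCode` and `getContentText` are the standard Apps Script calls, while the `fetcher` parameter is only there so the function can be tried with a stand-in outside Apps Script.

```javascript
// Fetch a page and report the HTTP status alongside the HTML
function fetchPage(url, fetcher) {
  // Fall back to the Apps Script service when no stand-in is supplied
  const service = fetcher || UrlFetchApp;
  const response = service.fetch(url, { followRedirects: true, muteHttpExceptions: true });
  const status = response.getResponseCode();

  // With muteHttpExceptions on, 4xx/5xx responses do not throw, so check explicitly
  if (status < 200 || status >= 300) {
    return { ok: false, status: status, html: '' };
  }
  return { ok: true, status: status, html: response.getContentText() };
}
```

A 403 or 429 status often points to bot protection or rate limiting rather than a page that simply lacks the link.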


Thanks for stopping by 👋

I’m Andrew Charlton, the Google Sheets nerd behind Keywords in Sheets. 🤓

Questions? Get in touch with me on social or comment below 👇


11 Responses on this post

  1. Thanks for all the scripts Andrew!

    This is great but I’m running into a trailing slash issue…
    If I add the slash to the “URL to check” (e.g. https://example.com/), it doesn’t pick up when a page has the link without it (https://example.com).
    And if I remove it from the URL to check, the script seems to say all pages include the link (which I know they don’t).

    Any help much appreciated and keep up the good work!

    1. Ahh this is a bug I already knew about but thought it would be an edge case. I’ll see if I can fix that for you! 🙂

      1. Okay, I’ve now made a change to the code and template to try and fix this. Could you re-test please? 🙂

  2. Hi Andrew

    Just tried the script out to find potential internal link options.

    Unfortunately, it seems that the script also checks for our menu which is why we end up getting mentions and links on almost all urls.

    1. Hmm yeah that sucks! So if your website is using semantic HTML and your content is contained within a <section> tag, you can change the line: const body = $('body').html().toLowerCase(); to const body = $('section').html().toLowerCase(); and it will only extract the body content. Hope that helps! 🙂
        1. I can see the comment-field has removed some of my comment cause of HTML. I meant to write “P tags” 🙂

        1. Great, I might add an argument into the function to allow you to specify where to look for links 🙂