Google Index Checker

Get the Google Index status of a list of URLs using Google Sheets and Apify. ~1,500 requests per month, for free. 🚀

Check out the GIF to see it in action 🎥


How to use the Google Sheets template

1. Sign up to Apify

Apify offers an easy-to-use platform for anyone to build reliable web scrapers. You don’t need to be a developer – it’s super simple to get started!

I’ve chosen Apify as the running costs are pretty cheap and you get $5 free every month. This will give you around 1,500 indexing checks every month FOR FREE.

First, visit the Google Search Scraper actor on Apify and click Try for free.

You’ll be greeted with a registration form where you can sign up using your Google account, GitHub account, or an email address.


2. Get your API Token

Once you’re logged in, you can get your API token by visiting Settings → Integrations.

You need an Apify API token to run the scraper from Google Sheets. The API token allows Google Sheets to communicate with Apify.

Copy the API token.


3. Add your API token to the template

Copy the template if you haven’t already done so, and then add your API token to cell D14 in the ‘Start Here’ sheet.

This is the only thing you need to do in the sheet to get started.


4. Run the script by heading to the ‘Index Checker’ menu and selecting ‘Check URLs’

To run the script you’ll need to make sure:

1. You have three headings: URLs (A1), Index Check (B1) and Status (C1)

2. You add all of your URLs in Column A, beneath the URLs heading.

See the GIF at the beginning of the guide for more detail.

The script checks the index status of URLs by querying Google with the site: operator, e.g. site:example.com/subfolder/page
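The query-building step can be sketched as a small pure function (the helper name is illustrative, not from the script itself):

```javascript
// Turn a list of URLs into site: queries, skipping blank rows,
// mirroring how the script maps spreadsheet cells to Google queries.
function buildSiteQueries(urls) {
  return urls
    .filter(url => url && url.trim() !== '')
    .map(url => `site:${url.trim()}`);
}

console.log(buildSiteQueries(['example.com/page', '', 'example.com/blog/']));
// ['site:example.com/page', 'site:example.com/blog/']
```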

It’s certainly not perfect. This is how Google describes it:

The site: operator doesn’t necessarily return all the URLs that are indexed under the prefix specified in the query. Keep this in mind if you want to use the site: operator for tasks like identifying how many URLs are indexed and serving under a prefix.

Source: https://developers.google.com/search/docs/advanced/debug/search-operators/all-search-site

If you want to determine index status accurately, you’d need to inspect each URL manually using the URL Inspection Tool in Google Search Console (but you know this already, right? 😉)
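If you'd rather automate that check, Google also exposes it through the Search Console URL Inspection API (searchconsole.googleapis.com/v1/urlInspection/index:inspect). The request and response field names below are taken from that API as I understand it; treat this as a hedged sketch and verify against Google's documentation before relying on it:

```javascript
// Build the JSON body the URL Inspection API expects.
// siteUrl is your Search Console property (e.g. 'sc-domain:example.com').
function buildInspectionRequest(inspectionUrl, siteUrl) {
  return { inspectionUrl: inspectionUrl, siteUrl: siteUrl };
}

// Pull a simple Indexed / Not indexed verdict out of a response object.
function parseInspectionVerdict(response) {
  const result = response.inspectionResult &&
                 response.inspectionResult.indexStatusResult;
  if (!result) return 'Unknown';
  return result.verdict === 'PASS' ? 'Indexed' : 'Not indexed';
}
```

You'd still need OAuth credentials and a verified Search Console property to call the endpoint, which is why the site: operator approach is the quicker option for bulk checks.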


Add the script to your own templates

Want to copy the script to your own templates? Here you go.

Couple of pointers:

1. The batch size is set to 10 by default (the number of URLs Apify scrapes at once). Apify can likely handle more than this in one batch (and it might even be cheaper); you can change this in the script where I’ve commented.

2. The API token is read from cell D14 of the ‘Start Here’ sheet. If you want to change this, edit the getConfig function.
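The batching described in the first pointer is just slicing the URL list into fixed-size chunks. As a standalone sketch (the helper name is illustrative; the real loop lives in checkAllUrls below):

```javascript
// Split an array into batches of a given size, as checkAllUrls does
// with its slice-based loop.
function toBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

console.log(toBatches([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```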

// Add custom menu on open
function onOpen() {
  const ui = SpreadsheetApp.getUi();
  ui.createMenu('Index Checker')
    .addItem('Check URLs', 'indexchecker')
    .addToUi();
}

// Retrieve API token from 'Start Here' sheet within the template
function getConfig() {
  const configSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Start Here");
  const API_TOKEN = configSheet.getRange("D14").getValue();
  if (API_TOKEN === '') {
    throw new Error('You need to enter your Apify API Token in the "Start Here" Sheet');
  }
  return { API_TOKEN };
}

// Get the API token and build the API URL. Top-level code runs on
// every invocation of the script, so wrap getConfig() in a try/catch;
// otherwise a missing token would throw on open and the
// 'Index Checker' menu would never appear.
let API_URL;
try {
  const config = getConfig();
  API_URL = "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?token=" + config.API_TOKEN + "&format=json";
} catch (e) {
  // Token missing; the error will surface when a check is run.
}

// Main function to initiate the index check
function indexchecker() {
  const ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  const indexRange = ss.getRange('B:B').getValues();
  const indexLastRow = getLastRowCol(indexRange) + 1;
  const lastRowDiff = ss.getLastRow() - indexLastRow + 1;
  const urlRange = ss.getRange(indexLastRow, 1, lastRowDiff).getValues();
  checkAllUrls(ss, urlRange, indexLastRow);
}

// Get the last row of a single column
function getLastRowCol(range) {
  let rowNum = 0;
  let blank = false;
  for (let row = 0; row < range.length; row++) {
    if (range[row][0] === "" && !blank) {
      rowNum = row;
      blank = true;
    } else if (range[row][0] !== "") {
      blank = false;
    }
  }
  return rowNum;
}

// Check all URLs for their index status in batches
function checkAllUrls(ss, urlRange, indexLastRow) {

  // Batch size currently set to 10. Try to increase this if you want to test more URLs at once.
  const batchSize = 10;

  for (let i = 0; i < urlRange.length; i += batchSize) {
    let batchUrls = urlRange.slice(i, i + batchSize).map(row => row[0] ? `site:${row[0]}` : '');

    // Filter out empty URLs before making the API call
    const nonEmptyUrls = batchUrls.filter(url => url !== '');

    updateStatus(ss, "Loading...");

    let batchResults = {};
    if (nonEmptyUrls.length > 0) {
      try {
        batchResults = fetchSerps(nonEmptyUrls);
      } catch (error) {
        updateStatus(ss, "Error occurred while fetching data. Please check your setup.");
        return;
      }
    }

    batchUrls.forEach((url, j) => {
      if (url !== '') {
        const site = url.replace('site:', '');
        const check = batchResults[site];
        ss.getRange(indexLastRow + i + j, 2).setValue(check);
      } else {
        // If the URL is empty, set the status as empty
        ss.getRange(indexLastRow + i + j, 2).setValue('');
      }
    });

    updateStatus(ss, ((i + batchSize) <= urlRange.length ? (i + batchSize) : urlRange.length) + '/' + urlRange.length + ' URLs checked');
  }
}

// Update status on spreadsheet
function updateStatus(ss, status) {
  ss.getRange(1, 3).setValue(status);
  SpreadsheetApp.flush();
}

// Fetch SERPs from Apify API
function fetchSerps(queries) {
  const inputPayload = {
    "queries": queries.join("\n"),
    "mobileResults": false,
    "csvFriendlyOutput": false,
    "languageCode": "",
    "maxConcurrency": 10,
    "saveHtml": false,
    "saveHtmlToKeyValueStore": false,
    "includeUnfilteredResults": false
  };
  const options = {
    'method': 'POST',
    'contentType': 'application/json',  // sets the Content-Type header
    'payload': JSON.stringify(inputPayload)
  };

  let response;
  try {
    response = JSON.parse(UrlFetchApp.fetch(API_URL, options).getContentText());
  } catch (error) {
    throw new Error('API call failed');
  }

  const results = {};
  response.forEach(result => {
    const site = result.searchQuery.term.replace('site:', '');
    results[site] = result.resultsTotal === 0 ? 'Not indexed' : 'Indexed';
  });

  return results;
}
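Stripped of the spreadsheet plumbing, the verdict logic in fetchSerps boils down to a check on Apify's resultsTotal field. A pure version (helper name is illustrative) looks like this:

```javascript
// Map Apify search result items to an { url: verdict } object, as
// fetchSerps does with the run-sync dataset items.
function toVerdicts(items) {
  const verdicts = {};
  items.forEach(item => {
    const site = item.searchQuery.term.replace('site:', '');
    verdicts[site] = item.resultsTotal === 0 ? 'Not indexed' : 'Indexed';
  });
  return verdicts;
}

console.log(toVerdicts([
  { searchQuery: { term: 'site:example.com' }, resultsTotal: 3 },
  { searchQuery: { term: 'site:example.com/gone' }, resultsTotal: 0 }
]));
// { 'example.com': 'Indexed', 'example.com/gone': 'Not indexed' }
```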

Thanks for stopping by 👋

I’m Andrew Charlton, the Google Sheets nerd behind Keywords in Sheets. 🤓

Questions? Get in touch with me on social or comment below 👇



2 Responses on this post

  1. Andrew, amazing solution. I have just one question: does this work well with Google’s CAPTCHA? I’ve tried running an index checker manually from Google Sheets and it almost always reported an error because of the CAPTCHA.

    1. No issue with the CAPTCHA 🙂 The API works fine, likely because it uses rotating proxies and CAPTCHA-handling solutions.