Google has recently started taking action against blog networks in an attempt to remove low-quality websites from its index. It is estimated that several thousand domains have already been removed from Google’s index, and the number is likely to increase further in the coming weeks or months. That means that hundreds of millions of links have been completely devalued, affecting the rankings of several websites, directly or indirectly.
Carrying out a thorough backlinks audit for new clients is extremely important to us because it allows us to:
Get a good understanding of the link profile to their site and the quality of the historical backlinks
Work out the chances of losing some link equity in the foreseeable future
Closely monitor link equity loss on a weekly/monthly basis, react quickly and modify our link strategy if necessary
Forecast more accurately on ranking improvements and traffic growth
First and foremost we need to collect as much backlink data as possible. Exporting data from the following sources would make the data-set quite reliable – the more data, the better.
The Majestic SEO historic index offers invaluable data about a site’s backlinks and should, in almost every case, be the primary source of backlink data.
Download the CSV file containing all Inbound Links.
Under ‘Your site on the web’, go to ‘Links to your site’
Find ‘Who links the most’ on the left-hand side and click ‘More>>’
Download all links by clicking on ‘Download more sample links’
Extract All Unique Linking Root Domains
Drop all the identified backlinks URLs into one spreadsheet and keep the unique root domains only. This can be done by applying the following formula on the current set of URLs:
=LEFT(A2,FIND("/",A2,8)) where A2 is the cell with the original ‘Full-URL’ data. Make sure all URLs include ‘http://’; otherwise use the formula =LEFT(A2,FIND("/",A2,2)).
Note: The above formula works with http URLs but not with https ones, because the extra character in ‘https://’ makes the search start on the second slash of the prefix. A more complete formula written by James Taylor is available here.
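For larger data sets, the same extraction can be sketched outside the spreadsheet. The Python below is a minimal illustration, not the formula referenced above: the function names `root_url` and `unique_roots` are hypothetical, and, like the spreadsheet formula, it keeps the scheme and host rather than working out the true registrable domain.

```python
from urllib.parse import urlsplit


def root_url(url: str) -> str:
    """Return scheme + host with a trailing slash, e.g. 'https://example.com/'."""
    url = url.strip()
    # urlsplit only recognises the host when a scheme is present,
    # so default to http:// for bare entries such as 'example.com/page'.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}/"


def unique_roots(urls):
    """Deduplicate root URLs while keeping first-seen order."""
    seen = {}
    for u in urls:
        seen.setdefault(root_url(u), None)
    return list(seen)
```

Because the scheme is kept, the http and https versions of the same host are treated as two entries, which may or may not be what you want for a given audit.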
Then, all duplicate subdomains should be removed. Highlight the ‘Linking Root Domain’ column and click on ‘Data->Remove Duplicates’. Choose ‘Expand the selection’ and then click on ‘Remove Duplicates…’.
Check only Linking Root Domain column and click ‘Remove Duplicates…’
Eventually, only the unique domains will remain in the spreadsheet. Unfortunately, not all identified linking root domains will still be linking to the client site, because the collected data may not be up to date. At iCrossing we use a proprietary tool to filter out those linking root domains that no longer link to the site. Filtering out all dead links significantly increases the quality of this exercise.
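The proprietary tool is not described here, so the following is only a rough sketch of the underlying idea: given the HTML of a page that was reported as a backlink, check whether it still contains a link to the client’s domain. The function name `links_to` and the assumption that the HTML has already been fetched are illustrative choices, not part of the tool mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to(html: str, client_domain: str) -> bool:
    """True if any <a href> points at client_domain or one of its subdomains."""
    collector = _HrefCollector()
    collector.feed(html)
    client_domain = client_domain.lower()
    for href in collector.hrefs:
        # Relative links have no hostname and can never be backlinks.
        host = (urlsplit(href).hostname or "").lower()
        if host == client_domain or host.endswith("." + client_domain):
            return True
    return False
```

A page for which `links_to(html, "client.com")` returns False would be a candidate for removal from the data set, subject to a manual check.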
Having identified a large set of unique linking root domains we can now proceed and do the following:
PageRank distribution check
Linking root domains indexation check
Social metrics distribution check
With NetPeak Checker, a set of 100 URLs can be checked in approximately one minute. For optimal performance, make sure that only the following metrics are selected:
PR Main (Pagerank of main domain)
PR Page (The PageRank of the given URL e.g. subdomain or deep page)
Google Index (the number of pages indexed by Google, equivalent to a site: query)
Server -> Status Code (returned values include n/f (404), 200, 301, 302, 303 etc.)
Load the unique domains URLs previously identified and hit the ‘Start Check’ button.
Using Excel’s filters it is quite easy to detect which linking root domains may harm rankings by looking for those with the following value characteristics:
Status codes 200, 301 or 302 (live domains)
Google Index value = n/f (no pages found in Google’s index)
PR Main or PR Page with a value of 0 or n/f (if conditions 1 and 2 are met, this makes little difference)
Note: Because Toolbar PageRank is only updated roughly once a quarter, a linking root domain may have been removed from Google’s index even though it still shows a high PageRank value.
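The filter conditions above can also be expressed in a few lines of Python once the export has been loaded as a list of rows. This is a sketch under assumptions: the column names `Status Code`, `Google Index`, `PR Main` and `PR Page` are guesses at the export headers and should be adjusted to match the actual file.

```python
LIVE_STATUSES = {"200", "301", "302"}  # condition 1: the domain is live
NOT_FOUND = "n/f"


def is_suspect(row: dict) -> bool:
    """Conditions 1 and 2: a live domain with no pages in Google's index."""
    live = str(row["Status Code"]).strip() in LIVE_STATUSES
    deindexed = str(row["Google Index"]).strip().lower() == NOT_FOUND
    return live and deindexed


def has_low_pagerank(row: dict) -> bool:
    """Condition 3: PR Main or PR Page is 0 or n/f (a supporting signal only)."""
    values = {
        str(row["PR Main"]).strip().lower(),
        str(row["PR Page"]).strip().lower(),
    }
    return bool(values & {"0", NOT_FOUND})
```

Keeping condition 3 as a separate function reflects the note above: a deindexed domain can still report a high PageRank, so low PageRank is treated as supporting evidence rather than a requirement.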
Working out the PageRank distribution of all linking root domains will unveil the proportion of low quality backlinks. In order to do that we need to check Toolbar PageRank and Indexation of all linking root domains.
Open the .xlsx file exported from NetPeak in Excel
Apply a filter to the top row
In the ‘status code’ column, filter out all the n/f values (keep 200s, 301s and 302s ticked but untick n/f). This filter removes all domains that no longer exist.
Create a Pivot Chart and choose:
Row Labels -> PR Main
Values -> Count of PR Main
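For readers who prefer a script to a pivot chart, the same count can be sketched in Python. The column names `Status Code` and `PR Main` are assumptions about the export, and the `low_pr_share` helper is a hypothetical addition that summarises the share of low values.

```python
from collections import Counter


def pagerank_distribution(rows):
    """Count PR Main values for domains whose status code is not n/f."""
    live = [r for r in rows if str(r["Status Code"]).strip().lower() != "n/f"]
    return Counter(str(r["PR Main"]).strip().lower() for r in live)


def low_pr_share(distribution, low_values=("0", "n/f", "n/a")):
    """Fraction of linking root domains with a low or missing PageRank."""
    total = sum(distribution.values())
    if total == 0:
        return 0.0
    # Counter returns 0 for values that never occurred.
    low = sum(distribution[v] for v in low_values)
    return low / total
```

The resulting counts correspond to the bars of the pivot chart, with one bar per PR Main value.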
A healthy PageRank distribution of linking root domains should contain fewer low-PageRank backlinks (n/a and 0) and ideally peak towards the middle of the graph, like this one:
However, a PageRank distribution with too many low-PR backlinks should be a cause for concern:
If this is the case, the link building strategy needs to be adapted so the website can attract links from authoritative and trusted linking root domains. Where necessary, we may try to remove as many low-quality backlinks as possible if we think that they may be hurting the website, although sometimes this is out of our control. The main objective in such cases is to increase the quality of backlinks so the PageRank distribution becomes more balanced.
Periodically monitoring the rate at which linking root domains get removed from Google’s index can be very useful. If too many linking root domains get deindexed, that is a negative signal that is very likely to have a negative impact on rankings.
For instance, in the following example the number of non-indexed linking root domains has significantly increased in three weeks which should trigger some immediate actions.
Checking the deindexation rate periodically (e.g. weekly or monthly) against the same set of linking root domains can identify negative trends in a website’s backlink profile. All deindexed linking root domains should then be checked further to identify the reasons that may have led to their removal from Google’s index. Some manual/editorial checks would make sense in this case. This will also help identify the best strategy for the short and long term.
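The periodic check can be sketched as a comparison between two snapshots of the same domain list. In this hypothetical Python example each snapshot is a dictionary mapping a domain to True if it was found in Google’s index; the names `deindexation_rate` and `newly_deindexed` are illustrative only.

```python
def deindexation_rate(snapshot: dict) -> float:
    """Share of linking root domains that are not indexed in this snapshot."""
    if not snapshot:
        return 0.0
    missing = sum(1 for indexed in snapshot.values() if not indexed)
    return missing / len(snapshot)


def newly_deindexed(previous: dict, current: dict) -> list:
    """Domains that were indexed in the previous snapshot but are not now."""
    return sorted(
        domain
        for domain, indexed in current.items()
        if not indexed and previous.get(domain, False)
    )
```

The list returned by `newly_deindexed` is the natural starting point for the manual checks described above, and a rising `deindexation_rate` between snapshots is the trend to watch for.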
Using SEO Tools for Excel, it is fairly easy to calculate the following three metrics for each linking root domain:
Looking at the social shares each linking root domain has received can add further insight to the above exercise, as valuable domains are more likely to be shared socially. Conversely, domains with very few or no social mentions may be of low quality.