As many of us expected, Google is constantly upping its game in the ‘fight’ to reduce visibility for badly designed sites and those ‘powered’ by spammy links and link networks.
For those impacted by that new war on ‘spam’ the reality of escaping the clutches of the Google penalty is proving to be more difficult than the majority expected. Link removal campaigns and the use of Disavow have done little to improve visibility for those suffering from filtering and while Google’s ‘penalties’ have moved away from whole-of-site hits to more focused impact it still hurts.
In this article I want to cover a new way to detect unnatural links in a more structured, and dare I say more successful, way. There is a lot of advice on how to find unnatural links on the web but I don't think any of it is quite right.
We are now a year on from the original Penguin (see what I did there!) update, yet 81% of websites have still not recovered, according to Search Engine Roundtable's article. A huge number of people under manual penalties are also repeatedly removing or disavowing links and submitting reconsideration requests to Google with no success. This tells us two things:
Before I go into exactly how this new unnatural link detection works I need to talk about the current methods used and why they are not quite right.
The current unnatural links detection methods mostly involve using the following metrics:
The above metrics are the most popular and well-documented techniques to find unnatural links. I completely agree with these metrics and I have written about these techniques in my Penguin Recovery Guide but going by the current statistics of how many websites have actually recovered from the Penguin penalty there is clearly something being missed.
The current methods are good at finding links from websites that have gone overboard on their link building, but if a lot of the unnatural links use branded anchor text, for example, they can easily pass through undetected. This got me working hard to find other ways to discover these unnatural links.
After weeks of researching, testing, failing and testing again I have come up with an unnatural links detection method that is very structured for speed and ignores the old, overcooked metrics currently in use. This new method focuses on the ‘trust’ of the websites that are linking to you.
To understand why you can find unnatural links from just working out the inherent ‘trust’ of a website you first need to understand how the entire web is linking and basically what Google sees when it looks at the overall link architecture of the web.
The entire link graph is quite a complicated space, but to bring it down to a very basic level, the following points are true.
Two very basic statements that most people will know but there is so much we can learn from this. Most authoritative sites will only link out to extremely relevant sites that are also authorities in the niche. Because of current and old school link building methods, spammy sites link to other spammy sites in the niche.
Looking at it another way, just as in the real world, there are ‘good neighbourhoods’ and ‘bad neighbourhoods’, and your links can end up within the latter’s ‘cluster’ (see below for an example of what a cluster looks like from CognitiveSEO’s awesome Link Visualiser).
When you think about it, many of the old link building methods were very automated, creating quantities of links from such things as blog networks, blog commenting, forums and article sites. The sheer volume generated a huge number of links pointing from several low trust sites to the "money" sites.
If you picture this as a link graph as above you have several low quality, low trust pages linking to several sites that are trying to game the system via these automated methods. This makes it very easy for Google to see that there are several sites gaining a lot of page rank from sites that only have low trust. This is a very easy method of detecting which sites to penalise.
The key is to look at the trust of the sites linking to you.
Now that we fully understand why we must look at the trust of sites rather than the overcooked metrics, we can look into exactly how to do that.
The process really is very simple, thanks to Majestic SEO's link metrics Citation Flow and Trust Flow. This is because Majestic's link metrics are very similar to what Google use to judge links. Again, to keep it very simple Google uses Page Rank and their secretive Trust Rank and Majestic have the equivalent Citation Flow and Trust Flow.
To use Majestic's metrics to find links it is important to briefly understand what each is.
Citation flow is basically the same as Page Rank. This will show how much actual page rank juice a link has and how much power it can pass.
Trust Flow is the actual trust of a link: how much the link is trusted based on what links that domain has etc. Because Majestic builds Trust Flow outwards from a manually reviewed seed set of trusted sites, this is actually a very reliable metric.
When you take it back to Google looking at sites that have a high amount of page rank from low trust sites we can basically do the same using Majestic's metrics. If a link has a high citation flow and a low trust flow then most of the time that link is going to be spammy.
What we can do now is use these two link metrics to create our own trust ratio for a site. This is very simple and is the following:
Trust Flow / Citation Flow = The Trust Ratio
A very simple calculation but very useful for finding unnatural links. To prove this works, let's look at an authoritative site and a spammy site to compare the difference.
Authoritative site:
Domain Citation Flow: 65
Domain Trust Flow: 76
Trust Ratio: 76 / 65 = 1.169
Spammy site:
Domain Citation Flow: 30
Domain Trust Flow: 6
Trust Ratio: 6 / 30 = 0.2
Of the two sites selected, one is a reputable, trusted charity and the other is a random blog network I discovered. This is a very extreme comparison but you can clearly see that the higher the trust ratio, the less spammy the site is. If the ratio is above 1, the Trust Flow is actually higher than the Citation Flow, which is a very good indicator of a trustworthy site.
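To make the calculation concrete, here is a minimal Python sketch of the trust ratio applied to the two example sites above. The function name `trust_ratio` is my own, and returning 0 when the Citation Flow is 0 is an assumption that mirrors the spreadsheet fix used later in this article.

```python
def trust_ratio(trust_flow: float, citation_flow: float) -> float:
    """Trust Flow divided by Citation Flow.

    Assumption: a Citation Flow of 0 is treated as a ratio of 0
    rather than raising a division error.
    """
    if citation_flow == 0:
        return 0.0
    return trust_flow / citation_flow


# The two example sites from the article.
charity = trust_ratio(trust_flow=76, citation_flow=65)
blog_network = trust_ratio(trust_flow=6, citation_flow=30)

print(f"{charity:.3f}")       # 1.169
print(f"{blog_network:.3f}")  # 0.200
```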
Now we can use this trust ratio to work out what links are spammy on a penalised website.
Before we actually start looking at the trust ratios of the domains pointing to a site, we have to work out exactly how much trust a specific niche has. The reasoning behind this is quite simple. We can easily look at a site in one niche and work out that any domain with a trust ratio lower than 0.3, for example, needs removing, but this may be completely different in another niche.
Some niches are quite trustworthy but others, such as "electronic cigarettes" or "payday loans", are very spammy and untrustworthy, and we must adjust our trust ratio threshold to reflect that.
To find the average trust ratio of a niche we can use a tool such as SEO Quake to export some data from Google's SERPs. I am not going to explain what SEO Quake is or how to install it as it is beyond the scope of this article but you can find installation details etc here.
1. Download and Install SEO Quake for your browser
2. Set it up so that the Google SERPs overlay is installed.
3. In Google's search setting, up the amount of SERPs displayed on one page to 30.
Once SEO Quake is installed and Google is showing 30 results on one page, search for your main target term for the niche you are in. For this article I am going to use the term "electronic cigarette". You will see that Google shows the top 30 results on one page and SEO Quake fetches the metrics you have set it to.
All we want SEO Quake to do is to export the top 30 results. You will see a "Show as CSV" button at the top of the SERPs. Click that and it will allow you to copy the results from a text box and paste them into any spreadsheet or text editor such as notepad.
2. Then drag the formula down to replicate it for every URL, so that you know the trust ratio for each one.
You will see that some of the Trust Ratios will say "#DIV/0!". This is simply where the Citation Flow is 0 (usually along with the Trust Flow), so the division cannot be done. Simply replace these figures with 0 so that the next step works correctly.
3. Then work out the average trust ratio of all the URLs at the bottom of the spreadsheet using the formula =AVERAGE(P2:P31), as the diagram shows below.
We now have the average trust ratio of the specific niche we are looking at and have an idea of what trust ratio the sites have that rank well in the niche and can now use this data to compare it to our own sites.
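If you would rather script this step than use a spreadsheet, the same average can be sketched in Python. This is only an illustration: the rows below are made up, and the column names `Citation Flow` and `Trust Flow` and the comma delimiter are assumptions, so adjust them to match whatever your export actually contains.

```python
import csv
import io
from statistics import mean

# Made-up sample rows standing in for the exported SERP data.
# Column names and delimiter are assumptions, not the real export format.
SAMPLE_CSV = """Url,Citation Flow,Trust Flow
https://example.com/a,40,30
https://example.org/b,20,5
https://example.net/c,0,0
"""


def niche_average_trust_ratio(csv_text: str) -> float:
    """Average Trust Flow / Citation Flow across all rows."""
    ratios = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        citation_flow = float(row["Citation Flow"])
        trust_flow = float(row["Trust Flow"])
        # Mirror the spreadsheet fix: a #DIV/0! result is counted as 0.
        ratios.append(trust_flow / citation_flow if citation_flow else 0.0)
    return mean(ratios)


average = niche_average_trust_ratio(SAMPLE_CSV)
print(round(average, 3))  # 0.333 for the made-up rows above
```

Counting the zero rows as 0 pulls the average down, exactly as it does in the spreadsheet version, so the two approaches should agree on the same data.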
We looked at a penalised client in the e-cigarette niche; they had a Citation Flow of 31 and a Trust Flow of 10. Doing the calculation (10 / 31), this results in a trust ratio of 0.323. We can clearly see that this site does not have enough trust to rank in the industry, hence it is penalised for the low trust, spammy links it has.
Now onto using this data to find the spammy domains pointing to your site.
Now that we know the average trust ratio of the niche, we can set a threshold: any domain linking to the site with a trust ratio below it will be removed or disavowed as spammy.
This is the most manual part of the process as you may have to play with the ratio slightly. As a rule, I usually halve the average ratio and start there. So in this case I would look at removing domains that are lower than 0.352. Let's put it to the test.
1. Using Majestic Site Explorer, put in the domain of the penalised site you want to find spammy links for. Make sure the domain option button is selected and not the URL. We want to look at all of the links to the domain. Then select the "Ref Domains" tab to see all referring domains. Export the data by clicking the "Download CSV" button.
*We are using the technique to look at referring domains and not the individual links for one simple reason. The standard Majestic SEO account only lets you export the first 2500 links of a site but allows you to export 1000 domains. You can capture a much larger portion if not all of the link profile by using this to export the referring domains.
Now we have a spreadsheet with all the referring domains of the site.
1. Next, create a new column to the right of the "TrustFlow" column called "Trust Ratio". Enter the calculation to work out the Trust Ratio: =Q2/P2.
Now we have the trust ratio of all the domains. At this point we can see how trustworthy or spammy the domains pointing to the site are.
2. Add another column to the right of Trust Ratio called "Spammy". Then in the cell below add the following =IF(R2<0.352,"Yes","No"). The diagram below shows this:
Replicate this formula across all your domains and you will soon have a list that states a yes or no answer on whether they are spammy or not.
3. Now test a lot of the domains to make sure that the results are accurate. You do not want to be removing any trustworthy domains. If some trustworthy sites have been flagged (it is unlikely), simply lower the trust ratio in the formula so that fewer domains are caught. Once you are happy, you have a list of spammy domains.
4. Add Auto filtering to the spreadsheet (see here on how to add auto filtering) so you can select only the domains listed as spammy.
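The spreadsheet steps above can also be sketched in Python. Treat this as a rough illustration rather than the definitive process: the rows are made up, `Domain` and `CitationFlow` are assumed column names (only "TrustFlow" is named in the steps above), and the 0.352 threshold is simply the figure used for this niche.

```python
import csv
import io

THRESHOLD = 0.352  # the article's figure: half the niche average trust ratio

# Made-up rows; "Domain" and "CitationFlow" are assumed column names.
SAMPLE_EXPORT = """Domain,CitationFlow,TrustFlow
trusted.example,50,45
spam-network.example,30,6
empty-metrics.example,0,0
"""


def flag_spammy(csv_text: str, threshold: float = THRESHOLD):
    """Return (domain, trust ratio, is_spammy) for every referring domain."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        citation_flow = float(row["CitationFlow"])
        trust_flow = float(row["TrustFlow"])
        # Same rule as the spreadsheet: division by zero counts as 0.
        ratio = trust_flow / citation_flow if citation_flow else 0.0
        # Equivalent of =IF(R2<0.352,"Yes","No")
        results.append((row["Domain"], round(ratio, 3), ratio < threshold))
    return results


flagged = flag_spammy(SAMPLE_EXPORT)
spammy_domains = [domain for domain, _, is_spammy in flagged if is_spammy]
print(spammy_domains)
# ['spam-network.example', 'empty-metrics.example']
```

The output is only a candidate list. The manual check in step 3 still applies before anything is removed or disavowed.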
I stuck with the ratio 0.352 for the penalised e-cigarette client and there was only one site I had to label as not spammy (basically a supplier who had used bad link building techniques and was low trust themselves). The rest were absolutely terrible sites, and just selecting 5 of them below you can see how effective it is at picking up spammy domains that are pointing to this site.
(elements of the domain removed for confidentiality reasons)
Now that we have a list of spammy domains that have been checked, we can now look into removing them.
If you wish, you can go back and find the actual links from each spammy domain so you can contact the sites and request removal. But given the generally low success rate of contacting the webmasters of these domains, you can just enter each domain into a text file and submit it to the Google disavow tool. The example below is what you would put into the disavow file (comment lines start with #, and each domain goes on its own line with no space after "domain:"):
# We have found the following domains that have unnatural links pointing to our site and we wish to have no association with them.
domain:drugxxxxhab.us
Once you have disavowed, you then need to submit a reconsideration request to Google (for a manual penalty) or wait for the next algorithmic refresh to see if this was enough to lift the penalty.
If it isn't enough, you can go back to your spreadsheet and tighten the ratio. Try a higher trust ratio in the IF statement in Excel to capture more domains, check them, and disavow or remove the links from them as well. Continue doing this until the penalty is revoked.
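Each time you tighten the ratio, the disavow file needs rebuilding from the new list of checked domains. Here is a small sketch of how that could be automated. The function name and the sample domains are hypothetical; the format follows Google's disavow file rules, where lines starting with `#` are comments and whole domains are listed as `domain:example.com`.

```python
def build_disavow_file(domains, comment):
    """Build the text of a disavow file from a list of checked domains."""
    # Normalise and de-duplicate before sorting so the output is stable.
    cleaned = sorted({domain.strip().lower() for domain in domains})
    lines = [f"# {comment}"]
    lines += [f"domain:{domain}" for domain in cleaned]
    return "\n".join(lines) + "\n"


# Hypothetical domains standing in for your checked list.
text = build_disavow_file(
    ["spam-network.example", "Another-Spam.example", "spam-network.example"],
    "Domains with unnatural links pointing to our site",
)
print(text)
```

Save the result as a plain text file and upload it through the disavow tool, replacing the previous version.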
Forget over optimised anchor text. Forget whether all your links are from directory sites. Use this method to look at the actual ‘trust’ your backlinks have and use that to find where you shouldn't be linking from.
It is a very specific process but so far in testing it has proved really effective in improving profiles of those hit by any link-based penalty. What we need now is an army of testers to apply the theory across a larger data set; which is where you come in! We would love some of you guys to try this out and let us know your results.
And if you need any help with finding your spammy links via this method, get in touch and we can help and send a list back to you.