If you read our previous blog post, you will know that we were hit by the latest Google Penguin update. We suspect it was because we had used blogging networks in the past, although on further analysis it could also relate to our early link building strategies, where we focused on building backlinks with the same keywords over and over again. Or it could even be the number of keywords we have on those pages.
Honestly, it’s just too difficult to say what the exact reason might be – we’ve tried so many things over the years (before we knew better) and I guess it was going to catch up with us in the end. That’s not to say we did anything black hat, but these days even what was once accepted as white hat is becoming unethical in the eyes of Google.
You’ll see a lot of posts on the net about Penguin and how to get through it, but when it comes down to it nobody really has all the answers, because nobody knows exactly what Google does. I’ve heard so many conflicting theories – some say it’s all about the backlinks, while others say it’s nothing to do with backlinks and it’s all about the on-page SEO. If they had all the answers, they wouldn’t be online writing blog posts about it – they’d be off on their own private island somewhere, logging in only to check their bank balance.
We of course don’t have all the answers either. That doesn’t sound very comforting, does it, but it’s all just part of the wonderful world of internet marketing. However, we can at the very least get our information straight from the horse’s mouth…Google themselves, instead of relying on theories that may or may not be correct.
Now this is easier said than done, of course, because Google isn’t particularly forthcoming about what they do. They tend to be very vague about things, and we can only get snippets of information from them, if anything at all. But sometimes those little snippets are enough. So below you will only find information based on actual statements from Google.
But First…Why This Update Didn’t Really Work
I’m all for Google cleaning up their search results so that only high quality sites rank. I really want to see this happen, but at this point, with all the Panda and Penguin updates, nothing has really changed. Sure, some sites have moved up and some down, but we still see scraper sites ranked above legitimate sites, thin content ranked above quality sites, and sites full of spammy ads and not much else beating out well-developed content sites.
I really don’t think this update did the trick. And as always, there are plenty of people who were innocently hit. Google’s goal with this update was to target over-optimization, both on page and off. Too many spammy links to your site and you were penalized; too many keywords on the page and you were penalized. Unfortunately, however, that is all they looked at. They still didn’t look at the content. Just because a page has a lot of keywords and a lot of spammy links pointing to it doesn’t mean the content is bad. It might be fantastic content, but Google doesn’t know that…their search engine still isn’t sophisticated enough to compare pages for quality and usefulness. So penalizing a page simply for spammy links just doesn’t work, in my opinion.
But there’s no point whingeing about it. Depending on your situation, you could spend hours on forums and blogs commenting on how bad this all is. The only way to get around it is either to comply with Google or to move on to some other form of traffic generation.
Where to Start
In order to clean up our sites to get them ranking again, we need to start from the very beginning with Google’s Webmaster Guidelines. The reason we need to start here is that Google explicitly stated that the Penguin change “decrease(s) rankings for sites that we believe are violating Google’s existing ‘quality guidelines’”.
So when they refer to those ‘quality guidelines’ they are talking about their Google Webmaster Guidelines. According to Google, these guidelines will help them “find, index and rank your site”. There are three sections to these guidelines and I’ll summarize the main points for each:
1. Design and Content Guidelines
- ensure that each page on your site can be reached from at least one other page
- use a site map to link to important pages
- keep the links on a page to a ‘reasonable number’ – they don’t say what that is
- create a useful, information-rich site
- use words on your pages that people would use to find your site
- use more text links than images for important links or content
- ensure you use descriptive and accurate title tags and ALT attributes (illustrated after this list)
- check for broken links and correct HTML
- for those using dynamic pages (i.e. the URL contains a ? character), be aware that not all search engines can crawl these pages (see the sketch below).
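To illustrate a couple of those points – descriptive title tags and ALT attributes, and dynamic URLs – here is a small sketch. All of the wording, file names and URLs below are invented for illustration:

```
<!-- A descriptive, accurate title tag and ALT attribute: -->
<title>House Training Your New Puppy: A Step-by-Step Guide</title>
<img src="puppy-crate.jpg" alt="Puppy resting in a crate during house training">

<!-- A dynamic URL (note the '?') versus a static-looking rewrite
     that any crawler can follow comfortably: -->
<!-- Dynamic:   http://www.yoursite.com/articles.php?id=42       -->
<!-- Rewritten: http://www.yoursite.com/articles/house-training/ -->
```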
2. Technical Guidelines
- Google recommend using the Lynx browser to view your site, since it will display it the way most search engines see it. If some content is missing, then search engines may have trouble viewing it. (I had a quick look at the Lynx site and it looks complicated, so I haven’t really tried it yet. I did instead find a Lynx viewer where all you need to do is type in your URL and it does the same job. There’s also a one-line command sketch after this list.)
- Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. (As best I can tell, this refers to sites that tack a unique session ID onto every URL – e.g. page.html?sessionid=abc123 – which makes every visit look like a brand new page to a crawler.)
- Make use of the robots.txt file on your web server. (Most people’s robots.txt file would be fine, but Google wants to ensure that you don’t have their search engine blocked – a minimal example follows this list.) You can read more about it here – Block or remove pages using robots.txt file and here Robots.txt FAQs.
- Ensure your web server supports the If-Modified-Since HTTP header, as this allows Google to tell whether your content has changed since they last crawled the site. (I simply typed ‘If-Modified-Since HTTP header Hostgator’ as my search query in Google to find out if our hosting company supports this. They do! You can do the same search with your hosting company, contact them directly and ask, or test it yourself with the curl sketch after this list.)
- Ensure that advertisements do not affect search engine rankings. (I’m not quite sure what Google are getting at here, but I think it has something to do with paid links. In other words, if you are selling paid ads on your site then add the rel=’nofollow’ attribute to the links – illustrated after this list.)
- If you use a content management system ensure that pages can be read by Google.
- Monitor and optimize site performance and load times. (You can read more about this here: Performance Best Practices.)
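On the Lynx point above: if you do want to try it directly, the command-line version is a one-liner once installed. A minimal sketch, assuming Lynx is available on your machine and using a placeholder URL:

```
# Dump the page as plain text – roughly what a search engine 'sees'.
lynx -dump http://www.yoursite.com/
```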
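On the robots.txt point: here is a minimal sketch. The /private/ directory is made up for illustration – the key thing is simply that nothing blocks Googlebot:

```
# Minimal robots.txt – lives in your web root (e.g. /public_html/robots.txt).

# An empty Disallow means nothing is blocked for Googlebot:
User-agent: Googlebot
Disallow:

# All other bots: block only a (hypothetical) private directory.
User-agent: *
Disallow: /private/
```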
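On the If-Modified-Since point: you can also test your own server from the command line. A quick sketch using curl (the URL is a placeholder) – a ‘304 Not Modified’ response means the header is supported, while a ‘200’ means the server is ignoring the header and would resend the whole page:

```
# Fetch only the response headers (-I), claiming we last saw the page in May 2012.
curl -I -H "If-Modified-Since: Tue, 01 May 2012 00:00:00 GMT" http://www.yoursite.com/
```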
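And on the advertising point: if my reading is right, a paid ad link would be marked up like this (advertiser name and URL invented for illustration):

```
<!-- rel="nofollow" tells Google not to pass ranking credit
     through this (hypothetical) paid link: -->
<a href="http://www.advertiser-example.com/" rel="nofollow">Visit Our Sponsor</a>
```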
3. Quality Guidelines
- Make pages primarily for users, not for search engines.
- Don’t deceive your users, for example by cloaking (showing search engines different content than your visitors see).
- Avoid tricks intended to improve search engine rankings.
- Don’t participate in link schemes designed to increase your search engine rankings.
- Avoid linking to web spammers or ‘bad neighborhoods’ as your own ranking may be adversely affected. (Interesting that even linking to a poor quality site could affect your ranking).
- Don’t use unauthorized computer programs to submit pages, check rankings etc. (Not sure how you know what is authorized or not authorized. Google doesn’t elaborate.)
- Avoid hidden text or links.
- Don’t use irrelevant keywords on your pages.
- Don’t create multiple pages, domains or subdomains with duplicate content.
- If you have an affiliate site, make sure your site adds value and provides unique and relevant content.
You can read the full version of the Google Webmaster Guidelines here.
As you can see, Google provides us with a basic overview of what they are after in their Webmaster Guidelines. It’s worth going through each of them to see whether you comply.
In this respect, we are on our own, because unfortunately Google doesn’t elaborate too much on anything. We might think we are complying with their guidelines, but how do we really know? They say, for instance, to ‘keep your links on a page to a reasonable number’. What is ‘reasonable’…who knows? You have to really dig around to find the answer, which I managed to do. It was in a post by Matt Cutts (Google engineer) written in 2009, where he mentioned that a page should preferably hold fewer than 100 links. Of course, the post is old, so that guidance may well be out of date by now.
The Next Step
Once you have gone through the Webmaster Guidelines and ensured that you comply with each…as best you can, the next step is to figure out what this Penguin update was all about. By doing this we can get a better insight into exactly what we need to do to clean up our sites. As we mentioned in the opening paragraphs, everyone has their own theory about what happened with this update, but we want to hear what Google has to say. Here is what we found:
1. The Official Word
On April 24, 2012 Google published a blog post indicating that an update was imminent and that this one was going to “reduce webspam and promote high quality content”. In the post they provided a couple of examples of keyword stuffing, spun content and outgoing links on a page that led to unrelated content.
Apart from that, they didn’t impart too much information so at this point we were left in the dark. Considering we don’t keyword stuff, use spun content or link out to unrelated pages, this blog post didn’t help in the slightest.
2. Matt Cutts Interview with Search Engine Land
On May 10, 2012 Danny Sullivan from Search Engine Land interviewed Matt Cutts (Google Engineer). In the interview, Matt Cutts said that the Penguin update was designed to be quite precise and act against pages when there was an extremely high confidence of spam being involved. Matt Cutts said, “The bottom line is, try to resolve what you can” and you will know if you have done the right thing the next time Penguin updates.
Again, we don’t really have much to go on here. Just a few snippets of information…clean up the spam and you may be back ranking again the next time Penguin updates…hmm, easier said than done when you don’t really know what the problem is to begin with.
Plus, Matt Cutts’ flippant statement “that you may need to start all over with a fresh site” if you can’t recover from the Penguin update just shows you how far removed he is from the rest of us. He obviously doesn’t know the amount of work that goes into it all. And what about small business owners who have created a branded website? You can’t tell me that they should start up a whole new website.
3. Google Updates Penguin Again
On May 26, 2012 Matt Cutts announces on his Twitter account that Google has pushed through a second Penguin data refresh. So if you weren’t hit the first time then you could have been hit the second time. Alternatively, if you made positive changes to your site since the first update, you might have seen an increase in traffic.
Does that mean we will see monthly Penguin updates? Hopefully, because it will mean that anyone affected by the Penguin update will be able to make changes to their site and not have to wait around for months for the next update.
4. Matt Cutts Interviewed at SMX (Search Marketing Expo)
On June 5, 2012 Matt Cutts speaks at the SMX conference in Seattle. He informs the audience that Google’s definition of a Google penalty is something that is done manually. In other words, someone manually looks at the site and deems it to be bad. The Penguin update however was not a penalty but an algorithm change which is why you cannot submit your site for reconsideration.
Matt also spoke about negative SEO and that they are considering whether to create a system where you can disavow a link to your site. This would be fantastic but to me that says yes, negative SEO exists. Why would they bother creating a system otherwise? And if you’re wondering what negative SEO is, it is simply a way of killing a competitor by blasting their site with spammy backlinks.
One of the most interesting things to come out of this interview was a question about wpmu.org, a reputable site that was hit badly by the Penguin update. The site took its story to the Sydney Morning Herald (an Australian newspaper), which in turn interviewed Matt Cutts about it.
In the SMX interview, Matt Cutts’ response was:
“They didn’t rank as high after Penguin, they made their case, and I thought it was a good case. We were able to remove about 500,000 instances of links, and that helped them.”
Now if you look at that response, Matt Cutts is effectively saying that it was their backlinks that caused the problem, and that by removing 500,000 links their ranking improved. The site had created free WordPress themes and, in the footer section, had added a link back to their website. This was what caused their drop, as a lot of spammy sites used the theme.
Now this is all very nice for wpmu.org, who were able to get to the press first and turn their site into a high profile case. But the rest of us are left with spammy links to our sites, which in most cases are no fault of our own, and we are left trying to figure out how to get them removed. Sure, we can email each and every site, but do you think a spammy scraper website owner is going to give a toss about removing a link? The whole reason they have a spammy site is that they couldn’t be bothered working on it to begin with. They just want to automate it all, sit back and never touch it again.
Another point Matt Cutts made that is of interest to us concerns affiliate links. He did say that they handle affiliate links okay, but if you are in any way worried about them, then add rel=’nofollow’ to them (just like the paid ad example earlier).
And just one more thing that is interesting to note: Matt does say that Google does not use Google Analytics data in its rankings.
That’s about all I have been able to find so far on the Penguin update that comes straight from a Google representative. If you know of anything else, please let us know in the comments below.
From what I can tell from all of this, the Penguin update focused on two things:
1. Backlinks
2. On-page SEO
Of the two, I personally think that the links are the main factor. In other words, if you have spammy sites linking to you in quantity, then you are likely to be affected. If you have used blogging networks, or if you have paid someone to get hundreds of thousands of links to your site overnight, then you would likely have felt the effects of this update.
I also suspect that the anchor text used in those links plays a part, so try to avoid using the same anchor text over and over when building backlinks.
Also, if you link out from your site to totally unrelated sites, this could affect you too. So if you accept guest articles, ensure the links in those articles go to related sites. We often get sent articles for our sites – the article might be about dogs, for instance, which would be suitable for our dog site, but the links in the article go to a credit card or insurance company. Don’t accept these articles. Keep them related to your site topic.
If you think that spammy backlinks are your problem then go to your Google Webmaster Tools and take the following steps to view your backlinks:
1. Click Traffic from the menu sidebar.
2. Click Links to Your Site from the drop-down that appears.
You can assess each link, and if you deem it to be from a spam site you can always email the site owner to see if they will remove it.
As for the on-page SEO, this has always been a problem for Google, but perhaps they are cracking down a little harder. In this case, I would simply check your reviews and articles, and if they sound unnatural when you read them aloud, then you are probably using too many keywords on the page. We’ve said for a long time now to just write naturally…throw in a few keywords to help Google find your pages, but don’t overdo it.
I think Google have a love-hate relationship with SEO. They need it because it helps them find relevant pages on the net for their search engine, but at the same time it causes them all sorts of grief as webmasters use every SEO tip and trick in the book to manipulate their rankings.
What We Are Doing About It
We personally want to stick with Google but at the same time we want to focus on other forms of traffic. This was our goal before we went overseas and is something we are still looking into. But for the moment we want to get our sites back and ranking well in Google. Fortunately we weren’t hit too hard but it was enough to give us a jolt and get us focused again.
We started by doing nothing, and you may think that a little odd, but we have learned from years of experience that when it comes to Google updates you don’t make any changes straight away if you have been hit. It’s always best to leave things alone for a while, because oftentimes you will find yourself ranking again.
So after waiting a month or so we realized that our traffic wasn’t going to improve so we started to make some slight changes. Nothing major, just some changes to a few pages on our sites. Fortunately we can take our time with this and not make too many drastic changes at once.
We tried those blogging networks at one point but gave up on them pretty quickly because we realized they didn’t work very well. Plus we just didn’t feel comfortable about the sites the articles were going on. They just looked spammy to us and we wanted to be associated with quality wherever possible.
The other problem is spammy sites that link to us, and that is out of our control. We noticed that some sites add hundreds of links to us across their pages. I’m not sure why…we didn’t ask them to. One would be enough, but they link to us from all sorts of pages in their ‘Further Information’ sections. This doesn’t help us if their site is just a scraper site of sorts.
However, we will try to contact the major offenders to see if they will remove the links and see how we go.
We are hoping that Google implements their ‘disavow’ option which will allow us to reject a site that is linking to us. But who knows when this will be – Matt Cutts says ‘maybe a month, or two, or three’. We will see.