How many links does Google crawl on a page?

I’ve been asked this question a few times: how many links does Google crawl on a single page? After doing a bit of research I’ve come to two conclusions, the first from a technical Googlebot crawl point of view and the second from a site usability point of view.

As a general rule of thumb, many SEO practitioners try to keep the number of links on a single page under 100. This stems from the Google Webmaster Guidelines, which state: “Keep the links on a given page to a reasonable number”.

Matt Cutts once explained that Google used to index, on average, the first 100 KB of a page. These days search engines index more than 100 KB, but it still raises some points. If you have over 100 links on a page, how are these links displayed? Are they laid out so that the user can still browse through your site without feeling overwhelmed?

This is where you look at your website from a usability point of view. If it makes sense to have over 100 links then by all means go for it, as long as you have plenty of quality content to go with them and the number of links will not confuse visitors or put them off your website. Google will not automatically penalise you for having over 100 links, as long as the content and links follow its quality guidelines.
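
If you want to see where a page stands against the 100-link rule of thumb, counting its links is easy to script. The following is a minimal sketch using only the Python standard library; the `LinkCounter` class, the `count_links` helper and the sample HTML are illustrative names of my own, and the threshold is the rule of thumb discussed above, not a hard Google limit.

```python
from html.parser import HTMLParser

# Rule of thumb from the text above, not a limit enforced by Google.
LINK_GUIDELINE = 100


class LinkCounter(HTMLParser):
    """Collects the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def count_links(html):
    """Return the number of <a href="..."> links found in an HTML string."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return len(parser.links)


# Hypothetical sample page: two real links and one anchor without an href.
sample = '<p><a href="/a">A</a> <a href="/b">B</a> <a name="top">top</a></p>'
total = count_links(sample)
print(total, "links; over the guideline:", total > LINK_GUIDELINE)
```

The count only tells you how many links there are. Whether that number is reasonable still depends on how the links are presented to the visitor.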

Royal Wedding Bonanza

Kate Bevan, a Kate Middleton lookalike.

Last year a friend of mine, Kate Bevan, gave up her job to become a full-time Kate Middleton lookalike. Ever since the change of career she has been bombarded with photo shoot opportunities and TV appearances.

Eventually Kate wanted a website to publicise her portfolio and expand her audience worldwide. She gave this task to my friend Jabin King, who designed and hosts her new website http://www.katemiddleton-lookalike.com/.

The website has generated a lot of interest and some amazing “Kate Middleton Lookalike” PR opportunities such as the launch of the new Alison Jackson book and even flying all the way to Hong Kong!

Big rep to Jabin for the site design. Jabin is available for website design and consultation. More info here.

Facebook launch Open Compute project

Facebook has launched an initiative to share its data centre designs with the public, taking its commitment to openness to a whole new level. The new plans share details of both server and data centre infrastructure designs that emphasise lower energy consumption and operating costs.

Chassis

The Open Compute Project chassis is designed to accommodate the other components in a server, including the custom motherboard and power supply. Overall it is vanity free, has no sharp corners and is designed for easy servicing.

It is completely screwless and uses quick-release components: the motherboard snaps into place using a series of mounting holes, and the hard drives use snap-in rails to slide into the drive bay.

Motherboards

The AMD motherboard is a dual AMD Opteron® 6100 Series socket motherboard with 24 DIMM slots. It is a power-optimized, barebones motherboard designed to provide the lowest capital and operating costs. Many features found in traditional motherboards have been removed from the design.

The Intel motherboard is a dual Intel Xeon® 5500 or Intel Xeon® 5600 socket motherboard with 18 DIMM slots. Like the AMD board, it is a power-optimized, barebones motherboard designed to provide the lowest capital and operating costs. Many features found in traditional motherboards have been removed from the design.

Power Supply

The Open Compute Project 450W power supply is an AC/DC power converter, single voltage 12.5VDC, closed frame, self-cooled power supply used in high-efficiency applications. The power converter includes independent AC input and DC output connectors, plus a DC input connector for backup voltage. Current sharing and parallel operations are not required, while the main focus is a design with very high electrical efficiency.

This is an interesting subject and I will keep an eye on it. What will the future hold for this kind of initiative? Will it stick? Only time will tell.

How often does Google download robots.txt?

Earlier this week I had an issue with Google caching an old copy of a robots.txt file. This caused problems within Webmaster Tools: Google wouldn’t index any of the new pages that the old robots.txt file had blocked.

After doing a bit of research I worked out that Google downloads and caches a copy of your robots.txt for its records. Looking further into this, I worked out that Google typically re-downloads the robots.txt file every 24 hours or after 100 visits (whichever comes first).

But what if you have made changes to your robots.txt file and want Google to cache the updated version sooner? Further to my research, I came across the Cache-Control header, which can be set in .htaccess. Associating an expiry time with the .txt documents on the server should prompt Google to download a new copy once the old one has expired. I used the following statement in my .htaccess file.

<FilesMatch "\.(txt)$">
Header set Cache-Control "max-age=60, public, must-revalidate"
</FilesMatch>

The first line selects which file type the rule targets, in this case every file ending in .txt.

The second line sets the Cache-Control header. The max-age value is given in seconds, meaning that all .txt files expire after 60 seconds and the client then has to download the file again.

Now, depending on how often Google crawls your site, it will come across an expired robots.txt file and be forced to download a fresh copy. Hopefully the fresh robots.txt file will then be cached by Google, which will start following the rules set within it. This evidently worked: when I checked Webmaster Tools, I saw that the new robots.txt file had been cached 15 minutes after I made the header changes.
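
To make the expiry logic concrete, here is a rough sketch of how a cache could decide that its copy of robots.txt is stale, based on the max-age value in the header above. This is only an illustration of the general HTTP freshness rule, not a description of how Googlebot is actually implemented, and `parse_max_age` and `is_stale` are names I made up for the example.

```python
import re


def parse_max_age(cache_control):
    """Return max-age in seconds from a Cache-Control value, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_stale(cache_control, age_seconds):
    """True when a cached copy of the given age has outlived its max-age."""
    max_age = parse_max_age(cache_control)
    if max_age is None:
        # No max-age in the header: this sketch makes no freshness decision.
        return False
    return age_seconds >= max_age


# The header value set in the .htaccess snippet above.
header = "max-age=60, public, must-revalidate"
print(parse_max_age(header))
print(is_stale(header, 30))        # copy fetched 30 seconds ago
print(is_stale(header, 15 * 60))   # copy fetched 15 minutes ago
```

With a 60 second max-age, any copy older than a minute counts as expired, which fits with the fresh file showing up on the next crawl.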

This may not be the most efficient method of notifying search engines about an updated file, but at least it did the job. I will continue researching this subject and update this post in a few weeks’ time.

Basic SEO tips to help your website rank

Quality traffic is the key to a successful website. Here are some basic SEO tips to consider when building or updating your website.

1. Consider your page title
Your page title is a very important factor in determining your position within search engines. Make sure the title is relevant to the page content and includes the keywords you are trying to emphasise, and ensure that each page title is unique.

2. Quality content is king
Always try to write unique content on the subject your website covers. Good quality content is more likely to be passed around and linked to if the reader finds it informative. Remember, Google will reward sites that continually provide useful content to their readers. The main goal of any search engine is to return results that point to relevant and useful content.

3. Avoid duplicate content
Search engines can penalise duplicate content. One major offender is how website addresses are handled by default. For example, by default your website can be accessed via www.domain.com, domain.com, www.domain.com/index.html and domain.com/index.html.

Search engines will class these as four separate pages even though it’s actually the same page being displayed. A way round this is to include a rewrite rule in your .htaccess file so that search engines and browsers can only reach the site through one single domain (e.g. http://www.domain.com/).

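# Requires mod_rewrite
RewriteEngine On

# Redirect the bare domain to the www host (dots escaped so they match literally)
RewriteCond %{HTTP_HOST} ^yourdomain\.tld$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.tld/$1 [R=301,L]

# Redirect direct requests for index.php to the site root (use index.html if that is your index file)
RewriteCond %{THE_REQUEST} /index\.php\ HTTP/
RewriteRule ^index\.php$ / [R=301,L]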

For more information on .htaccess URL rewriting I recommend you read this article.
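
To see what the redirects are meant to achieve, it can help to model them outside Apache. The sketch below is a rough Python approximation of the same idea, not a replacement for the rewrite rules: `canonical_url`, `CANONICAL_HOST` and `INDEX_FILES` are hypothetical names, and yourdomain.tld is the same placeholder used in the .htaccess example.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder values; swap in your own domain and index file names.
CANONICAL_HOST = "www.yourdomain.tld"
INDEX_FILES = {"/index.html", "/index.php"}


def canonical_url(url):
    """Map the common duplicate forms of the home page onto one address."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "yourdomain.tld":
        host = CANONICAL_HOST          # bare domain -> www host
    path = parts.path or "/"
    if path in INDEX_FILES:
        path = "/"                     # index file -> site root
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


variants = [
    "http://www.yourdomain.tld/",
    "http://yourdomain.tld",
    "http://www.yourdomain.tld/index.html",
    "http://yourdomain.tld/index.html",
]
unique = {canonical_url(u) for u in variants}
print(unique)
```

All four variants collapse to a single address, which is the outcome the 301 redirects give search engines.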

4. Title and ALT tags in links and images
Include title attributes (often called title tags) in your internal links to emphasise their content. Include your targeted keywords and a useful sentence to give the link some added emphasis in your SEO results. Example:

<a title="Funny cat pictures" href="http://www.catpictures.com/funny_cats.html">Funny cat pictures</a>

It’s also useful to include keywords in image alt and title attributes. This is helpful when trying to gain better results in Google Image searches. It is also useful for visitors who have images disabled, as the alt text will explain what the picture is about. Example:

<img class="alignnone" title="My new iPhone arrived today" src="/imgs/my-new-iphone.jpg" alt="My new iPhone arrived today" />
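
On a larger site it is easy to miss images that have no alt text. Here is a minimal sketch, again using only the Python standard library, that lists the src of every image with a missing or empty alt attribute. `AltChecker`, `images_missing_alt` and the sample markup are illustrative names and data of my own.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Records the src of each <img> whose alt attribute is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here by default.
        if tag == "img":
            attr_map = dict(attrs)
            if not (attr_map.get("alt") or "").strip():
                self.missing_alt.append(attr_map.get("src"))


def images_missing_alt(html):
    """Return the src values of images that lack alt text."""
    checker = AltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing_alt


# Hypothetical sample: the first image has alt text, the second does not.
page = (
    '<img src="/imgs/my-new-iphone.jpg" alt="My new iPhone arrived today" />'
    '<img src="/imgs/banner.jpg" />'
)
print(images_missing_alt(page))
```

Bear in mind that an empty alt is sometimes deliberate for purely decorative images, so treat the output as a list to review, not a list of errors.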

My new marketing & SEO blog

Finally got round to creating my blog after months and months of putting it off. The main reason for this blog was to allow me to express my views and thoughts on the latest trends and topics in the SEO and Web Marketing industry.

Eventually I’ll update this post... but now... it’s food time!