
Viewing Thread:
'Use a sitemap to tell Search Engines about your web pages!'

Sun 02/09/07 at 23:10:
cjh
Regular
"It goes so quickly"
Posts: 4,083
Use a sitemap to tell Search Engines about your web pages!

As a Freeola customer making use of your Freeola Web Space, it's only natural that you'll want as many people as possible to visit your web site. Statistically speaking, most visitors arrive from search engine result pages (SERPs for short), so having those search engines know about all your web pages is to your benefit.

A sitemap is often thought of as a page on a web site that literally lists all the other available pages on that particular site, but more recently, a sitemap is also an XML text file that helps search engines crawl those available pages in a more efficient manner.

The sitemap protocol is currently supported by its inventor, Google, as well as Microsoft (for Bing), Yahoo and Ask Jeeves.

This article assumes the reader is familiar with HTML coding, and is able to pick up the XML style of mark-up quickly.

A sitemap isn't just for listing all my web pages?

While many assume the sitemap protocol is purely there to list all your web pages for search engines to find, the main benefit of a sitemap is actually to assist the search engine in a more meaningful way.

Search engines have been around for years, and are more than capable of sniffing out any web pages that exist on the Internet. This is mostly done by following links on your home page to other pages on your web site, which may link to more pages on your web site, and so on. As your web site will more than likely include a link in one form or another to every other page on your web site, using the sitemap protocol to simply re-list these web addresses is unlikely to make any difference to the way your site is crawled.

This is because the search engine will likely have already found and made a record of all your web pages when it first found your web site, either after you submitted it, or someone, somewhere, linked to it via their own site.

The main point of the sitemap protocol is to provide search engines with a little extra information, or "meta data", about your web site's pages before they come crawling. Search engines may use this additional information in a variety of ways, but the main aim is to crawl web sites on the web more efficiently and effectively.

A web crawler, in its basic form from all search engine companies, comes along to your web site, scans it, and then lists it. But with literally millions upon millions of web pages in existence, this can be a hell of a task, even for today's super computers. So, to cut down on the workload and to protect web site bandwidth, the search engine web crawlers may not scan over your entire web site in one go.

This benefits everyone in many respects: a search engine is able to crawl and list a larger number of web sites in a shorter amount of time, and if your web host caps the bandwidth you can use, then you won't want too much of it being eaten up by web crawlers.

However, there is one downside, and quite a major one to you, which is that your web site listings may not be up to date, or may not be listed at all, because the web crawlers only scanned 20 of your 65 pages, and may not come back for a month or two.

So, how do I make my site more crawl-able?

This is where the sitemap protocol can come in handy, for you and search engines alike, as it enables you to give details about your web pages that the search engines can use before sending out the crawlers. At the time of writing, the additional information that a sitemap can provide for each web address is:

1. The date the web page was last updated (modified).
2. How often you envision the web page will be updated.
3. How you rate the web page compared to others on your web site.

Generating your Sitemap file!

A sitemap file is written in XML mark-up, which is similar to the way that HTML is laid out. Creating a sitemap can be done by hand, typed out in Notepad, or it can be automatically generated. While typing out a full sitemap sounds dull and tiresome, an automatically generated one can be inaccurate, so a mixture of both is probably the best method.

There are some web sites that will create a sitemap for you, such as www.sitemapdoc.com or www.xml-sitemaps.com, but as mentioned above, these may not be as accurate as they could be, especially in terms of the change frequency and priority settings, as it's difficult to automatically detect which pages you (the webmaster) rate above others, or how often you plan on changing them.

Using the above linked services can cut down on the time it takes to initially put your sitemap together though, especially if you have a large number of pages on your site. Simply go to one of the sitemap generators above, add your web address and download the sitemap provided, then open it up and alter the settings as you see fit.

Sitemap

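A sitemap example file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2007-09-02</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://www.example.com/aboutme.htm</loc>
<lastmod>2007-09-02T20:38:54+00:00</lastmod>
</url>
<url>
<loc>http://www.example.com/contact.htm</loc>
</url>
</urlset>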

Using one of the online generators linked above would create something similar, but based on your own web site. Once downloaded, you could then manually alter some of the settings to better suit your own web site. If you only have a small site, you may just want to copy and paste the example above, and start from scratch.

Each set of information for each one of your web pages needs to be enclosed within the <url> ... </url> XML elements, with each individual value being enclosed within an element of its own.

While the focus of this article is to include all the available information, the <lastmod>, <changefreq> and <priority> elements are optional; only the <loc> element is required for each block of entries you make. If you automatically generate your sitemap file, but don't want to manually alter all your web page listings, feel free to remove the less important elements, as the second and third blocks in the example above show.
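For anyone who would rather script the file than type it out, here is a minimal sketch in Python, using only the standard library. It is an illustration and not the method used by the generators linked above. The function name build_sitemap and the layout of the pages list are made up for this example. The namespace is the standard sitemaps.org one.

```python
import xml.etree.ElementTree as ET

# Standard sitemap protocol namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Build sitemap XML from a list of dicts (hypothetical layout).

    Each dict needs a 'loc' key; 'lastmod', 'changefreq' and
    'priority' are optional and are simply left out when missing.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]  # the only required element
        for tag in OPTIONAL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# The same three pages as the hand-written example.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2007-09-02",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.example.com/aboutme.htm",
     "lastmod": "2007-09-02T20:38:54+00:00"},
    {"loc": "http://www.example.com/contact.htm"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The output is all on one line rather than neatly indented, which search engines should not mind, as the layout of an XML file carries no meaning.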

Web addresses that contain the & symbol need this symbol to be written as &amp;, for example, the web address
http://www.example.com/index.php?page=2&size=5
would need to be written as
http://www.example.com/index.php?page=2&amp;size=5
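If you are generating the file with a script, the & to &amp; swap can be left to a library. A small sketch in Python, using the standard library's escape function on the example address (the variable names are just for illustration):

```python
from xml.sax.saxutils import escape

address = "http://www.example.com/index.php?page=2&size=5"

# escape() replaces &, < and > with their XML entities.
escaped = escape(address)
print(escaped)  # http://www.example.com/index.php?page=2&amp;size=5
```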

As you can see, there isn't really a lot to it.

1: Why do accurate modified dates matter?

Accurately indicating when a particular web page was last modified enables the search engine to skip pages that it has already crawled, saving your web site bandwidth and web server processing. While Freeola doesn't explicitly set any limits on how much bandwidth your site can use, if you have a large site and many visitors, you don't really want search engines taking up any bandwidth or server processing if they already have the information about that page.

Search engines may also not crawl your entire web site at once, so enabling them to skip pages they have already crawled means your web site search results will be more up to date. If, for example, a web crawler comes along with a limit of 10 web pages from your site that it's going to scan, and it turns out that 8 out of those 10 are already listed and up to date, then those 8 crawler scans are wasted. If it already knows which pages haven't been changed since it last came crawling, it'll use that limit of 10 pages to scan newer content from your web site.

If your sitemap indicates that 12 pages are out of date, the crawler may even raise its limit of 10 in this instance to ensure your site listings are fully up to date.

The format for this setting is year-month-day, in the form of YYYY-MM-DD, for example, today would be written as 2007-09-02. If you wish to add the time as well, you can do so by appending the capital letter T, followed by the time in hours:minutes:seconds+TZ*, in the form of HH:MM:SS+HH:MM, for example, today at 8:15pm would be written as 2007-09-02T20:15:00+00:00.

*TZ, Timezone relative to the UTC format.

If you were using a timezone different to UTC, then you would need to indicate the number of hours ahead / behind that you were, for example, 2007-09-02T20:15:00+03:00 or 2007-09-02T20:15:00-05:00.

The value for this setting is enclosed within the <lastmod> element.
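If your pages are produced by a script, the date and time formats described above can be generated for you. A minimal Python sketch follows; the helper names lastmod_date and lastmod_datetime are made up for this example. It insists on a timezone being attached, since the time format needs the offset from UTC.

```python
from datetime import datetime, timedelta, timezone


def lastmod_date(moment):
    """Date-only form: YYYY-MM-DD."""
    return moment.strftime("%Y-%m-%d")


def lastmod_datetime(moment):
    """Date and time form: YYYY-MM-DDTHH:MM:SS+HH:MM."""
    if moment.tzinfo is None:
        raise ValueError("attach a timezone so the UTC offset can be written")
    return moment.isoformat(timespec="seconds")


# 8:15pm on 2nd September 2007, in three different timezones.
in_utc = datetime(2007, 9, 2, 20, 15, 0, tzinfo=timezone.utc)
three_ahead = in_utc.replace(tzinfo=timezone(timedelta(hours=3)))
five_behind = in_utc.replace(tzinfo=timezone(timedelta(hours=-5)))

print(lastmod_date(in_utc))           # 2007-09-02
print(lastmod_datetime(in_utc))       # 2007-09-02T20:15:00+00:00
print(lastmod_datetime(three_ahead))  # 2007-09-02T20:15:00+03:00
print(lastmod_datetime(five_behind))  # 2007-09-02T20:15:00-05:00
```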

2: Why does an accurate change frequency matter?

Including how often a web page is likely to be changed can enable the search engine to pop along to your site to scan a single page more often than it may do others. If your sitemap indicates that your home page is updated daily, but the rest of your site monthly, you may find it comes along more often to keep that page up to date in its search results, while not necessarily scanning any other pages at the time.

The available values for this setting are either always, hourly, daily, weekly, monthly, yearly or never.

The value for this setting is enclosed within the <changefreq> element.

3: What does the Priority setting indicate?

The priority setting can be used to tell the search engine which pages you feel are more important or useful than others on your web site, which can be helpful if after scanning your pages, the search engine finds it can't really decide which is better.

A good example for this could be Freeola's web forums, as you have three layers to get to the threads, which are:

1. The Freeola chat home page, which lists the available forums.
2. The listings of the threads within each forum.
3. The individual threads.

Now, you may consider that the priority goes in that order - the main page for the forums, the forums with the threads, and then the thread with the content - but you would more than likely benefit from reversing that order, because the thread with the content will most likely contain the information a user is looking for when searching for something.

If someone clicked a link from a search engine results page into Freeola, and was presented with a list of threads, they may not wish to take the time to look through the long list and may click back instead. If the link takes them directly to a thread that is relevant to what they were looking for, it's more than likely they will not feel the need to click back and move on to the next web site.

Using this type of thinking, you can advise a search engine which pages you feel are the more useful / meaningful of your web site, so that if after scanning your pages, it feels two or more are the same, you can use the priority setting to hint at which you feel are the more important.

Naturally of course, this goes out the window if you mark them all as being the exact same level of importance.

The available values for this setting range from 0.0 up to 1.0, with the default being in the middle, 0.5.

The value for this setting is enclosed within the <priority> element.
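If you edit a generated sitemap by hand, it's easy to slip in a change frequency or priority that isn't allowed. Here is a small Python sketch of a checker based on the values listed in this article. The function name check_entry is made up for the example, and the priorities used for the three forum layers are only one possible choice.

```python
# The change frequency values listed in this article.
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly",
                     "monthly", "yearly", "never"}


def check_entry(changefreq=None, priority=None):
    """Return a list of problems found (an empty list means all fine)."""
    problems = []
    if changefreq is not None and changefreq not in CHANGEFREQ_VALUES:
        problems.append("unknown changefreq: %s" % changefreq)
    if priority is not None and not 0.0 <= float(priority) <= 1.0:
        problems.append("priority out of range: %s" % priority)
    return problems


# Illustrative priorities for the three forum layers, with the
# threads rated highest as suggested above.
print(check_entry("daily", 0.3))   # chat home page -> []
print(check_entry("hourly", 0.5))  # thread listings -> []
print(check_entry("weekly", 0.8))  # individual threads -> []
print(check_entry("fortnightly", 1.5))  # two problems reported
```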

Sitemap Index

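A sitemap index file is used if you feel the need to create multiple sitemap files for your web site, either to make maintenance easier, or because you have more than the 50,000 URL limit of a single sitemap file. An example looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.example.com/sitemap1.xml</loc>
<lastmod>2007-09-02T20:35:15+00:00</lastmod>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap2.xml</loc>
<lastmod>2007-09-01</lastmod>
</sitemap>
</sitemapindex>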
Again, the <lastmod> element is optional, but you'll need a <sitemap> block, with its <loc>, for each sitemap you have for your web site.
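For a very large site, splitting the addresses up and writing the index can be scripted. This is a rough Python sketch only: the function names and the sitemap1.xml, sitemap2.xml naming pattern are assumptions made for the example, and it leaves out the optional last modified date.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # the per-file limit mentioned above


def split_into_sitemaps(addresses, limit=MAX_URLS):
    """Cut a long list of addresses into chunks of at most `limit`."""
    return [addresses[i:i + limit] for i in range(0, len(addresses), limit)]


def build_index(sitemap_addresses):
    """Build a sitemap index listing each sitemap file's address."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for address in sitemap_addresses:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = address
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# 120,000 made-up addresses need three sitemap files.
addresses = ["http://www.example.com/page%d.htm" % n for n in range(120000)]
chunks = split_into_sitemaps(addresses)

# Hypothetical file naming: sitemap1.xml, sitemap2.xml, ...
sitemap_files = ["http://www.example.com/sitemap%d.xml" % (n + 1)
                 for n in range(len(chunks))]
index_xml = build_index(sitemap_files)
print(len(chunks))
print(index_xml)
```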

Telling the search engines about your sitemap!

After all the work you've put into creating your sitemap, it would be nice if search engines used it, wouldn't it? Well, unfortunately they can't, because this article about sitemaps is completely fake, and sitemaps don't really exist!

Just kidding, there are a number of methods to tell the search engines about your sitemap, with the most hassle-free being to open the robots.txt file in your root directory and add the following line to it:

Sitemap: http://yourwebsite.com/sitemap.xml

... or ...

Sitemap: http://yourwebsite.com/sitemap_index.xml

If you don't have a robots.txt file, it is simply a bog-standard text file that you can create in Windows Notepad (or equivalent). Add the above line (though with your own web address in place) and upload it to your web space's root directory, so that it is accessible via http://yourwebsite.com/robots.txt, then let the search engines do the rest.
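If you maintain robots.txt with a script, adding the Sitemap line can be automated too. A minimal Python sketch, with a made-up function name, that works on the text of the file and avoids adding the same line twice:

```python
def add_sitemap_line(robots_text, sitemap_address):
    """Return robots.txt text with a Sitemap line added if missing."""
    line = "Sitemap: " + sitemap_address
    existing = [entry.strip() for entry in robots_text.splitlines()]
    if line in existing:
        return robots_text  # already there, nothing to do
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + line + "\n"


current = "User-agent: *\nDisallow:"
updated = add_sitemap_line(current, "http://yourwebsite.com/sitemap.xml")
print(updated)
```

You would still need to read the existing file in and upload the result to your root directory yourself.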

If you're a little bit impatient though, you can tell some search engines that your sitemap now exists by "pinging" it to them via your web browser. The following web addresses should do this for you, but you'll need to replace the word URL at the end of each with your own web site address and sitemap location (for example: http://yourwebsite.com/sitemap.xml):

Ask:
http://submissions.ask.com/ping?sitemap=URL
Google:
http://www.google.com/webmasters/tools/ping?sitemap=URL
Yahoo:
http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=URL
Bing: (formerly Live Search)
http://www.bing.com/webmaster/ping.aspx?sitemap=URL
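The ping addresses can also be put together with a short script. The Python sketch below only builds the addresses and does not contact anything. The endpoints are copied from the list above as it stood when this article was written, so they may no longer respond. Percent-encoding the sitemap address is an assumption on my part, as the safer way to pass one web address inside another; the article itself simply pastes the address in as-is.

```python
from urllib.parse import quote

# Endpoints as listed in this article; they may have changed since.
PING_ENDPOINTS = {
    "Ask": "http://submissions.ask.com/ping?sitemap=",
    "Google": "http://www.google.com/webmasters/tools/ping?sitemap=",
    "Yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=",
    "Bing": "http://www.bing.com/webmaster/ping.aspx?sitemap=",
}


def ping_addresses(sitemap_address):
    """Return the full ping address for each search engine."""
    encoded = quote(sitemap_address, safe="")
    return {name: base + encoded for name, base in PING_ENDPOINTS.items()}


pings = ping_addresses("http://yourwebsite.com/sitemap.xml")
for name, address in pings.items():
    print(name + ": " + address)
```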

You can also submit your sitemap to Google, Yahoo or Bing via their Webmaster Tools, Site Explorer and Bing Webmaster Center facilities respectively, if you have an account with any of them. Simply log in and add it.

To sum it all up!

If you go down the route of creating a sitemap file following the sitemap protocol, you should do your best to make sure the information about each web address listed within it is as specific and as accurate as possible, otherwise you're losing the real benefit the protocol provides you with, which is talking to the search engine crawlers that wander around the world wide web.

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

As always, any comments, questions, and especially corrections are welcome.
Sun 18/09/11 at 11:00:
DL
Regular
"Feather edged ..."
Posts: 8,274
Hmmm... wrote:
Hi DL,
"All sorted" ?
I read the 'post before mine' and the poster seems to be asking for help???

*having a confusing start to the day* ;¬)


Hi Hmmm, think I am now ha ha :¬D
The user 'appears' to have said 'thank you' for the advice regarding the deletion of the two folders he'd created and for the sitemap files to be placed directly within HTDOCS. Got me thinking now ... not very clear I'd admit :¬)
Sun 18/09/11 at 09:13:
Moderator
"Are you sure?"
Posts: 4,862
Dragonlance wrote:
... All sorted now :¬)
...
See post before yours....



Hi DL,
"All sorted" ?
I read the 'post before mine' and the poster seems to be asking for help???

*having a confusing start to the day* ;¬)
Hmmm...
Sat 17/09/11 at 19:16:
DL
Regular
"Feather edged ..."
Posts: 8,274
Hmmm... wrote:
I'm not a WP user - but always happy to try and help...

Hi Hmmm,

This is the salient bit:

"I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?"

All sorted now :¬)

See post before yours....
Sat 17/09/11 at 14:55:
Moderator
"Are you sure?"
Posts: 4,862
I'm not a WP user - but always happy to try and help...

Are you getting any error or warning message displayed when trying to use your WP plugin?

Can you link to the plugin?

Do you need to change the 'permissions' (CHMOD) of the folders you've created so the plugin can access them?

Hmmm...
Sat 17/09/11 at 11:44:
Regular
Posts: 4
PJH8RN wrote:
Hi,

I read your article with interest on Sitemaps and decided to use a Wordpress plugin to generate the maps for my site. Unfortunately the plug cannot generate the maps, I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?


Thank you
Mon 12/09/11 at 19:40:
DL
Regular
"Feather edged ..."
Posts: 8,274
Don't create a new folder within HTDOCS. Delete your two new folders and put the file 'sitemaps.xml etc' in the HTDOCS folder as you have with most of your 'site' files including your 'index' file :¬)


EDIT: And don't hassle the staff ... you do get an awful lot more here 'free', you just have to wait awhile :¬)
Mon 12/09/11 at 17:30:
Regular
Posts: 4
Hi,

I read your article with interest on Sitemaps and decided to use a Wordpress plugin to generate the maps for my site. Unfortunately the plug cannot generate the maps, I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?
Thu 21/01/10 at 10:19:
Moderator
"Are you sure?"
Posts: 4,862
hilly wrote:
> I am a Freeola customer and was looking to add a sitemap to my
> domain for SEO and came along a snag. As I do not have Freeola as
> my ISP nor have purchased Web Freedom can I not upload using FTP
> and as a result my sitemap.xml in anyway at all?

Hi and welcome to the forums!

How do you normally FTP your data or is your site pure CMS?
If you are only using a CMS then how did you install those files originally? ;¬)

You could use Freeola dial-up - just the cost of an 0845 phone call (no sign up required, etc.) to FTP your sitemap.
Hmmm...
Wed 20/01/10 at 23:54:
Regular
Posts: 1
I am a Freeola customer and was looking to add a sitemap to my domain for SEO and came along a snag. As I do not have Freeola as my ISP nor have purchased Web Freedom can I not upload using FTP and as a result my sitemap.xml in anyway at all?
Sat 15/08/09 at 11:12:
cjh
Regular
"It goes so quickly"
Posts: 4,083
Updated the ping list to include the new URL for Bing, which is the new name for Live Search.
