Viewing Thread:
"Use a sitemap to tell Search Engines about your web pages!"

Sun 02/09/07 at 23:10
Regular
"It goes so quickly"
Posts: 4,083
Use a sitemap to tell Search Engines about your web pages!

As a Freeola customer making use of your Freeola Web Space, it’s only natural that you’ll want as many people as possible to visit your web site. Statistically speaking, most visitors arrive from search engine result pages (SERPs for short), so it is to your benefit that those search engines know about all your web pages.

A sitemap is often thought of as a page on a web site that literally lists all the other available pages on that particular site. More recently, though, a sitemap is also an XML text file that helps search engines crawl those pages in a more efficient manner.

The sitemap protocol is currently supported by its inventor, Google, as well as Microsoft (for Bing), Yahoo and Ask Jeeves.

This article assumes the reader is aware of HTML coding, and is able to pick up on the XML type of mark-up quickly.

A sitemap isn’t just for listing all my web pages?

While many assume the sitemap protocol is purely for listing all your web pages for search engines to find, the main benefit of a sitemap is actually to assist the search engine in a more meaningful way.

Search engines have been around for years, and are more than capable of sniffing out any web pages that exist on the Internet. This is mostly done by following links on your home page to other pages on your web site, which may link to more pages on your web site, and so on. As your web site will more than likely include a link in one form or another to every other page on your web site, using the sitemap protocol to simply re-list these web addresses is unlikely to make any difference to the way your site is crawled.

This is because the search engine will likely have already found and made a record of all your web pages when it first found your web site, either after you submitted it, or someone, somewhere, linked to it via their own site.

The main point of the sitemap protocol is to provide search engines with a little extra information, or “meta data”, about your web site’s pages before they come crawling. Search engines may use this additional information in a variety of ways, but the main aim is to crawl web sites more efficiently and effectively.

A web crawler, in its basic form from all search engine companies, comes along to your web site, scans it, and then lists it. But with literally millions upon millions of web pages existing, this can be a hell of a task, even for today’s super computers, so to cut down on the workload, and to protect web site bandwidth, the search engine web crawlers may not scan over your entire web site in one go.

This benefits everyone in many respects: a search engine is able to crawl and list a larger number of web sites in a shorter amount of time, and if your web host caps the bandwidth you can use, you won’t want too much of it being eaten up by web crawlers.

However, there is one downside, and quite a major one to you, which is that your web site listings may not be up to date, or may not be listed at all, because the web crawlers only scanned 20 of your 65 pages, and may not come back for a month or two.

So, how do I make my site more crawl-able?

This is where the sitemap protocol can come in handy, for you and search engines alike, as it enables you to give details about your web pages that the search engines can use before sending out the crawlers. At the time of writing, the additional information that a sitemap can provide for each web address is:

1. The date the web page was last updated (modified).
2. How often you envision the web page will be updated.
3. How you rate the web page compared to others on your web site.

Generating your Sitemap file!

A sitemap file is written in XML mark-up, which is similar to the way that HTML is laid out. Creating a sitemap can be done by hand, typed out in Notepad, or it can be automatically generated. While typing out a full sitemap sounds dull and tiresome, an automatically generated one can be inaccurate, so a mixture of both is probably the best method.

There are some web sites that will create a sitemap for you, such as www.sitemapdoc.com or www.xml-sitemaps.com, but as mentioned above, these may not be as accurate as they could be, especially in terms of the change frequency and priority settings, as it’s difficult to automatically detect which pages you (the webmaster) rate above others, or how often you plan on changing them.

Using the above linked services can cut down on the time it takes to initially put your sitemap together though, especially if you have a large number of pages on your site. Simply go to one of the sitemap generators above, add your web address and download the sitemap provided, then open it up and alter the settings as you see fit.

Sitemap

A sitemap example file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2007-09-02</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/aboutme.htm</loc>
    <lastmod>2007-09-02T20:38:54+00:00</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/contact.htm</loc>
  </url>
</urlset>

Using one of the online generators linked above would create something similar, but based on your own web site. Once downloaded, you could then manually alter some of the settings to better suit your own web site. If you only have a small site, you may just want to copy and paste the example above, and start from scratch.

The information for each one of your web pages needs to be enclosed within the [B]<url>[/B] ... [B]</url>[/B] XML elements, with each individual value being enclosed within an element of its own.

While the focus of this article is to include all the available information, the [B]<lastmod>[/B], [B]<changefreq>[/B], and [B]<priority>[/B] elements are optional; only the [B]<loc>[/B] element is required for each [B]<url>[/B] block you make. If you automatically generate your sitemap file, but don’t want to manually alter all your web page listings, feel free to remove the optional elements from the less important ones, as the second and third blocks in the example above show.
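
If you would rather script the file than type it, the structure described above can be sketched in a few lines of Python. This is only an illustration, not part of the sitemap protocol: the function name build_sitemap and the dictionary keys are my own choices, and the pages listed are the placeholder addresses from the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dicts.

    Each dict must contain "loc"; "lastmod", "changefreq" and
    "priority" are optional and are left out when absent.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for name in ("lastmod", "changefreq", "priority"):
            if name in page:
                ET.SubElement(url, name).text = str(page[name])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages, mirroring the example file above.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2007-09-02",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.example.com/contact.htm"},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

ElementTree escapes special characters in the text for you, so addresses can be passed in exactly as they appear in the browser.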

Web addresses that contain the [B]&[/B] symbol need this symbol to be written as [B]&amp;[/B], for example, the web address
http://www.example.com/index.php?page=2[B]&[/B]size=5
would need to be written as
http://www.example.com/index.php?page=2[B]&amp;[/B]size=5
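
If you are building the file with a script, the replacement above can be done with Python's standard library rather than by hand. A minimal sketch follows; the helper name xml_escape_url is just an illustrative choice. Note that it should only be applied to a raw address, since running it twice would escape the ampersand a second time.

```python
from xml.sax.saxutils import escape

def xml_escape_url(url):
    """Escape the characters that XML treats specially in a web address."""
    # escape() handles &, < and > by default; quotes are added explicitly.
    return escape(url, {'"': "&quot;", "'": "&apos;"})

raw = "http://www.example.com/index.php?page=2&size=5"
print(xml_escape_url(raw))
```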

As you can see, there isn’t really a lot to it.

1: Why do accurate modified dates matter?

Accurately indicating when a particular web page was last modified enables the search engine to skip pages that it has already crawled, saving your web site bandwidth and web server processing. While Freeola doesn’t explicitly set any limits on how much bandwidth your site can use, if you have a large site, and many visitors, you don’t really want search engines taking up any bandwidth or server processing if they already have the information about that page.

Search engines may also not crawl your entire web site at once, so enabling them to skip pages they have already crawled means your web site search results will be more up to date. If for example a web crawler comes along with a limit of 10 web pages from your site that it’s going to scan, and it turns out that 8 out of those 10 are already listed and up to date, then those 8 crawler scans are wasted, whereas if it already knows which pages haven’t been changed since it last came crawling, it’ll use that limit of 10 pages to scan newer content from your web site.

If your sitemap lists 12 pages as out of date, the crawler may even raise its limit of 10 in this instance to ensure your site listings are fully up to date.
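
To make the reasoning above concrete, here is a toy sketch of a crawler spending a limited number of scans only on pages that look changed. It is purely an illustration of the idea and not how any real search engine works; the function name, the data layout and the dates are all made up.

```python
from datetime import date

def pages_to_crawl(entries, last_crawled, limit):
    """Pick up to `limit` pages that look changed since the last crawl.

    entries: list of (loc, lastmod) pairs, where lastmod is a date or None.
    last_crawled: dict mapping loc to the date it was last scanned.
    """
    stale = []
    for loc, lastmod in entries:
        seen = last_crawled.get(loc)
        # Never seen, no lastmod given, or modified since the last visit.
        if seen is None or lastmod is None or lastmod > seen:
            stale.append(loc)
    return stale[:limit]

entries = [
    ("http://www.example.com/", date(2007, 9, 2)),
    ("http://www.example.com/aboutme.htm", date(2007, 6, 1)),
    ("http://www.example.com/contact.htm", None),
]
last_crawled = {loc: date(2007, 8, 1) for loc, _ in entries}
queue = pages_to_crawl(entries, last_crawled, limit=10)
print(queue)
```

In this made-up case the About Me page is skipped, because its last modified date is older than the last crawl, while the page with no date at all still has to be scanned.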

The format for this setting is [B]year-month-day[/B], in the form of [B]YYYY-MM-DD[/B]; for example, today would be written as [B]2007-09-02[/B]. If you wish to add the time as well, you can do so by appending the capital letter [B]T[/B], followed by the time in [B]hours:minutes:seconds+TZ*[/B], in the form of [B]HH:MM:SS+HH:MM[/B]; for example, today at 8:15pm would be written as [B]2007-09-02T20:15:00+00:00[/B].

*TZ: the timezone offset relative to UTC.

If you were using a timezone different to UTC, then you would need to indicate the number of hours ahead / behind that you were, for example, [B]2007-09-02T20:15:00+03:00[/B] or [B]2007-09-02T20:15:00-05:00[/B].

The value for this setting is enclosed within the [B]<lastmod>[/B] element.
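
Both forms described above can be produced with Python's datetime module. The sketch below assumes you already know when the page was changed; the helper names lastmod_date and lastmod_datetime are only illustrative.

```python
from datetime import date, datetime, timedelta, timezone

def lastmod_date(d):
    """Format a date as YYYY-MM-DD."""
    return d.isoformat()

def lastmod_datetime(dt):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SS+HH:MM."""
    if dt.tzinfo is None:
        raise ValueError("the datetime needs a timezone offset")
    return dt.isoformat(timespec="seconds")

utc_time = datetime(2007, 9, 2, 20, 15, 0, tzinfo=timezone.utc)
behind = datetime(2007, 9, 2, 20, 15, 0,
                  tzinfo=timezone(timedelta(hours=-5)))

print(lastmod_date(date(2007, 9, 2)))   # 2007-09-02
print(lastmod_datetime(utc_time))       # 2007-09-02T20:15:00+00:00
print(lastmod_datetime(behind))         # 2007-09-02T20:15:00-05:00
```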

2: Why does an accurate change frequency matter?

Including how often a web page is likely to be changed can enable the search engine to pop along to your site to scan a single page more often than it may do others. If your sitemap indicates that your home page is updated daily, but the rest of your site monthly, you may find it comes along more often to keep that page up to date in its search results, while not necessarily scanning any other pages at the time.

The available values for this setting are either [B]always[/B], [B]hourly[/B], [B]daily[/B], [B]weekly[/B], [B]monthly[/B], [B]yearly[/B] or [B]never[/B].

The value for this setting is enclosed within the [B]<changefreq>[/B] element.
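
If you edit the file by hand it is easy to mistype one of these values, so a small check can help. This is a minimal sketch using the seven values listed above; the function name check_changefreq is a hypothetical helper, not something the protocol defines.

```python
CHANGEFREQ_VALUES = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)

def check_changefreq(value):
    """Return the value in lower case, or raise if it is not allowed."""
    cleaned = value.strip().lower()
    if cleaned not in CHANGEFREQ_VALUES:
        raise ValueError(f"{value!r} is not a valid change frequency")
    return cleaned

print(check_changefreq("Daily"))
```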

3: What does the Priority setting indicate?

The priority setting can be used to tell the search engine which pages you feel are more important or useful than others on your web site, which can be helpful if after scanning your pages, the search engine finds it can’t really decide which is better.

A good example for this could be Freeola’s web forums, as you have three layers to get to the threads, which are:

1. The Freeola chat home page, which lists the available forums.
2. The listings of the threads within each forum.
3. The individual threads.

Now, you may consider the priority goes in that order - the main page to the forums, the forums with the threads, and then the thread with the content, but you would more than likely benefit from reversing that theory, because the thread with the content is the most likely to contain the information a user is looking for when searching for something.

If someone had clicked a link from a search engine results page into Freeola, and was presented with a list of threads, they may not wish to take the time looking through the long list and click back, whereas if the link takes them directly to a thread that is relevant to what they were looking for, it’s more than likely they will not feel the need to click back and move on to the next web site.

Using this type of thinking, you can advise a search engine which pages you feel are the more useful / meaningful of your web site, so that if after scanning your pages, it feels two or more are the same, you can use the priority setting to hint at which you feel are the more important.

Naturally of course, this goes out the window if you mark them all as being the exact same level of importance.

The available values for this setting range from [B]0.0[/B] up to [B]1.0[/B], with the default being in the middle, [B]0.5[/B].

The value for this setting is enclosed within the [B]<priority>[/B] element.
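
As a rough sketch of the forum example above, the three layers could be given priorities like this. The numbers and page-type names are made up purely for illustration; only the 0.0 to 1.0 range and the 0.5 default come from the protocol.

```python
def format_priority(value):
    """Check a priority is within 0.0 to 1.0 and format it for the sitemap."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise ValueError(f"priority {value!r} is outside 0.0 to 1.0")
    return f"{number:.1f}"

# Hypothetical ratings: individual threads matter most, the home page least.
FORUM_PRIORITIES = {
    "thread": 0.8,
    "forum_listing": 0.5,
    "chat_home": 0.3,
}

for page_type, rating in FORUM_PRIORITIES.items():
    print(page_type, format_priority(rating))
```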

Sitemap Index

A sitemap index file is used if you feel the need to create multiple sitemap files for your web site, either to make maintenance easier, or because you have more than the 50,000 URL limit of a single sitemap file. An example looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
    <lastmod>2007-09-02T20:35:15+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml</loc>
    <lastmod>2007-09-01</lastmod>
  </sitemap>
</sitemapindex>

Again, the [B]<lastmod>[/B] element is optional, but you'll need a [B]<loc>[/B] for each sitemap you have for your web site.
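
For a site large enough to need this, splitting the addresses and writing the index could be scripted along these lines. It is only a sketch: the function names, the sitemap1.xml style file names and the generated page addresses are assumptions for the sake of the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of addresses into groups that each fit in one sitemap."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_sitemap_index(sitemap_locs):
    """Return sitemap index XML listing the given sitemap file addresses."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical site with 120,000 pages.
urls = [f"http://www.example.com/page{n}.htm" for n in range(120000)]
chunks = chunk_urls(urls)
locs = [f"http://www.example.com/sitemap{n}.xml"
        for n in range(1, len(chunks) + 1)]
index_xml = build_sitemap_index(locs)
print(len(chunks), "sitemap files needed")
```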

Telling the search engines about your sitemap!

After all the work you’ve put into creating your sitemap, it would be nice if search engines used it, wouldn’t it? Well, unfortunately they can’t, because this article about sitemaps is completely fake, and sitemaps don’t really exist!

Just kidding, there are a number of methods to tell the search engines about your sitemap, with the most hassle-free being to use the robots.txt file in your root directory, and adding the following line to it:

Sitemap: http://yourwebsite.com/sitemap.xml

... or ...

Sitemap: http://yourwebsite.com/sitemap_index.xml

If you don’t have a robots.txt, it is simply a bog-standard text file that you can create in Windows Notepad (or equivalent). Add the above line (with your own web address in place) and upload the file to your web space’s root directory, so that it is accessible via http://yourwebsite.com/robots.txt, then let the search engines do the rest.

If you’re a little bit impatient though, you can tell some search engines that your sitemap now exists by “pinging” it to them via your web browser. The following web addresses should do this for you, but you’ll need to replace URL at the end of each one with your own web site address and sitemap location (for example: http://yourwebsite.com/sitemap.xml):
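
If you maintain robots.txt with a script, adding the Sitemap line can be sketched as below. The function works on the text of the file rather than the file itself, and its name, add_sitemap_line, is just an illustrative choice; reading and uploading the file is left to you.

```python
def add_sitemap_line(robots_text, sitemap_url):
    """Return robots.txt content with a Sitemap line, added only if missing."""
    line = f"Sitemap: {sitemap_url}"
    existing = [entry.strip() for entry in robots_text.splitlines()]
    if line in existing:
        return robots_text
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + line + "\n"

current = "User-agent: *\nDisallow:"
updated = add_sitemap_line(current, "http://yourwebsite.com/sitemap.xml")
print(updated)
```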

Ask:
http://submissions.ask.com/ping?sitemap=URL
Google:
http://www.google.com/webmasters/tools/ping?sitemap=URL
Yahoo:
http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=URL
Bing: (formerly Live Search)
http://www.bing.com/webmaster/ping.aspx?sitemap=URL
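
The ping addresses above can also be assembled with a short script, which takes care of encoding your sitemap address so it is safe to use as a query value. This sketch only builds the addresses from the list in this article and does not contact any search engine; whether each service responds is not something it checks, and the names used are illustrative.

```python
from urllib.parse import quote

# Ping addresses exactly as listed above, minus the URL placeholder.
PING_ENDPOINTS = {
    "Ask": "http://submissions.ask.com/ping?sitemap=",
    "Google": "http://www.google.com/webmasters/tools/ping?sitemap=",
    "Yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=",
    "Bing": "http://www.bing.com/webmaster/ping.aspx?sitemap=",
}

def ping_urls(sitemap_url):
    """Return a dict of search engine name to full ping address."""
    encoded = quote(sitemap_url, safe="")
    return {name: base + encoded for name, base in PING_ENDPOINTS.items()}

for name, address in ping_urls("http://yourwebsite.com/sitemap.xml").items():
    print(name, address)
```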

You can also submit your sitemap to Google, Yahoo or Bing via their Webmaster Tools, Site Explorer and Bing Webmaster Center facilities respectively, if you have an account with any of them. Simply log in and add it.

To sum it all up!

If you go down the route of creating a sitemap file following the sitemap protocol, you should do your best to make sure the information about each web address listed within it is as specific and as accurate as possible, otherwise you’re losing the real benefit the protocol provides you with, which is talking to the search engine crawlers that wander around the World Wide Web.

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

As always, any comments, questions, and especially corrections are welcome.
Sun 18/09/11 at 11:00
Regular
"Feather edged ..."
Posts: 8,536
Hmmm... wrote:
Hi DL,
"All sorted" ?
I read the 'post before mine' and the poster seems to be asking for help???

*having a confusing start to the day* ;¬)


Hi Hmmm, think I am now ha ha :¬D
The user 'appears' to have said 'thank you' for the advice regarding the deletion of the two folders he'd created and for the sitemap files to be placed directly within HTDOCS. Got me thinking now ... not very clear I'd admit :¬)
Sun 18/09/11 at 09:13
Moderator
"Are you sure?"
Posts: 5,000
Dragonlance wrote:
... All sorted now :¬)
...
See post before yours....



Hi DL,
"All sorted" ?
I read the 'post before mine' and the poster seems to be asking for help???

*having a confusing start to the day* ;¬)
[s]Hmmm...[/s]
Sat 17/09/11 at 19:16
Regular
"Feather edged ..."
Posts: 8,536
Hmmm... wrote:
I'm not a WP user - but always happy to try and help...

Hi Hmmm,

This is the salient bit:

"I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?"

All sorted now :¬)

See post before yours....
Sat 17/09/11 at 14:55
Moderator
"Are you sure?"
Posts: 5,000
I'm not a WP user - but always happy to try and help...

Are you getting any error or warning message displayed when trying to use your WP plugin?

Can you link to the plugin?

Do you need to change the 'permissions' (CHMOD) of the folders you've created so the plugin can access them?

[s]Hmmm...[/s]
Sat 17/09/11 at 11:44
Regular
Posts: 4
PJH8RN wrote:
Hi,

I read your article with interest on Sitemaps and decided to use a Wordpress plugin to generate the maps for my site. Unfortunately the plug cannot generate the maps, I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?


Thank you
Mon 12/09/11 at 19:40
Regular
"Feather edged ..."
Posts: 8,536
Don't create a new folder within HTDOCS. Delete your two new folders and put the file 'sitemaps.xml etc' in the HTDOCS folder as you have with most of your 'site' files including your 'index' file :¬)


EDIT: And don't hassle the staff ... you do get an awful lot more here 'free' , you just have to wait awhile :¬)
Mon 12/09/11 at 17:30
Regular
Posts: 4
Hi,

I read your article with interest on Sitemaps and decided to use a Wordpress plugin to generate the maps for my site. Unfortunately the plug cannot generate the maps, I am using XML sitemap generator 3.2.5 and have created 2 new folders in HT docs to accept the generated code (sitemaps.xml and sitemaps.xml.gz). Can anyone help in response please?
Thu 21/01/10 at 10:19
Moderator
"Are you sure?"
Posts: 5,000
hilly wrote:
> I am a Freeola customer and was looking to add a sitemap to my
> domain for SEO and came along a snag. As I do not have Freeola as
> my ISP nor have purchased Web Freedom can I not upload using FTP
> and as a result my sitemap.xml in anyway at all?

Hi and welcome to the forums!

How do you normally FTP your data or is your site pure CMS?
If you are only using a CMS then how did you install those files originally? ;¬)

You could use Freeola dial-up - just the cost of an 0845 phone call (no sign up required, etc.) to FTP your sitemap.
[s]Hmmm...[/s]
Wed 20/01/10 at 23:54
Regular
Posts: 1
I am a Freeola customer and was looking to add a sitemap to my domain for SEO and came along a snag. As I do not have Freeola as my ISP nor have purchased Web Freedom can I not upload using FTP and as a result my sitemap.xml in anyway at all?
Sat 15/08/09 at 11:12
Regular
"It goes so quickly"
Posts: 4,083
Updated the ping list to include the new URL for Bing, which is the new name for Live Search.
