Improve the structure of your URLs

Creating descriptive categories and filenames for the documents on your website not only helps you keep your site better organized, it can also lead to better crawling of your documents by search engines. It also creates easier, “friendlier” URLs for those who want to link to your content. Visitors may be intimidated by extremely long and cryptic URLs that contain few recognizable words.

Some users might link to your page using the URL of that page as the anchor text. If your URL contains relevant words, this provides users and search engines with more information about the page than an ID or oddly named parameter would.

Lastly, remember that the URL to a document is displayed as part of a search result in Google, below the document’s title and snippet. Like the title and snippet, words in the URL on the search result appear in bold if they appear in the user’s query.

Google is good at crawling all types of URL structures, even if they’re quite complex, but spending the time to make your URLs as simple as possible for both users and search engines can help. Some webmasters try to achieve this by rewriting their dynamic URLs to static ones; while Google is fine with this, we’d like to note that this is an advanced procedure and, if done incorrectly, could cause crawling issues with your site.

URL structure

A site’s URL structure should be as simple as possible. Consider organizing your content so that URLs are constructed logically and in a manner that is most intelligible to humans (when possible, readable words rather than long ID numbers). For example, if you’re searching for information about aviation, a URL like http://en.wikipedia.org/wiki/Aviation will help you decide whether to click that link. A URL like http://www.example.com/index.php?id_sezione=360&sid=3a5ebc944f41daa6f849f730f1 is much less appealing to users.

Consider using punctuation in your URLs. The URL http://www.example.com/green-dress.html is much more useful to us than http://www.example.com/greendress.html. We recommend that you use hyphens (-) instead of underscores (_) in your URLs.

Example:

http://shopping.thankgoditsholiday.com/asin_B000UX9YJ0_0_0_0_Garmin-n%C3%BCvi-760-4.3-Inch-Widescreen-Bluetooth-Portable-GPS-Automobile-Navigator.htm

asin_B000UX9YJ0_0_0_0_ < This part is not important, so I use _ as the separator.

Garmin-n%C3%BCvi-760-4.3-Inch-Widescreen-Bluetooth-Portable-GPS-Automobile-Navigator < This part is the keyword target, so I use - as the separator.
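
As a rough illustration of the hyphen recommendation, here is a minimal Python sketch that turns a page title into a hyphen-separated filename. The function name slugify and the choice to lower-case the text and drop accents (so "nüvi" becomes "nuvi") are assumptions made for this example, not something the guidelines require.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL slug."""
    # Decompose accented characters, then drop the combining marks (ü -> u).
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify("Green Dress") + ".html")
    print(slugify("Garmin nüvi 760 4.3-Inch Widescreen") + ".htm")
```

Note that this sketch also turns underscores and dots into hyphens, so it would not reproduce the URL above exactly.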

Overly complex URLs, especially those containing multiple parameters, can cause problems for crawlers by creating unnecessarily high numbers of URLs that point to identical or similar content on your site. As a result, Googlebot may consume much more bandwidth than necessary, or may be unable to completely index all the content on your site.

Common causes of this problem
Unnecessarily high numbers of URLs can be caused by a number of issues. These include:

  • Additive filtering of a set of items. Many sites provide different views of the same set of items or search results, often allowing the user to filter this set using defined criteria (for example: show me hotels on the beach). When filters can be combined in an additive manner (for example: hotels on the beach and with a fitness center), the number of URLs (views of data) in the site explodes. Creating a large number of slightly different lists of hotels is redundant, because Googlebot needs to see only a small number of lists from which it can reach the page for each hotel. For example:
    • Hotel properties at “value rates”:
      http://www.example.com/hotel-search-results.jsp?Ne=292&N=461
    • Hotel properties at “value rates” on the beach:
      http://www.example.com/hotel-search-results.jsp?Ne=292&N=461+4294967240
    • Hotel properties at “value rates” on the beach and with a fitness center:
      http://www.example.com/hotel-search-results.jsp?Ne=292&N=461+4294967240+4294967270
  • Dynamic generation of documents. This can result in small changes because of counters, timestamps, or advertisements.
  • Problematic parameters in the URL. Session IDs, for example, can create massive amounts of duplication and a greater number of URLs.
  • Sorting parameters. Some large shopping sites provide multiple ways to sort the same items, resulting in a much greater number of URLs. For example:
    http://www.example.com/results?search_type=search_videos&search_query=tpb&search_sort=relevance&search_category=25
  • Irrelevant parameters in the URL, such as referral parameters. For example:
    http://www.example.com/search/noheaders?click=6EE2BF1AF6A3D705D5561B7C3564D9C2&clickPage=OPD+Product+Page&cat=79
    http://www.example.com/discuss/showthread.php?referrerid=249406&threadid=535913
    http://www.example.com/products/products.asp?N=200063&Ne=500955&ref=foo%2Cbar&Cn=Accessories.
  • Calendar issues. A dynamically generated calendar might generate links to future and previous dates with no restrictions on start or end dates. For example:
    http://www.example.com/calendar.php?d=13&m=8&y=2011
    http://www.example.com/calendar/cgi?2008&month=jan
  • Broken relative links. Broken relative links can often cause infinite spaces. Frequently, this problem arises because of repeated path elements. For example:
    http://www.example.com/index.shtml/discuss/category/school/061121/html/interview/category/health/070223/html/category/business/070302/html/category/community/070413/html/FAQ.htm
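
To make the additive-filtering explosion concrete, the sketch below enumerates every URL that combinable filters can produce. It reuses the filter IDs from the hotel example above; the function name filter_urls and the way the N parameter is assembled are assumptions for illustration only. With n optional filters there are 2^n distinct URLs for what is essentially the same list.

```python
from itertools import combinations
from typing import List, Sequence


def filter_urls(base_url: str, base_filter: str,
                optional_filters: Sequence[str]) -> List[str]:
    """List every URL reachable by combining the optional filters."""
    urls = []
    # Take 0, 1, 2, ... filters at a time, in every possible combination.
    for size in range(len(optional_filters) + 1):
        for combo in combinations(optional_filters, size):
            value = "+".join((base_filter,) + combo)
            urls.append(base_url + "&N=" + value)
    return urls


if __name__ == "__main__":
    base = "http://www.example.com/hotel-search-results.jsp?Ne=292"
    # Two optional filters (beach, fitness center) give 4 URLs.
    for url in filter_urls(base, "461", ["4294967240", "4294967270"]):
        print(url)
    # Ten optional filters would give 2**10 = 1024 URLs.
    print(len(filter_urls(base, "461", [str(i) for i in range(10)])))
```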

Steps to resolve this problem
To avoid potential problems with URL structure, we recommend the following:

  • Consider using a robots.txt file to block Googlebot’s access to problematic URLs. Typically, you should consider blocking dynamic URLs, such as URLs that generate search results, or URLs that can create infinite spaces, such as calendars. Using wildcard patterns (* and $) in your robots.txt file lets you block large numbers of URLs easily.
  • Wherever possible, avoid the use of session IDs in URLs. Consider using cookies instead.
  • Whenever possible, shorten URLs by trimming unnecessary parameters.
  • If your site has an infinite calendar, add a nofollow attribute to links to dynamically created future calendar pages.
  • Check your site for broken relative links.
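
One way to approach the "trim unnecessary parameters" step is sketched below using Python's standard urllib.parse module. The set IGNORED_PARAMS is a hypothetical list built from the parameter names in the examples above; which parameters are safe to drop depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list: parameters assumed not to change the page content.
IGNORED_PARAMS = {"sid", "sessionid", "referrerid", "ref", "click", "clickPage"}


def trim_url(url: str) -> str:
    """Return the URL without session, referral, and click-tracking parameters."""
    parts = urlsplit(url)
    # Keep the remaining parameters in their original order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return parts._replace(query=urlencode(kept)).geturl()


if __name__ == "__main__":
    print(trim_url(
        "http://www.example.com/discuss/showthread.php"
        "?referrerid=249406&threadid=535913"
    ))
```

Links generated by the site can then point at the trimmed form, so visitors and crawlers see one URL per page.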

Good practices for URL structure

Use words in URLs – URLs with words that are relevant to your site’s content and structure are friendlier for visitors navigating your site. Visitors remember them better and might be more willing to link to them.
Avoid:
• using lengthy URLs with unnecessary parameters and session IDs
• choosing generic page names like “page1.html”
• using excessive keywords like “baseball-cards-baseball-cards-baseballcards.htm”
Create a simple directory structure – Use a directory structure that organizes your content well and makes it easy for visitors to know where they are on your site. Try using your directory structure to indicate the type of content found at that URL.
Avoid:
• having deep nesting of subdirectories like “…/dir1/dir2/dir3/dir4/dir5/dir6/page.html”
• using directory names that have no relation to the content in them
Provide one version of a URL to reach a document – To prevent some users from linking to one version of a URL and others linking to a different version (this could split the reputation of that content between the URLs), focus on using and referring to one URL in the structure and internal linking of your pages. If you do find that people are accessing the same content through multiple URLs, setting up a 301 redirect from non-preferred URLs to the dominant URL is a good solution for this.
Avoid:
• having pages from subdomains and the root directory (e.g. “domain.com/page.htm” and “sub.domain.com/page.htm”) access the same content
• mixing www. and non-www. versions of URLs in your internal linking structure
• using odd capitalization of URLs (many users expect lower-case URLs and remember them better)
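
The 301 approach can be sketched as a small function that decides whether a requested URL should be redirected to the preferred version. Everything specific here is an assumption for illustration: the host names, the choice of the www. version as preferred, and the rule that all paths on the site are lower-case (which only holds if your server does not serve mixed-case paths).

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"   # assumed preferred version of the site
ALTERNATE_HOSTS = {"example.com"}    # assumed non-preferred host names


def preferred_url(url: str) -> str:
    """Map a requested URL to the single version the site wants linked."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in ALTERNATE_HOSTS:
        host = PREFERRED_HOST
    # Assumption: every path on this site is lower-case.
    path = parts.path.lower()
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def redirect_for(url: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) if the URL is non-preferred, otherwise None."""
    target = preferred_url(url)
    if target == url:
        return None
    return (301, target)


if __name__ == "__main__":
    print(redirect_for("http://example.com/Page.htm"))
    print(redirect_for("http://www.example.com/page.htm"))
```

In practice this rule usually lives in the web server or framework configuration rather than in application code; the sketch only shows the decision itself.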
