Welcome to the IMTalk - Internet Marketing & SEO Forum.

Thread: Robots txt

  1. #1
    sjeeta is offline IM & SEO Small Talker sjeeta will become famous soon enough
    Join Date
    Nov 2012
    Location
    Mauritius
    Posts
    564
    Thanks Given
    218
    Thanked 33 Times in 29 Posts

    Robots txt

    I understand that a robots.txt file is used to tell search engine spiders which content they may crawl.

    I want to know whether it is really important to have a robots.txt file.


    If yes, I need help creating one, because I am a layman on this subject.

    For example,

    My site: mysite(dot)com

    Links to pages of my site:
    mysite(dot)com/a.html
    mysite(dot)com/b.html
    mysite(dot)com/c.html
    mysite(dot)com/a/abcd.html
    mysite(dot)com/sitemap.xml

    I want to create a robots.txt file so that these links are crawled. Can anyone help me with creating it?

    Should the robots meta tag be <meta name="robots" content="noindex">

    or

    <meta name="robots" content="noindex,nofollow">

    or something else? Please specify.


    Do Yahoo, Bing and Google have their own user-agents?
    Last edited by Phoenyx; 12-10-2012 at 09:03 AM.

  2. #2
    zibidic is offline IM & SEO Quiet One zibidic is on a distinguished road
    Join Date
    Nov 2012
    Posts
    11
    Thanks Given
    1
    Thanked 0 Times in 0 Posts
    A robots.txt file can have a huge impact on your blog's traffic and search engine rank. This is an SEO-optimized WordPress robots.txt file. Keep in mind that if you mess up the robots.txt file by blocking too much, you could lose all of your rank.

    seo robots.txt

    User-agent: *
    # disallow all files in these directories
    Disallow: /cgi-bin/
    Disallow: /z/j/
    Disallow: /z/c/
    Disallow: /stats/
    Disallow: /dh_
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /contact/
    Disallow: /tag/
    Disallow: /wp-content/b
    Disallow: /wp-content/p
    Disallow: /wp-content/themes/askapache/4
    Disallow: /wp-content/themes/askapache/c
    Disallow: /wp-content/themes/askapache/d
    Disallow: /wp-content/themes/askapache/f
    Disallow: /wp-content/themes/askapache/h
    Disallow: /wp-content/themes/askapache/in
    Disallow: /wp-content/themes/askapache/p
    Disallow: /wp-content/themes/askapache/s
    Disallow: /trackback/
    Disallow: /*?*
    Disallow: */trackback/

    User-agent: Googlebot
    # disallow all files ending with these extensions
    Disallow: /*.php$
    Disallow: /*.js$
    Disallow: /*.inc$
    Disallow: /*.css$
    Disallow: /*.gz$
    Disallow: /*.cgi$
    Disallow: /*.wmv$
    Disallow: /*.png$
    Disallow: /*.gif$
    Disallow: /*.jpg$
    Disallow: /*.xhtml$
    Disallow: /*.php*
    Disallow: */trackback*
    Disallow: /*?*
    Disallow: /z/
    Disallow: /wp-*
    Allow: /wp-content/uploads/

    # allow google image bot to search all images
    User-agent: Googlebot-Image
    Allow: /*

    # allow adsense bot on entire site
    User-agent: Mediapartners-Google*
    Disallow: /*?*
    Allow: /z/
    Allow: /about/
    Allow: /contact/
    Allow: /wp-content/
    Allow: /tag/
    Allow: /manual/*
    Allow: /docs/*
    Allow: /*.php$
    Allow: /*.js$
    Allow: /*.inc$
    Allow: /*.css$
    Allow: /*.gz$
    Allow: /*.cgi$
    Allow: /*.wmv$
    Allow: /*.xhtml$
    Allow: /*.php*
    Allow: /*.gif$
    Allow: /*.jpg$
    Allow: /*.png$

    # disallow archiving site
    User-agent: ia_archiver
    Disallow: /

    # disable duggmirror
    User-agent: duggmirror
    Disallow: /
    http://googleblog.blogspot.com/2007/02/robots-exclusion-protocol.html
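
    On the question of user-agents: the major crawlers do announce their own tokens (Googlebot for Google, bingbot for Bing, Slurp for Yahoo), and a crawler that finds a group addressed to it follows that group instead of the `*` group. The sketch below illustrates this with Python's standard urllib.robotparser. The rules and the mysite.example host are made up for illustration and are not the WordPress sample above, since this parser does not understand the `*` and `$` wildcard patterns that sample relies on.

```python
from urllib import robotparser

# Hypothetical rules (not the sample above): one group for every crawler,
# one for Googlebot, and one that blocks the Internet Archive crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /tmp/

User-agent: ia_archiver
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text that is already in memory."""
    parser = robotparser.RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` (mysite.example is a placeholder)."""
    return parser.can_fetch(agent, "http://mysite.example" + path)


parser = build_parser(ROBOTS_TXT)
for agent in ("Googlebot", "bingbot", "Slurp", "ia_archiver"):
    for path in ("/private/page.html", "/tmp/page.html"):
        verdict = "allowed" if allowed(parser, agent, path) else "blocked"
        print(agent, path, verdict)
```

    Note that Googlebot is allowed into /private/ here: it has its own group, so the `*` rules no longer apply to it. Any rule meant for every crawler has to be repeated in each named group.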

  3. #3
    BigDim is offline IM & SEO Weak Jaw BigDim is on a distinguished road
    Join Date
    Nov 2012
    Posts
    206
    Thanks Given
    1
    Thanked 11 Times in 9 Posts
    A robots.txt file is very important for a website. The biggest hosting providers, like HostGator, create a robots.txt for your websites, so you don't need to do it yourself.


  5. #4
    sjeeta is offline IM & SEO Small Talker sjeeta will become famous soon enough
    Quote Originally Posted by zibidic View Post
    A robots.txt file can have a huge impact on your blog's traffic and search engine rank. This is an SEO-optimized WordPress robots.txt file. Keep in mind that if you mess up the robots.txt file by blocking too much, you could lose all of your rank.

    seo robots.txt



    http://googleblog.blogspot.com/2007/02/robots-exclusion-protocol.html


    I do not want the sample you sent me. I need help in creating mine as per the example in my thread description.

  6. #5
    wink0r is offline Moderator wink0r is a splendid one to behold
    Join Date
    Oct 2010
    Location
    East Coast, USA
    Posts
    2,183
    Thanks Given
    941
    Thanked 651 Times in 491 Posts
    The Web Robots Pages (www.robotstxt.org) is the authority on robots.txt files.


  8. #6
    zibidic is offline IM & SEO Quiet One zibidic is on a distinguished road
    I think you want something like this.
    (Copy the entire code into your "robots.txt" file; it is valid for all robots.)

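    User-agent: *
    # Allow rules come first so that first-match parsers honor them too
    Allow: /a.html
    Allow: /b.html
    Allow: /c.html
    Allow: /a/abcd.html
    Allow: /sitemap.xml
    # block everything else
    Disallow: /

    # the Sitemap line takes a full URL
    Sitemap: http://mysite(dot)com/sitemap.xml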
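
    Before uploading a whitelist like this, it may be worth checking it against the exact paths from the first post, which include /a/abcd.html and /sitemap.xml. The following is a minimal sketch using Python's standard urllib.robotparser. The rule text and the mysite.example host are assumptions for illustration. The Allow lines are placed before the Disallow line because this parser, like some older crawlers, stops at the first matching rule.

```python
from urllib import robotparser

# Assumed whitelist for the paths named in the first post.
RULES = """\
User-agent: *
Allow: /a.html
Allow: /b.html
Allow: /c.html
Allow: /a/abcd.html
Allow: /sitemap.xml
Disallow: /
"""


def crawlable_paths(rules, paths, agent="*"):
    """Map each path to True (crawlable) or False (blocked) for `agent`."""
    parser = robotparser.RobotFileParser()
    parser.parse(rules.splitlines())
    # mysite.example is a placeholder host, not the poster's real domain.
    return {p: parser.can_fetch(agent, "http://mysite.example" + p) for p in paths}


result = crawlable_paths(
    RULES,
    ["/a.html", "/b.html", "/c.html", "/a/abcd.html",
     "/sitemap.xml", "/d.html", "/abcd.html"],
)
for path, ok in result.items():
    print(path, "allowed" if ok else "blocked")
```

    If every page on the site should be crawled, none of this is needed: an empty "Disallow:" line under "User-agent: *" permits everything.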


    Google Says
    Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler.
    header.php meta seo trick

    Place this in your WordPress theme's header.php file. If the page is a single post, a page, or the home page, robots will index it and follow the links on it. Otherwise, search engines will not index the page but will still follow its links.

    <?php if(is_single() || is_page() || is_home()) { ?>
    <meta name="googlebot" content="index,noarchive,follow,noodp" />
    <meta name="robots" content="all,index,follow" />
    <meta name="msnbot" content="all,index,follow" />
    <?php } else { ?>
    <meta name="googlebot" content="noindex,noarchive,follow,noodp" />
    <meta name="robots" content="noindex,follow" />
    <meta name="msnbot" content="noindex,follow" />
    <?php }?>
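
    These meta tags belong in the <head> of each HTML page, not in robots.txt; the two mechanisms are separate. As a rough way to confirm which directives a page actually declares, here is a sketch using Python's standard html.parser. The class name, the function name, and the sample HTML are hypothetical and only serve the illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags."""

    def __init__(self, names=("robots", "googlebot", "msnbot")):
        super().__init__()
        self.names = names
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in self.names:
            content = attr.get("content") or ""
            self.directives[name] = [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def robots_directives(html):
    """Return {meta name: [directives]} for the robots-related meta tags."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page head, modeled on the "else" branch of the snippet above.
SAMPLE = """<html><head>
<meta name="googlebot" content="noindex,noarchive,follow,noodp" />
<meta name="robots" content="noindex,follow" />
</head><body></body></html>"""

directives = robots_directives(SAMPLE)
for name, values in directives.items():
    print(name, values)
```

    A page must remain crawlable for these tags to be seen at all: if robots.txt blocks the URL, the crawler never fetches the HTML and cannot read a noindex in it.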


    UNTIL THEN, SEE YOU NEXT TIME

  9. #7
    sjeeta is offline IM & SEO Small Talker sjeeta will become famous soon enough
    I am not using WordPress. Anyway, thank you to wink0r and zibidic.


 
