Controlling MSNBot
The MSN Search web crawler, MSNBot, lets website owners control which pages MSN Search indexes and how often MSNBot accesses a website. You can prevent MSNBot (the pain in the butt) and other standards-compliant crawlers from crawling a server, or from collecting information and links from specific pages on your website, by using a robots.txt file, meta tags, or both. This is fairly popular, because not everybody is happy with the annoying bot.

Note: If other sites link to your site, your site's URL and any text you include in HTML anchor tags may still be added to our index. However, your site content is not added to the index.

Use the robots.txt file to control access to your website or part of the server

To control how and when your website is crawled, create a robots.txt file in the top-level (root) directory of your website. In the robots.txt file, you can specify which web crawlers to allow or block. Note that while MSNBot complies with the standards for robots.txt, not all web crawlers do.

To conform to the Robots Exclusion Standard, MSNBot searches for a file named exactly robots.txt. Crawling and indexing restrictions may not work correctly if you name the file robot.txt.

Each time MSNBot crawls your website, it looks in your web server's root directory for a robots.txt file. If the file exists, MSNBot checks whether MSNBot is an allowed user agent and whether any crawling or indexing restrictions have been set. To set which web crawlers can access your website, name each crawler as the user-agent in your robots.txt file and list what it may not crawl in Disallow lines.

MSN Search also includes image searching provided by Picsearch. If you do not want your images indexed, you can block the Picsearch crawler, Psbot, in the same way.

Text strings in the robots.txt file are not case-sensitive.
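To see how such rules are read by a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content and the /private/ path are made-up examples, not MSN's own syntax table: they block Psbot everywhere, keep MSNBot out of one directory, and leave every other crawler unrestricted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only:
# - Psbot (Picsearch) is blocked from the whole site
# - MSNBot is kept out of /private/ but may crawl everything else
# - all other crawlers are unrestricted (an empty Disallow allows everything)
SAMPLE_ROBOTS_TXT = """\
User-agent: psbot
Disallow: /

User-agent: msnbot
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

checks = [
    ("msnbot", "/index.html"),
    ("msnbot", "/private/page.html"),
    ("psbot", "/images/logo.png"),
    ("someotherbot", "/private/page.html"),
]
for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent:13} {path:22} {verdict}")
```

This only shows how a standards-compliant parser interprets the file. A crawler that ignores robots.txt is not stopped by it.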
Restrict indexing and link crawling within your website

You can block MSNBot from crawling specific file types linked to your website by specifying MSNBot as the user-agent for a Disallow tag that specifies the file types to exclude.
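File-type rules are usually written with the widely used pattern extensions to robots.txt, where * matches any run of characters and a trailing $ anchors the end of the URL. The sketch below assumes MSNBot honors that syntax, so check the exact form against MSN Search's own documentation. The patterns and helper names are hypothetical. Python's built-in robots.txt parser does not handle these wildcards, so the matching is written out by hand.

```python
import re

# Hypothetical Disallow values for a "User-agent: msnbot" section.
# Assumption: the crawler supports '*' (any characters) and a trailing '$' (end of URL).
BLOCKED_PATTERNS = ["/*.pdf$", "/*.doc$"]


def disallow_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Disallow value using '*' and a trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: list) -> bool:
    """Return True if the path matches any Disallow pattern from its first character."""
    return any(disallow_pattern_to_regex(p).match(path) for p in patterns)


for path in ["/docs/manual.pdf", "/docs/manual.pdf?download=1", "/index.html"]:
    state = "blocked" if is_disallowed(path, BLOCKED_PATTERNS) else "allowed"
    print(path, "->", state)
```

Note that the $ anchor means a URL with a query string after .pdf is not matched by these patterns. Dropping the $ would block those URLs as well.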
You can allow MSNBot to crawl your website and still restrict access to specific web pages and documents by using the noindex and nofollow meta tags within the page code. To apply the restrictions to MSNBot only rather than to all crawlers, replace the name robots with msnbot in the meta tag. You can use each tag alone or combine both tags into a single meta tag.
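To check which of these restrictions a given page actually declares, a small sketch with Python's standard html.parser module can collect the directives from its meta tags. The class name, function name, and sample pages are invented for illustration. The code reads both the generic robots name and a crawler-specific name such as msnbot.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="<crawler>"> tags."""

    def __init__(self, crawler: str = "msnbot"):
        super().__init__()
        self.meta_names = {"robots", crawler.lower()}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.meta_names:
            # content may hold one directive or several separated by commas
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html: str, crawler: str = "msnbot") -> set:
    """Return the set of meta directives that apply to the given crawler."""
    parser = RobotsMetaParser(crawler)
    parser.feed(html)
    return parser.directives


# Hypothetical pages: one restricts every crawler, one restricts MSNBot only.
PAGE_ALL = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
PAGE_MSN = '<html><head><meta name="msnbot" content="noindex"></head></html>'

print(sorted(robots_directives(PAGE_ALL)))
print(sorted(robots_directives(PAGE_MSN)))
print(sorted(robots_directives(PAGE_MSN, crawler="psbot")))
```

The first page combines both directives in a single meta tag. The second names msnbot, so other crawlers see no restriction on it.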
Limit crawl frequency

If you occasionally get high traffic from MSNBot, you can specify a crawl delay parameter in the robots.txt file to set the minimum interval, in seconds, between MSNBot's requests to your website.

    User-agent: msnbot
    Crawl-delay: 120

If you still find that MSNBot is placing too high a load on your web server, contact MSN Search Site Owner Support. When you contact us about an issue, include the details of the problem so that we can help you more quickly.
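As a rough check of the Crawl-delay setting described in this section, the sketch below reads the value back with Python's standard urllib.robotparser module (crawl_delay is available from Python 3.6) and works out the request ceiling it implies. The arithmetic assumes the crawler waits the full delay between consecutive requests, which is an interpretation rather than something MSN states here.

```python
from urllib.robotparser import RobotFileParser

# The crawl-delay rule from this section, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 120
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds for the agent, or None if none is set.
delay = parser.crawl_delay("msnbot")
other_delay = parser.crawl_delay("psbot")

# Assumption: one request per delay interval, so a day allows at most 86400 / delay requests.
SECONDS_PER_DAY = 24 * 60 * 60
max_requests_per_day = SECONDS_PER_DAY // delay

print("msnbot delay:", delay, "seconds")
print("psbot delay:", other_delay)
print("msnbot requests per day, at most:", max_requests_per_day)
```

With a delay of 120 seconds this works out to at most 720 requests per day, which makes it easier to judge whether a value is strict enough for your server.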