Question 1

Where should robots.txt be located?

Accepted Answer

At the root of your host: https://yoursite.com/robots.txt, not in a subdirectory. The filename must be all lowercase: URL paths are case-sensitive and crawlers request /robots.txt, so a file that is only reachable as /Robots.txt will generally not be found. Each subdomain needs its own file (https://blog.yoursite.com/robots.txt is separate from the main site's).
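As a minimal sketch of the "root of the host" rule, the helper below derives the robots.txt URL that applies to any page URL. The function name robots_txt_url is made up for illustration; it only uses Python's standard urllib.parse.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper: keeps the scheme and host (including any
    subdomain) and replaces the path with /robots.txt.
    """
    parts = urlsplit(page_url)
    # Drop the original path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("https://blog.yoursite.com/posts/hello?x=1"))
    print(robots_txt_url("https://yoursite.com/shop/item/42"))
```

Because the host is preserved, a page on blog.yoursite.com maps to the blog's own robots.txt rather than the main site's.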

Question 2

Does blocking in robots.txt prevent indexing?

Accepted Answer

Not reliably. robots.txt prevents crawling (Google won't read the page content), not indexing. If Google finds the URL linked from somewhere else, it may still appear in search results with limited info, such as a bare URL without a description. To keep a page out of the index, use a 'noindex' meta tag in the HTML, and leave the page crawlable so the crawler can actually see the tag.
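A noindex tag can only be honored if the crawler is allowed to fetch the page that carries it. The sketch below checks both conditions with Python's standard library. The names RobotsMetaFinder and noindex_is_effective, the example.com URL, and the /private path are all hypothetical, and real search engines apply more rules than this simplified check.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether the HTML contains <meta name="robots" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def noindex_is_effective(robots_txt: str, html: str, url: str,
                         agent: str = "Googlebot") -> bool:
    """True only if the page has a noindex tag AND the crawler may fetch it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex and rules.can_fetch(agent, url)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    url = "https://example.com/private"
    print(noindex_is_effective("User-agent: *\nAllow: /\n", page, url))
    print(noindex_is_effective("User-agent: *\nDisallow: /private\n", page, url))
```

In the second call the page is disallowed, so the crawler never fetches it and the noindex tag has no effect.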

Question 3

Should I block AI scrapers like GPTBot?

Accepted Answer

Personal choice. Block if: you sell content/courses/articles for a living, your content is your unique competitive advantage, or you simply don't want AI to train on it. Don't block if: you want maximum exposure including AI citation, you publish public knowledge that benefits from broad access, or you want to be referenced by ChatGPT/Claude/Gemini.

Question 4

What's the difference between disallow and noindex?

Accepted Answer

Disallow (in robots.txt): tells the crawler not to fetch the page. The URL may still appear in search results based on external link info. Noindex (HTML meta tag): tells the crawler it may fetch the page but must not index it, so the page is dropped from results. Do not combine the two on the same URL: a disallowed page is never fetched, so its noindex tag is never seen. For content that must stay private, use authentication.

Question 5

How do I block all bots from my site?

Accepted Answer

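Put this in robots.txt:

User-agent: *
Disallow: /

This tells all crawlers not to access any URL. Compliance is voluntary: well-behaved crawlers obey it, but it is not access control. Good for staging/development sites. Never use it on production: search engines will stop crawling your site and your search traffic will disappear.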
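One way to sanity-check a block-all file before deploying it is Python's standard urllib.robotparser. This is a sketch only: the helper name is_allowed and the staging.example.com host are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The block-all rules from the answer above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("Googlebot", "GPTBot"):
        print(bot, is_allowed(BLOCK_ALL, bot, "https://staging.example.com/"))
```

Every user agent should come back as not allowed, for any path on the host.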

Question 6

Can I block specific bots while allowing others?

Accepted Answer

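Yes, use separate User-agent groups. For example, to block GPTBot (OpenAI's crawler) but allow everything else, including Google:

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /

A crawler follows the most specific group that matches its name, so GPTBot obeys its own group and ignores the general one.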
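The per-bot behavior can be checked with Python's standard urllib.robotparser. This is a minimal sketch; the constant name PER_BOT_RULES and the yoursite.com URL are placeholders, and Python's parser matches user agents by substring, which may differ in detail from how a given crawler matches its own name.

```python
from urllib.robotparser import RobotFileParser

# One group for GPTBot, one catch-all group for everyone else.
PER_BOT_RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(PER_BOT_RULES.splitlines())

# GPTBot matches its own group, so it is blocked.
gptbot_allowed = parser.can_fetch("GPTBot", "https://yoursite.com/articles/1")
# Googlebot has no dedicated group, so it falls back to the * group.
googlebot_allowed = parser.can_fetch("Googlebot", "https://yoursite.com/articles/1")

if __name__ == "__main__":
    print("GPTBot allowed:", gptbot_allowed)
    print("Googlebot allowed:", googlebot_allowed)
```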

Question 7

Does robots.txt help SEO?

Accepted Answer

Indirectly. By blocking duplicate URLs (filtered listings, internal search results), tracking-parameter URLs, and low-value pages, you concentrate crawl budget on important pages, which helps them get crawled and refreshed sooner. robots.txt is not a direct ranking factor; it is about crawl efficiency.
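As a rough sketch of the idea, the snippet below generates a robots.txt that keeps crawlers out of low-value sections and then checks it with Python's standard urllib.robotparser. The function build_robots_txt, the /search and /cart prefixes, and the example.com URLs are all hypothetical. Python's parser does plain prefix matching, so wildcard patterns for tracking parameters are left out here.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_prefixes, user_agent="*"):
    """Build a single-group robots.txt that disallows the given path prefixes."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {prefix}" for prefix in disallowed_prefixes]
    return "\n".join(lines) + "\n"


# Hypothetical low-value sections: internal search results and the cart.
robots_txt = build_robots_txt(["/search", "/cart"])

crawl_rules = RobotFileParser()
crawl_rules.parse(robots_txt.splitlines())

if __name__ == "__main__":
    print(robots_txt)
    for path in ("/search/shoes", "/cart", "/products/shoes"):
        url = "https://example.com" + path
        print(path, crawl_rules.can_fetch("Googlebot", url))
```

Only the listed prefixes are blocked; product pages and everything else stay crawlable.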

Robots.txt Generator
