Where do I put the robots.txt file?

In your website's root directory. It must be accessible at yoursite.com/robots.txt, not in a subfolder.

Will Google obey my robots.txt?

Major search engines (Google, Bing, DuckDuckGo) respect robots.txt. Some scraper bots ignore it. For sensitive content, use authentication, not just robots.txt.

Should I block AI crawlers like GPTBot?

Your call. Blocking them asks AI crawlers that honor robots.txt not to collect your content for model training, and it does not affect your Google search ranking. Many publishers block them; others allow them for visibility.

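What does Crawl-delay do?

Crawl-delay tells bots to wait N seconds between requests, which is useful if your server is overloaded. Google ignores this directive and sets its own crawl rate; the Search Console crawl-rate limiter that used to offer a manual override was retired in early 2024.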
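To make the directive concrete, here is a minimal sketch of how a crawler that honors Crawl-delay might read it, using Python's standard urllib.robotparser module. The robots.txt content and the bot name ExampleBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds that applies to the given
# user agent, or None when no Crawl-delay rule covers it.
delay = parser.crawl_delay("ExampleBot")  # "ExampleBot" is a made-up name
print(delay)  # 10
```

A polite crawler would sleep for that many seconds between requests; nothing forces a bot to do so.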

Robots.txt Generator

From ToolzPedia, the free tools encyclopedia

This is one of several SEO tools. For the full list of utilities, see All tools.

Category	SEO Tools
Type	Web utility
Format	URL or text input
Privacy	No personal data stored
License	Free of charge
Sign-up	Not required
Status	● Live

A robots.txt file lives at the root of your website and tells search-engine crawlers which pages they can and cannot fetch. It is a critical SEO file: a misconfigured robots.txt can accidentally hide your entire site from Google or, conversely, expose admin sections that should not be indexed.

The ToolzPedia Robots.txt Generator produces a properly formatted robots.txt file from a checklist: choose which crawlers you want to allow or block, which paths to disallow (admin areas, search results, duplicate-content sections), the location of your sitemap, and any crawl-delay rules. The output is the standard plain-text format, ready to upload to your site's root.

Use the tool

How to use Robots.txt Generator

Follow these steps to use the tool:

1. Pick your default rule: allow all crawlers (most sites) or block all crawlers (staging sites).
2. Add specific bot rules if needed: block aggressive scrapers, allow specific search engines.
3. List paths to disallow. Common choices: /admin/, /wp-admin/, /search/, /tag/, /?utm=*
4. Add your sitemap URL as a full URL with protocol: https://yoursite.com/sitemap.xml
5. Generate and download: save the output as <code>robots.txt</code> and upload it to your site's root.
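The steps above can be sketched in a few lines of Python. This is only an illustration of the kind of output such a checklist produces, not the tool's actual implementation; the function name build_robots_txt and all of its parameters are hypothetical.

```python
def build_robots_txt(default_allow=True, bot_rules=None,
                     disallow_paths=(), sitemap_url=None):
    """Compose robots.txt text from checklist-style options.

    Every name here is hypothetical and mirrors the steps above.
    """
    lines = ["User-agent: *"]
    if default_allow:
        # An empty "Disallow:" allows everything; otherwise list excluded paths.
        rules = [f"Disallow: {path}" for path in disallow_paths]
        lines += rules if rules else ["Disallow:"]
    else:
        lines.append("Disallow: /")

    # Per-bot groups: True allows the bot everywhere, False blocks it everywhere.
    # A bot with its own group no longer follows the "*" group.
    for bot, allowed in (bot_rules or {}).items():
        lines += ["", f"User-agent: {bot}",
                  "Disallow:" if allowed else "Disallow: /"]

    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


robots = build_robots_txt(
    disallow_paths=["/admin/", "/search/"],
    bot_rules={"GPTBot": False},
    sitemap_url="https://yoursite.com/sitemap.xml",
)
print(robots)
```

Writing the returned string to a file named robots.txt and uploading it to the site root completes the last step.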

Details

⚠️ Important

robots.txt is a suggestion, not a security boundary. To truly hide pages, use authentication or noindex meta tags. A page blocked here can still appear in search results if other pages link to it.

Frequently asked questions

What does robots.txt do?

It tells search-engine crawlers which paths on your site they may or may not fetch.

Where does it live?

At the exact root of your domain: https://yoursite.com/robots.txt. Subdirectory paths do not work.
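As a small illustration of the "exact root" rule, the sketch below derives the robots.txt location for any page URL with Python's standard urllib.parse module. The helper name robots_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that applies to the given page URL."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); replace the path
    # with /robots.txt and drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yoursite.com/blog/post?id=7"))
# https://yoursite.com/robots.txt
```

Because the host is part of the result, a subdomain such as shop.yoursite.com needs its own robots.txt file.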

Will it remove already-indexed pages?

No. To remove indexed pages, use the noindex meta tag on the page itself, leave the page crawlable so the tag can actually be read, then ask Google to recrawl.

Is robots.txt secure?

No, it is advisory. Bad bots ignore it. For real security, use authentication.

Can I block specific bots?

Yes, name them in the User-agent directive. Common ones: GPTBot, ClaudeBot, CCBot.
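A per-bot rule can be checked locally before uploading. The sketch below uses Python's standard urllib.robotparser module; the robots.txt content is an assumed example, not output from the generator.

```python
from urllib.robotparser import RobotFileParser

# Assumed example: block GPTBot everywhere, keep everyone else out of /admin/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("GPTBot", "https://yoursite.com/blog/"))      # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/blog/"))   # True
print(parser.can_fetch("Googlebot", "https://yoursite.com/admin/"))  # False
```

A bot that has its own User-agent group follows only that group, so GPTBot never consults the rules listed under *.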

Use cases

Setting up a new website

Generate a sensible default robots.txt with sitemap location and standard exclusions.

Blocking specific bots

Excluding aggressive scrapers or AI training bots that hammer your site.

Hiding admin and staging areas

Keeping search-engine crawlers out of /admin, /wp-admin, /staging, etc. Pair this with authentication for anything that must stay private.

Pointing to your sitemap

The Sitemap directive in robots.txt is one of the most reliable ways to ensure crawlers find your sitemap.xml.

Crawl-budget management for large sites

Disallowing low-value URL parameters (sort, filter combinations) so Google focuses crawl budget on real content.

How it works

A robots.txt file is plain text with a specific syntax: User-agent directives target specific crawlers (or all crawlers with *), and Allow / Disallow directives specify which paths each crawler can or cannot fetch. The Sitemap directive is global and points to your XML sitemap.
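The syntax described above can be exercised with Python's standard urllib.robotparser module. The file content below is an assumed example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed example showing one group with Allow/Disallow plus a global Sitemap.
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The Allow rule carves an exception out of the broader Disallow rule.
print(parser.can_fetch("Googlebot", "https://yoursite.com/admin/help/faq"))  # True
print(parser.can_fetch("Googlebot", "https://yoursite.com/admin/users"))     # False

# Sitemap lines sit outside any User-agent group.
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```

One caveat on the design: urllib.robotparser applies the first rule that matches, while Google applies the most specific (longest) match. The Allow line is listed first here so that both interpretations give the same answer.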

The generator presents the syntax as a UI: pick the crawlers (Googlebot, Bingbot, etc.), check which paths to disallow, enter your sitemap URL, and the tool composes a syntactically correct robots.txt for you to paste into a file at /robots.txt.

Tips and best practices

robots.txt must live at the exact root: <code>https://yoursite.com/robots.txt</code>. Subdirectories do not work.
After uploading, check your robots.txt in Google Search Console's robots.txt report (which replaced the older robots.txt tester) to verify it was fetched and parses correctly.
Disallowing a URL in robots.txt does not remove it from the index if it was already indexed. Use the <code>noindex</code> meta tag for that.
Be careful with wildcards. <code>Disallow: /*?</code> blocks every URL with a query parameter, which is sometimes desirable and sometimes catastrophic.
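To see why a wildcard rule reaches so far, the sketch below implements the common matching convention, where * matches any run of characters and a trailing $ anchors the end of the URL. The helper rule_matches is hypothetical and handles only these two special characters; Python's built-in urllib.robotparser does plain prefix matching and does not interpret wildcards.

```python
import re


def rule_matches(pattern, url_path):
    """Check a robots.txt path pattern against a URL path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so unanchored rules act as prefix matches.
    return re.match(regex, url_path) is not None


print(rule_matches("/*?", "/products?sort=price"))    # True
print(rule_matches("/*?", "/products"))               # False
print(rule_matches("/*.pdf$", "/files/report.pdf"))   # True
```

Running a list of your site's real URLs through a check like this before uploading shows exactly which pages a wildcard rule would block.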

Common mistakes

Accidentally blocking the entire site

<code>User-agent: *</code> followed by <code>Disallow: /</code> blocks everything. This is a common copy-paste mistake from staging sites.
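A quick pre-deployment check can catch this mistake. The sketch below uses Python's standard urllib.robotparser module; the helper name root_is_blocked and the two sample files are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def root_is_blocked(robots_txt, user_agent="*"):
    """Return True if the robots.txt text denies the site root to user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nDisallow: /admin/\n"

print(root_is_blocked(staging))     # True
print(root_is_blocked(production))  # False
```

Running a check like this against the production file as part of a deploy script is one way to stop a staging robots.txt from going live.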

Trusting robots.txt for security

robots.txt is advisory; bad actors ignore it. Use authentication for anything actually private.

Forgetting the sitemap directive

The Sitemap directive is one of the most reliable ways to ensure crawlers find your sitemap, so include it.

Other free SEO tools available on ToolzPedia:

Keyword Density Checker: Analyze keyword frequency and density in your content to optimize for SEO rankings.

Meta Tag Generator: Generate perfect SEO meta tags, Open Graph and Twitter Card tags for any page.

Sitemap Generator: Generate an XML sitemap for your website to help Google index all your pages.
