robots.txt Generator

Build a robots.txt file visually. Configure crawl rules for major search engines and AI bots.

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/

Sitemap: https://example.com/sitemap.xml

How to use this tool

  1. Tick 'Allow all standard crawlers' to add an explicit Allow: / line, or untick it to leave it out.

  2. Enter the paths you want crawlers to skip in the disallowed box, one per line (for example /admin/ or /api/).

  3. Optionally tick 'Block AI training crawlers' to add Disallow rules for GPTBot, ClaudeBot, PerplexityBot, CCBot and Google-Extended, and paste your sitemap URL.

  4. Select the generated text in the preview box, copy it, and save it as a file named robots.txt at your domain root.
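
The steps above can be sketched as a small Python function. This is only an illustration of the options described, not the tool's actual implementation: the function name `build_robots_txt`, its parameters, and the choice to emit one `User-agent` group per AI crawler with `Disallow: /` are all assumptions.

```python
# Hypothetical sketch of the generator's options; not the tool's real code.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended"]


def build_robots_txt(allow_all=True, disallowed=(), block_ai=False, sitemap_url=None):
    """Return robots.txt text for the given options."""
    lines = ["User-agent: *"]
    if allow_all:
        # Step 1: explicit Allow line for all crawlers.
        lines.append("Allow: /")
    for path in disallowed:
        # Step 2: one Disallow line per non-empty path.
        path = path.strip()
        if path:
            lines.append(f"Disallow: {path}")
    if block_ai:
        # Step 3 (assumed layout): a separate group per named AI crawler.
        for bot in AI_CRAWLERS:
            lines += ["", f"User-agent: {bot}", "Disallow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_robots_txt(
        disallowed=["/admin/", "/api/"],
        sitemap_url="https://example.com/sitemap.xml",
    )
    # Step 4: save the result as robots.txt and upload it to the domain root.
    with open("robots.txt", "w", encoding="utf-8") as fh:
        fh.write(text)
```

Called with the paths /admin/ and /api/ and the example sitemap URL, the function returns the same text as the preview shown at the top of the page.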

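What is a robots.txt generator?

robots.txt tells web crawlers which pages they may access and which they may not. It is a voluntary protocol: well-behaved crawlers honor it, while malicious bots ignore it. This generator creates standards-compliant robots.txt files. For complex rules, consult robotstxt.org and the full specification.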
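
To make the voluntary nature concrete, here is a minimal sketch of the check a well-behaved crawler might run before fetching a page, using Python's standard `urllib.robotparser`. The helper `is_allowed`, the `RULES` sample, and the `ExampleBot` user agent are illustrative assumptions. The sample deliberately omits `Allow: /`: CPython's parser applies the first matching rule in file order rather than the longest match, so a leading `Allow: /` would mask the Disallow lines in this particular library.

```python
from urllib.robotparser import RobotFileParser

# Sample rules (assumed for illustration).
RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/", "https://example.com/admin/users"):
        print(url, is_allowed(RULES, "ExampleBot", url))
```

Nothing enforces the result: a crawler that never performs this check can still request /admin/, which is why robots.txt is not a substitute for access control.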

Common use cases

  • Keeping crawlers out of admin, API, or internal paths like /admin/ and /api/ while leaving the rest of the site open.

  • Generating a starter robots.txt for a brand-new site that does not have one yet.

  • Asking the five named AI training crawlers (GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended) to stay out, so that compliant crawlers do not collect your content for model training.

  • Adding or correcting the Sitemap: line so search engines can discover your sitemap.xml.

  • Drafting rules quickly to paste into a CMS or static-site config, then hand-editing for anything advanced.

  • Teaching teammates what a minimal, readable robots.txt looks like before they edit the real one.

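Frequently asked questions

Where should I upload robots.txt?
Upload it to the root of your site: example.com/robots.txt. It must sit at the root; a copy placed in a subdirectory has no effect.
Should I block AI bots?
That is up to you. Allowing them means ChatGPT/Perplexity can cite your content. Blocking them keeps your content from being crawled for training data, but reduces your visibility.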
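
The root-directory requirement from the first answer can be illustrated with a short sketch: whatever page a crawler is about to visit, it looks for the rules at `/robots.txt` on the same scheme and host. The helper name `robots_url` is an assumption made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs page_url (same scheme and host)."""
    parts = urlsplit(page_url)
    # Path is always /robots.txt; query and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post?id=1"))
```

Because the path is fixed, a file saved at example.com/blog/robots.txt is never consulted.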
