To disallow (or allow) ChatGPT from crawling your website, you can add directives in your `robots.txt` file for the following user-agents:

- `User-agent: GPTBot`, for OpenAI's web crawler bot;
- `User-agent: ChatGPT-User`, for the ChatGPT plugins bot.
For instance, if you wish to prevent both OpenAI's web crawler and the ChatGPT plugins bot from accessing your website entirely, you can add these directives to your `robots.txt` file:
```
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /
```
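As a side note, the robots.txt format also allows several `User-agent` lines to share a single group of rules, so the block above could be written more compactly. A minimal, equivalent sketch:

```
# Both user-agents match the same group, so both are blocked site-wide.
User-agent: GPTBot
User-agent: ChatGPT-User
Disallow: /
```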
This instructs both bots not to crawl any part of your website. Alternatively, you can restrict access to specific portions of your website using directives like these:
```
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
# ...
```
It's important to note that in the absence of a `Disallow` directive, a user-agent is generally assumed to be allowed to access a section of the website. The purpose of the `Allow` directive is to explicitly state that a specific directory is accessible to a user-agent, particularly when there are conflicting rules or a need for clear access permissions.
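For instance, in the hypothetical layout below (the directory names are placeholders), `Disallow` blocks GPTBot from the whole `/private/` section, while the more specific `Allow` rule carves out a single subdirectory:

```
User-agent: GPTBot
# Block the entire /private/ section for GPTBot...
Disallow: /private/
# ...but explicitly permit this one subdirectory within it.
Allow: /private/press/
```

Crawlers that follow the current robots.txt standard (RFC 9309) resolve such conflicts by applying the most specific (longest) matching rule, so pages under `/private/press/` remain crawlable while the rest of `/private/` stays off-limits.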