
How to create a robots.txt file in Payload with Next.js

Community Guide
Author: Nick Vogel

In this guide, we'll walk through how to create both a basic and an advanced robots.txt file in your Payload app using Next.js.

A robots.txt file helps control how search engines crawl and index your website. While not mandatory, it's an essential part of technical SEO that tells well-behaved crawlers how to respect your site's structure.

In this guide, we’ll walk through:

  • Creating a basic robots.txt file in a Next.js app with Payload CMS.
  • Defining rules for different web crawlers.
  • Expanding the rules for advanced use cases.

1. Creating a basic robots.txt file

The simplest way to add a robots.txt file is to generate it from your app directory. In Next.js, this is done programmatically with a robots.ts file, which gives you dynamic control over its contents.

Step 1: Create a robots.ts

Inside your app folder, create a new robots.ts file.
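Where exactly the app folder sits depends on your project. In the Payload website template it typically lives at src/app, with separate route groups for the site and the admin panel. A rough sketch of the layout (the route-group names are how the template is commonly organized, adjust to your own structure):

src/
  app/
    (frontend)/   ← your site's routes
    (payload)/    ← Payload admin routes
    robots.ts     ← the file created in this step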

Step 2: Define metadata for robots.txt

Inside robots.ts, import MetadataRoute from Next.js:

import type { MetadataRoute } from 'next'

Now, export a default function called robots and set its return type:

import type { MetadataRoute } from 'next'
import { getServerSideURL } from '@/utilities/getURL'

export default function robots(): MetadataRoute.Robots {
  const url: string = getServerSideURL()

  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/admin',
    },
    sitemap: `${url}/sitemap.xml`,
  }
}

What this does:

  • userAgent: '*' → Applies the rules to all crawlers.
  • allow: '/' → Allows crawling of every page.
  • disallow: '/admin' → Blocks crawlers from the Payload admin panel.
  • sitemap: `${url}/sitemap.xml` → Tells crawlers where to find your sitemap (e.g. https://yourwebsite.com/sitemap.xml).

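Note that getServerSideURL is a helper from the Payload website template's utilities folder rather than a Next.js API. If your project doesn't already include it, a minimal sketch could look like this (the src/utilities/getURL.ts path and the NEXT_PUBLIC_SERVER_URL variable are assumptions, not requirements):

// src/utilities/getURL.ts — minimal sketch
// Resolve the site's absolute base URL from an environment variable,
// falling back to localhost during local development.
export const getServerSideURL = (): string => {
  return process.env.NEXT_PUBLIC_SERVER_URL || 'http://localhost:3000'
}

Any function that returns your site's absolute base URL will do; the important part is that the sitemap entry ends up as a full URL rather than a relative path.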
Step 3: Test the output

Run your Next.js app and visit: http://localhost:3000/robots.txt

You should see the following:

User-agent: *
Allow: /
Disallow: /admin
Sitemap: https://yourwebsite.com/sitemap.xml

Your robots.txt file is now live! Next, let’s look at advanced rules.

2. Creating an advanced robots.txt file

If you want to be more granular about the permissions you give to individual web crawlers, you can modify the rules section to target different search engines.

Step 1: Update robots.ts for advanced rules

Instead of using a single rule, update the function to return an array of rules:

import type { MetadataRoute } from 'next'
import { getServerSideURL } from '@/utilities/getURL'

export default function robots(): MetadataRoute.Robots {
  const url: string = getServerSideURL()

  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: '/',
        disallow: '/admin',
      },
      {
        userAgent: ['Bingbot', 'SemrushBot'],
        disallow: '/',
      },
    ],
    sitemap: `${url}/sitemap.xml`,
  }
}

What this does:

  • Allows Googlebot to index all pages except /admin.
  • Blocks Bingbot and SemrushBot from crawling the site entirely.

Step 2: Test the output

Visit: http://localhost:3000/robots.txt

You should see the following:

User-agent: Googlebot
Allow: /
Disallow: /admin

User-agent: Bingbot
User-agent: SemrushBot
Disallow: /

Sitemap: https://yourwebsite.com/sitemap.xml

Your robots.txt file now handles multiple crawlers dynamically!
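If you need even finer control, the MetadataRoute.Robots type also accepts arrays of paths and an optional crawlDelay per rule. The sketch below is an illustration of those options rather than part of the example above, and the /preview path is hypothetical:

import type { MetadataRoute } from 'next'
import { getServerSideURL } from '@/utilities/getURL'

export default function robots(): MetadataRoute.Robots {
  const url: string = getServerSideURL()

  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: '/',
        // An array blocks several paths in one rule ('/preview' is a hypothetical path)
        disallow: ['/admin', '/preview'],
      },
      {
        // Slow a crawler down instead of blocking it outright
        userAgent: 'Bingbot',
        disallow: '/admin',
        crawlDelay: 10,
      },
    ],
    sitemap: `${url}/sitemap.xml`,
  }
}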

Final thoughts

A robots.txt file is a simple but powerful tool for SEO. Whether you're allowing, blocking, or customizing rules for different search engines, Next.js makes it easy to manage dynamically.