How to make large dynamic sitemaps with Next.js and next-sitemap
At After, a platform for creating beautiful online memorials, each memorial has its own dynamic URL. Every hour, dozens of new memorials are created and need to be indexed so that family members and friends can easily find them, so we need to keep our sitemaps up to date.
So how do you dynamically generate sitemaps for thousands of URLs of user-generated content?
An introduction to sitemaps
A sitemap is an XML file containing the list of indexable URLs of a domain.
When sitemaps become large, they are split into a sitemap index file that points to multiple sitemap files. Learn more about splitting sitemaps with Google’s documentation.
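To make the split concrete, here is a minimal sketch of chunking a URL list and rendering an index file by hand. The helper names are hypothetical, and the 50,000 figure is Google's documented per-file URL limit. next-sitemap does this for you, so treat it as an illustration of the file layout rather than something you need to write.

```typescript
// Google's documented limit per sitemap file: 50,000 URLs (or 50 MB uncompressed).
const MAX_URLS_PER_SITEMAP = 50_000

// Escape the characters that are special in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Split a flat URL list into chunks that each fit in one sitemap file.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SITEMAP): string[][] {
  if (size < 1) throw new Error('size must be at least 1')
  const chunks: string[][] = []
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size))
  }
  return chunks
}

// Render the index file that points to each child sitemap.
function renderSitemapIndex(sitemapUrls: string[]): string {
  const entries = sitemapUrls
    .map((url) => `  <sitemap><loc>${escapeXml(url)}</loc></sitemap>`)
    .join('\n')
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</sitemapindex>',
  ].join('\n')
}
```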
next-sitemap is a library that conveniently generates the sitemap XML document after reading the Next.js build manifests or when given a list of URLs.
Check out some real examples, like the Google sitemap index or the sitemap of this website.
Static sitemaps generated at build time with next-sitemap
Static routes generated at build time are automatically picked up by next-sitemap. That is the case for both static pages and paths generated by getStaticPaths. It works out of the box!
You may add other options, like paths to exclude, additionalPaths, or generateRobotsTxt.
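As a rough sketch, a configuration using those options could look like the following. The domain, the excluded pattern, and the extra path are placeholders, and the interfaces are declared inline only to keep the example self-contained. In a real project the object lives in next-sitemap.config.js and is exported with module.exports.

```typescript
// Inline types covering only the options used in this sketch.
interface SitemapField {
  loc: string
  lastmod?: string
}

interface SitemapConfigSketch {
  siteUrl: string
  generateRobotsTxt?: boolean
  exclude?: string[]
  additionalPaths?: (config: SitemapConfigSketch) => Promise<SitemapField[]>
}

const sitemapConfig: SitemapConfigSketch = {
  siteUrl: 'https://example.com', // placeholder domain
  generateRobotsTxt: true, // also write a robots.txt that points to the sitemap
  exclude: ['/admin/*'], // placeholder: paths to leave out of the sitemap
  // Extra entries that are not discovered from the build output.
  additionalPaths: async () => [{ loc: '/extra-page' }],
}

// In next-sitemap.config.js: module.exports = sitemapConfig
```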
Then, automate sitemap generation so that it runs after every build. To do that, add next-sitemap as the postbuild script in package.json ("postbuild": "next-sitemap"):
Adding additional info to a statically generated sitemap
If you need to add extra information to a sitemap, such as last modification dates, you may need to call an API endpoint from the sitemap generation config file.
That is the case for the sitemap of this website. I wanted each item to contain lastmod, the date of the last modification, so that Google can recrawl the post pages when they are updated.
Check out the configuration here:
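The idea can be sketched as follows (this is not the exact configuration of this website). It assumes a hypothetical API that returns posts with a slug and an updatedAt timestamp, and a hypothetical /blog/ path prefix. The fetcher is passed in as a parameter so the mapping logic stays easy to test.

```typescript
// Hypothetical shape of a post returned by your API.
interface Post {
  slug: string
  updatedAt: string
}

// Subset of the sitemap entry fields used here.
interface SitemapField {
  loc: string
  lastmod?: string
}

// Map posts to sitemap entries carrying their last modification date.
function postsToSitemapFields(posts: Post[]): SitemapField[] {
  return posts.map((post) => ({
    loc: `/blog/${post.slug}`, // hypothetical URL scheme
    lastmod: new Date(post.updatedAt).toISOString(),
  }))
}

// Could back the additionalPaths option of the next-sitemap config.
// fetchPosts is a hypothetical function that calls your API endpoint.
async function buildAdditionalPaths(fetchPosts: () => Promise<Post[]>): Promise<SitemapField[]> {
  const posts = await fetchPosts()
  return postsToSitemapFields(posts)
}
```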
How to build large dynamic sitemaps at runtime
Let’s say you have user-generated URLs. You could pull all URLs at build time in the next-sitemap config file, but then your sitemaps would only be updated on each deploy. So let’s switch approaches and generate them on demand.
You’ll need new routes to render the sitemap index and each of the sitemap pages. Sitemaps should be at the root level, with clean URLs like /dynamic-sitemap.xml and /dynamic-sitemap-0.xml, /dynamic-sitemap-1.xml, etc. Since Next.js doesn’t let us use dynamic page names like dynamic-sitemap-[page].ts, we can leverage rewrites.
Create the following pages:
Then, add the rewrites in the Next.js config:
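A possible shape for those rewrites, assuming the pages live at pages/dynamic-sitemap/index.ts and pages/dynamic-sitemap/[page].ts (these file names are an assumption). The Rewrite interface is declared inline to keep the sketch self-contained. In next.config.js the object would be exported with module.exports.

```typescript
// Shape of a single rewrite rule.
interface Rewrite {
  source: string
  destination: string
}

const nextConfig = {
  async rewrites(): Promise<Rewrite[]> {
    return [
      // Sitemap index: /dynamic-sitemap.xml is served by pages/dynamic-sitemap/index.ts
      { source: '/dynamic-sitemap.xml', destination: '/dynamic-sitemap' },
      // Sitemap pages: /dynamic-sitemap-0.xml is served by pages/dynamic-sitemap/[page].ts
      { source: '/dynamic-sitemap-:page.xml', destination: '/dynamic-sitemap/:page' },
    ]
  },
}

// In next.config.js: module.exports = nextConfig
```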
next-sitemap provides two APIs to generate server side sitemaps:
getServerSideSitemapIndex to generate the sitemap index file.
getServerSideSitemap to generate a single sitemap file.
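For the index file, we just need to work out how many sitemap pages will exist and pass their URLs to getServerSideSitemapIndexLegacy. In next-sitemap v4 and later, the functions meant for the pages directory carry the Legacy suffix, while the unsuffixed ones target app directory route handlers.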
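Here is a minimal sketch of the URL computation for the index. The domain, the page size of 5,000, and the fetchTotalCount helper mentioned in the comments are all hypothetical. The wiring into getServerSideProps is shown as comments because it requires next and next-sitemap to be installed.

```typescript
// Hypothetical values: adjust to your own domain and page size.
const SITE_URL = 'https://example.com'
const ITEMS_PER_SITEMAP = 5000

// Number of sitemap pages needed to hold `totalItems` URLs.
function countSitemapPages(totalItems: number, pageSize: number = ITEMS_PER_SITEMAP): number {
  return Math.ceil(totalItems / pageSize)
}

// Absolute URLs of every sitemap page, following the /dynamic-sitemap-N.xml scheme.
function sitemapPageUrls(totalItems: number, siteUrl: string = SITE_URL): string[] {
  const pages = countSitemapPages(totalItems)
  return Array.from({ length: pages }, (_, page) => `${siteUrl}/dynamic-sitemap-${page}.xml`)
}

// Wiring in the index page (pages directory):
//
//   import { getServerSideSitemapIndexLegacy } from 'next-sitemap'
//
//   export const getServerSideProps = async (ctx) => {
//     const total = await fetchTotalCount() // hypothetical API or DB call
//     return getServerSideSitemapIndexLegacy(ctx, sitemapPageUrls(total))
//   }
//
//   export default function SitemapIndex() {}
```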
For the individual sitemaps, we need to fetch the items of the corresponding page and pass their URLs to getServerSideSitemapLegacy.
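A sketch of the per-page logic, under the same caveats. The Memorial shape, the /memorial/ path prefix, the domain, and the fetchMemorialsPage helper in the comments are hypothetical. Validating the page parameter first avoids querying the database for malformed URLs.

```typescript
// Hypothetical record returned by your API or DB.
interface Memorial {
  slug: string
  updatedAt: string
}

// Subset of the sitemap entry fields used here.
interface SitemapField {
  loc: string
  lastmod?: string
}

const SITE_URL = 'https://example.com' // placeholder domain

// Parse the [page] route parameter; null for anything but a non-negative integer.
function parsePageParam(raw: string | string[] | undefined): number | null {
  if (typeof raw !== 'string' || !/^\d+$/.test(raw)) return null
  return Number(raw)
}

// Turn one page of records into sitemap entries with absolute URLs.
function toSitemapFields(items: Memorial[], siteUrl: string = SITE_URL): SitemapField[] {
  return items.map((item) => ({
    loc: `${siteUrl}/memorial/${encodeURIComponent(item.slug)}`,
    lastmod: new Date(item.updatedAt).toISOString(),
  }))
}

// Wiring in the [page] route (pages directory):
//
//   import { getServerSideSitemapLegacy } from 'next-sitemap'
//
//   export const getServerSideProps = async (ctx) => {
//     const page = parsePageParam(ctx.params?.page)
//     if (page === null) return { notFound: true }
//     const items = await fetchMemorialsPage(page) // hypothetical API or DB call
//     return getServerSideSitemapLegacy(ctx, toSitemapFields(items))
//   }
//
//   export default function SitemapPage() {}
```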
Caching the dynamic sitemaps
Since the sitemaps are hitting our API or DB to load a lot of items, we don’t want to execute those queries too often.
Next.js lets you cache the result of server-side functions, including getServerSideProps, by setting the Cache-Control header on the response. This works automatically when deployed to Vercel. Elsewhere, you’ll need to set up caching yourself, with Redis or similar.
Learn more about Vercel caching here. Note that the response size can’t exceed 10 MB!
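As an illustration, the header could be set with a small helper like the one below. The one-hour and one-day durations are arbitrary examples, and the HeaderWritable interface is a minimal stand-in for the response object that getServerSideProps receives as ctx.res.

```typescript
// Minimal shape of the response object used here (ctx.res in getServerSideProps).
interface HeaderWritable {
  setHeader(name: string, value: string): unknown
}

// Build a Cache-Control value: shared caches keep the response for `maxAgeSeconds`
// and may serve a stale copy for `staleSeconds` while refreshing in the background.
function sitemapCacheControl(maxAgeSeconds: number, staleSeconds: number): string {
  return `public, s-maxage=${maxAgeSeconds}, stale-while-revalidate=${staleSeconds}`
}

// Apply the header to a response; the defaults are illustrative, not a recommendation.
function applySitemapCache(
  res: HeaderWritable,
  maxAgeSeconds: number = 3600,
  staleSeconds: number = 86400,
): void {
  res.setHeader('Cache-Control', sitemapCacheControl(maxAgeSeconds, staleSeconds))
}
```

In the sitemap routes, you would call applySitemapCache(ctx.res) before returning the result of the next-sitemap function.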
Real-world example
Check out the dynamic sitemap index at After, https://after.io/memorial-sitemap.xml, and one of the sitemap pages, https://after.io/memorial-sitemap-0.xml. These are generated on the fly and cached with the strategy explained above!