If you’ve ever wondered why some of your best pages aren’t showing up in Google, your SEO sitemaps might be the culprit. I’ve audited hundreds of Singapore websites over the years, and sitemap issues rank among the most common (and most overlooked) technical SEO problems I find. The fix is usually straightforward. But you need to know what to look for.
Think of your sitemap like the directory board in a shopping mall. Without it, visitors can still wander around and eventually find shops. But with a clear, accurate directory, they get where they need to go faster. Search engine crawlers work the same way.
This guide walks you through 15 sitemap best practices that go beyond the basics. Whether you’re running a 50-page corporate site or a 10,000-SKU e-commerce store, these are the specific steps that will help your pages get crawled, indexed, and ranked.
Why Your Sitemap Deserves More Attention Than You’re Giving It
Most site owners set up a sitemap once and forget about it. That’s a mistake. Your sitemap is a living document that communicates directly with search engines about what matters on your site, what’s changed, and what should be prioritised.
For smaller sites with 20 or 30 pages, Google’s crawlers can usually discover everything through internal links alone. But the moment your site grows beyond a few hundred pages, or if your internal linking isn’t airtight, a well-optimised sitemap becomes essential. I’ve seen cases where fixing sitemap issues alone led to a 32% increase in indexed pages within three weeks.
A properly configured sitemap does three things: it tells search engines which pages exist, signals which ones matter most, and indicates when content was last updated. Get all three right, and you’re giving crawlers exactly the information they need to do their job efficiently.
For Singapore businesses specifically, this matters because many local sites compete in both English and Chinese search results. If your sitemap doesn’t properly declare your multilingual pages, you could be losing visibility in an entire language segment of your market.
15 Sitemap Best Practices for Stronger Technical SEO
1. Only Include Canonical URLs
This is the single most common sitemap mistake I encounter during audits. Your sitemap should contain only canonical URLs. That means if you have multiple URLs that serve the same or similar content, only the version you’ve designated as the canonical should appear in your sitemap.
Here’s a real example. An e-commerce client selling electronics had product pages accessible through multiple URL paths:
- /products/wireless-earbuds-black
- /products/wireless-earbuds?colour=black
- /category/audio/wireless-earbuds-black
All three URLs served the same page. All three were in the sitemap. Google was crawling and trying to index all three, wasting crawl budget and creating duplicate content signals. After cleaning the sitemap to include only the canonical version, their indexed page count dropped by 40%, but their organic traffic increased by 18% because Google was now focusing on the right pages.
Action step: Export your sitemap URLs into a spreadsheet. Cross-reference them against your canonical tags. If any URL in your sitemap has a canonical tag pointing elsewhere, remove it from the sitemap immediately.
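The spreadsheet cross-check can also be scripted. Here is a minimal sketch, assuming you already have each page’s canonical tag in a dictionary (for example, exported from a crawler). The function name, the input shape, and the trailing-slash normalisation are illustrative assumptions, not a standard tool.

```python
def find_non_canonical(sitemap_urls, canonical_for):
    """Return (url, canonical) pairs where the canonical tag points elsewhere.

    sitemap_urls: iterable of URLs listed in the sitemap.
    canonical_for: dict mapping a URL to the href of its rel="canonical" tag
                   (hypothetical input, e.g. a crawler export).
    """
    mismatches = []
    for url in sitemap_urls:
        canonical = canonical_for.get(url)
        # A page with no canonical tag is not treated as a mismatch here.
        # Trailing slashes are ignored when comparing (an assumption).
        if canonical is not None and canonical.rstrip("/") != url.rstrip("/"):
            mismatches.append((url, canonical))
    return mismatches


sitemap_urls = [
    "https://www.example.com/products/wireless-earbuds-black",
    "https://www.example.com/products/wireless-earbuds?colour=black",
    "https://www.example.com/category/audio/wireless-earbuds-black",
]
# All three pages declare the first URL as their canonical.
canonical_for = {url: sitemap_urls[0] for url in sitemap_urls}
to_remove = find_non_canonical(sitemap_urls, canonical_for)
```

In this example `to_remove` holds the two duplicate paths, which are the entries to drop from the sitemap.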
2. Update Your Sitemap Every Time Your Site Changes
A stale sitemap is worse than no sitemap at all. When your sitemap lists pages that no longer exist, you’re sending crawlers on a wild goose chase. They hit 404 errors, waste their allocated crawl budget on dead pages, and may deprioritise crawling the rest of your site as a result.
If you’re using WordPress with a plugin like Yoast or Rank Math, your sitemap updates automatically when you publish or delete content. But if you’re running a custom-built site (common among Singapore’s larger enterprises), you’ll need a process to regenerate your sitemap whenever content changes.
For dynamic sites with frequent inventory changes, like property listings or job boards, I recommend regenerating the sitemap at least daily. For most business websites, weekly regeneration is sufficient. The key is that your sitemap should never reference a page that returns anything other than a 200 status code.
Action step: Set a monthly calendar reminder to manually review your sitemap, even if it auto-generates. Run every URL through a bulk HTTP status checker. Flag and remove anything that isn’t returning a clean 200 response.
3. Respect the 50,000 URL and 50MB Limits
Google enforces hard limits on sitemaps: a maximum of 50,000 URLs per file, and a maximum uncompressed file size of 50MB. Exceed either limit and Search Console flags the file with an error, so you can’t rely on every URL in it being processed.
For most Singapore SME websites, you’ll never hit these limits. But if you’re running a large e-commerce store, a classifieds platform, or a content-heavy publication, you’ll need to split your sitemap into multiple files and tie them together with a sitemap index file.
Here’s what a sitemap index file looks like in practice:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.example.com/sitemap-products.xml</loc>
<lastmod>2026-12-15</lastmod>
</sitemap>
<sitemap>
<loc>https://www.example.com/sitemap-blog.xml</loc>
<lastmod>2026-12-18</lastmod>
</sitemap>
<sitemap>
<loc>https://www.example.com/sitemap-categories.xml</loc>
<lastmod>2026-11-30</lastmod>
</sitemap>
</sitemapindex>
This structure also helps you organise your sitemap logically, which brings crawling efficiency benefits even if you’re well under the limits. Splitting by content type (products, blog posts, category pages) lets you monitor indexing rates for each section independently in Google Search Console.
Action step: Check your sitemap file size and URL count. If you’re above 10,000 URLs, consider splitting into multiple sitemaps organised by content type, even if you haven’t hit the limit yet. The organisational benefits are worth it.
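If you generate sitemaps yourself, the split can be sketched in a few lines of Python. This is an illustration only: it enforces the URL-count limit but not the 50MB size limit, and the `sitemap-products-N.xml` file names are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit described above


def chunk_urls(urls, size=MAX_URLS):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_index(sitemap_locs):
    """Return a sitemap index document (as a string) listing the child sitemaps."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locs:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


# 120,000 placeholder product URLs split into files of at most 50,000 each.
urls = [f"https://www.example.com/p/{i}" for i in range(120_000)]
chunks = list(chunk_urls(urls))
locs = [
    f"https://www.example.com/sitemap-products-{n}.xml"
    for n in range(1, len(chunks) + 1)
]
index_xml = build_index(locs)
```

Each chunk would then be written to its own file, with `index_xml` saved as the index you submit.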
4. Submit Your Sitemap Through Google Search Console
Yes, Google will eventually discover your sitemap through your robots.txt file or by crawling your site. But “eventually” isn’t a strategy. Submitting your sitemap directly through Google Search Console gives you two advantages: faster discovery and diagnostic data.
Once submitted, Search Console shows you exactly how many URLs were submitted versus how many were actually indexed. If you submitted 500 URLs but only 320 are indexed, that gap tells you something is wrong. Maybe some pages are returning errors. Maybe Google considers some of them low quality. Maybe canonical issues are at play.
This diagnostic data is gold. I’ve used it to identify crawl issues that would have taken weeks to surface otherwise. One Singapore F&B client had 60% of their menu pages unindexed because of a misconfigured server that was returning soft 404s. We only caught it because the Search Console sitemap report flagged the discrepancy.
Don’t forget Bing Webmaster Tools either. Bing still accounts for a meaningful share of search traffic in Singapore, particularly among corporate users and older demographics. Submit your sitemap there too.
Action step: Log into Google Search Console, navigate to Sitemaps, and submit your sitemap URL. Check back after 48 hours to review the coverage report. If your indexed count is significantly lower than your submitted count, investigate the excluded URLs.
5. Place Your Sitemap in the Root Directory and Declare It in robots.txt
Search engines look for sitemaps in predictable locations. The standard convention is https://www.yoursite.com/sitemap.xml. If your sitemap lives at some obscure path like /assets/xml/sitemap-main-v2.xml, crawlers may never find it on their own.
Your robots.txt file should also explicitly declare the sitemap location. This is a one-line addition at the bottom of the file:
Sitemap: https://www.yoursite.com/sitemap.xml
This line serves as a signpost. Every time a crawler reads your robots.txt (which happens frequently), it gets reminded where your sitemap lives. Simple, effective, and takes 10 seconds to implement.
One thing I see surprisingly often with Singapore sites: the www and non-www versions of the domain pointing to different sitemaps, or worse, one version returning a 404 for the sitemap. Make sure your sitemap is accessible at whichever domain version you’ve set as canonical.
Action step: Open your robots.txt file (visit yoursite.com/robots.txt in a browser). Confirm the Sitemap directive is present and the URL is correct. Then visit the sitemap URL directly to confirm it loads without errors.
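For a quick scripted version of this check, Python’s standard library can read the Sitemap directive for you. A minimal sketch, assuming you have already fetched the robots.txt body as text (the helper name is illustrative):

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt):
    """Return the list of Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present (Python 3.8+).
    return parser.site_maps() or []


robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.yoursite.com/sitemap.xml
"""
declared = sitemaps_declared(robots_txt)
```

An empty list means the directive is missing and should be added.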
6. Never Include Noindex Pages in Your Sitemap
This one sounds obvious, but it’s remarkably common. A noindex tag tells Google “don’t index this page.” Including that same page in your sitemap tells Google “please index this page.” These are contradictory signals, and sending them simultaneously wastes crawl budget while confusing search engines about your intentions.
The pages most commonly guilty of this conflict are: login pages, thank-you pages after form submissions, internal search result pages, staging or test pages that accidentally went live, and paginated archive pages.
I ran an audit for a Singapore professional services firm last year where 23% of their sitemap URLs carried noindex tags. Nearly a quarter of their sitemap was essentially telling Google to ignore itself. After cleaning this up, their crawl stats in Search Console improved noticeably within two weeks, with Google spending more time on the pages that actually mattered.
Action step: Use Screaming Frog or a similar crawler to scan your sitemap URLs. Filter for any pages that return a noindex directive in either the meta robots tag or the X-Robots-Tag HTTP header. Remove every single one from your sitemap.
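If you prefer to script the check, the same two signals can be detected with Python’s standard library. This is a sketch under assumptions: it expects the page HTML and response headers to be supplied by your own fetcher, and it only looks at the generic `robots` meta tag, not crawler-specific ones such as `googlebot`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.directives.append((attr.get("content") or "").lower())


def is_noindex(html, headers):
    """True if the page is noindexed via meta robots or the X-Robots-Tag header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            header_value = value.lower()
    return any("noindex" in d for d in parser.directives) or "noindex" in header_value


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
flagged = is_noindex(page, {})
```

Any sitemap URL for which `is_noindex` returns `True` is a candidate for removal.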
7. Create Dedicated Sitemaps for Images and Videos
If your site relies heavily on visual content, standard XML sitemaps aren’t enough. Google supports specialised sitemap extensions for images and videos that let you provide rich metadata about your media files.
For images, the tag that matters is the image URL itself. The older caption, title, geographic location, and license tags still appear in many sitemaps, but Google deprecated them in 2022 and no longer uses them. For videos, you can include the title, description, thumbnail URL, duration, view count, and even the date the video was published.
This is particularly valuable for Singapore businesses in visually driven industries: interior design, F&B, fashion, real estate, and tourism. A property agent whose listing photos appear in Google Image search results gets an additional traffic channel that most competitors ignore entirely.
Here’s a simplified example of an image sitemap entry:
<url>
<loc>https://www.example.com/condo-listing-123</loc>
<image:image>
<image:loc>https://www.example.com/images/condo-123-living-room.jpg</image:loc>
<image:caption>Spacious living room with Marina Bay view</image:caption>
</image:image>
</url>
Action step: If your site has more than 50 unique images or any video content, create dedicated media sitemaps. Most CMS plugins can generate these automatically. For custom sites, you’ll need to build the generation logic or use a tool like Screaming Frog to create them.
8. Always Include the <lastmod> Tag with Accurate Dates
The <lastmod> tag tells search engines when a page was last modified. When used accurately, it helps crawlers prioritise recently updated content over stale pages. This is especially useful for sites that publish frequently, like news outlets, blogs, or e-commerce stores with rotating inventory.
The critical word here is “accurately.” I’ve seen sites where every single URL in the sitemap has the same <lastmod> date, usually the date the sitemap was last regenerated. This defeats the entire purpose. If Google sees that all 2,000 of your pages were supposedly modified on the same day, it learns to ignore your <lastmod> data entirely.
Each page’s <lastmod> value should reflect the actual date that page’s content was meaningfully changed. Not the date a comment was added. Not the date a plugin updated a sidebar widget. The date the actual body content or key metadata was modified.
For WordPress sites, this is usually handled correctly by SEO plugins. For custom CMS platforms, you’ll need to ensure your content management system tracks and exposes the last-modified date for each page.
Action step: Spot-check 10 random URLs in your sitemap. Compare the <lastmod> date against the actual last modification date of the page content. If they don’t match, fix your sitemap generation logic.
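Before spot-checking individual pages, it helps to see how the <lastmod> values are distributed. The sketch below counts them for a sitemap passed in as a string; the sample data and the "one value for every URL" rule of thumb are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_summary(sitemap_xml):
    """Return (total_urls, Counter of <lastmod> values) for a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    counts = Counter()
    total = 0
    for url in root.findall("sm:url", NS):
        total += 1
        value = url.findtext("sm:lastmod", default="(missing)", namespaces=NS)
        counts[value] += 1
    return total, counts


sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2026-12-18</lastmod></url>
  <url><loc>https://www.example.com/services</loc><lastmod>2026-12-18</lastmod></url>
  <url><loc>https://www.example.com/contact</loc><lastmod>2026-12-18</lastmod></url>
</urlset>"""

total, counts = lastmod_summary(sitemap_xml)
# Every URL sharing one date suggests the generator stamps its own run date.
suspicious = total > 1 and len(counts) == 1
```

A single date shared by every URL, as in this sample, is the pattern described above and is worth tracing back to the generation logic.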
9. Organise Large Sitemaps by Content Category
Think of this like organising a hawker centre. You wouldn’t mix the drink stall signage with the nasi lemak stall’s menu. Each stall has its own clear identity, and customers know exactly where to go. Your sitemap should work the same way.
For a site with diverse content types, create separate sitemaps for each category:
- sitemap-services.xml for your service pages
- sitemap-blog.xml for articles and guides
- sitemap-products.xml for product listings
- sitemap-locations.xml for location-specific pages
This structure gives you granular visibility in Google Search Console. You can see at a glance that 95% of your blog posts are indexed but only 60% of your product pages are. That immediately tells you where to focus your debugging efforts.
It also helps search engines understand your site’s architecture. When all your product URLs live in one sitemap, Google can infer the relationship between those pages more easily, which can improve how it understands your site’s topical structure.
Action step: If your site has more than 500 pages or more than three distinct content types, split your sitemap into category-specific files. Link them all through a sitemap index file. Submit the index file to Search Console.
10. Validate Your Sitemap Before Every Submission
A sitemap with syntax errors is like submitting a tax return with missing fields to IRAS. It gets rejected, or worse, partially processed in ways you didn’t intend. Before you submit or resubmit your sitemap, validate it.
Google Search Console will flag errors after submission, but by then you’ve already wasted a crawl cycle. It’s better to catch problems before they reach Google. Free online validators like XML Sitemap Validator or the W3C Markup Validation Service can check your sitemap’s XML syntax in seconds.
Common validation errors I encounter include:
- URLs containing special characters that aren’t properly encoded (ampersands are the usual culprit)
- Missing closing tags in the XML structure
- URLs using HTTP when the site has migrated to HTTPS
- Namespace declarations that are incorrect or missing
Action step: Run your sitemap through an XML validator before every submission. If you use automated sitemap generation, validate the output at least once a month to catch any issues introduced by CMS updates or plugin changes.
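For automated pipelines, a basic pre-submission check can be sketched with Python’s built-in XML parser. It covers three of the errors listed above (unencoded characters or missing tags, a wrong namespace, and HTTP URLs). It is not a full schema validation, and the HTTP check assumes the site has already moved to HTTPS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def validate_sitemap(xml_text):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Unescaped ampersands and missing closing tags both end up here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    allowed_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in allowed_roots:
        problems.append(f"unexpected root element or namespace: {root.tag}")
    for loc in root.iter(f"{{{SITEMAP_NS}}}loc"):
        url = (loc.text or "").strip()
        if url.startswith("http://"):
            problems.append(f"non-HTTPS URL: {url}")
    return problems


# The raw "&" in the query string makes this document invalid XML.
broken = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/?a=1&b=2</loc></url></urlset>"
)
fixed = broken.replace("&b", "&amp;b")
broken_report = validate_sitemap(broken)
fixed_report = validate_sitemap(fixed)
```

Here `broken_report` contains a parse error, while `fixed_report` is empty because the ampersand is written as `&amp;`.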
11. Audit and Remove Broken Links Ruthlessly
Every broken link in your sitemap is a wasted crawl request. Google allocates a finite crawl budget to your site, and every 404 or 500 error it encounters is budget spent on nothing. Worse, a sitemap full of broken links signals to Google that your site isn’t well maintained, which can affect how frequently and deeply it crawls you.
Broken links in sitemaps typically come from three sources: deleted pages that weren’t removed from the sitemap, URL structure changes that weren’t updated in the sitemap, and staging or test URLs that were accidentally included.
I worked with a Singapore education provider whose sitemap contained 340 URLs. 47 of them returned 404 errors, mostly from old course pages that had been removed over the past two years. After cleaning the sitemap and setting up proper 301 redirects for the deleted pages, their crawl efficiency improved by 28% (measured by the number of pages crawled per day in Search Console).
Action step: Use Screaming Frog in “List Mode” to crawl every URL in your sitemap. Filter for any status code other than 200. Remove 404s from the sitemap entirely. For 301 redirects, update the sitemap URL to point to the final destination, not the redirecting URL.
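Once you have status codes for every sitemap URL, the clean-up rules above are mechanical. A minimal sketch, assuming the crawl results are available as a dictionary of URL to (status code, final URL); that shape and the function name are hypothetical, not a Screaming Frog export format.

```python
def clean_sitemap(results):
    """Apply the rules above to crawl results.

    results: dict mapping sitemap URL -> (status_code, final_url).
    Returns (urls_to_keep, urls_removed).
    """
    keep, removed = [], []
    for url, (status, final_url) in results.items():
        if status == 200:
            keep.append(url)
        elif status in (301, 308) and final_url:
            # Permanent redirect: list the destination, drop the old URL.
            keep.append(final_url)
            removed.append(url)
        else:
            # 404s, 5xx errors and anything else leave the sitemap.
            removed.append(url)
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(keep)), removed


results = {
    "https://www.example.com/courses/a": (200, None),
    "https://www.example.com/courses/old-b": (301, "https://www.example.com/courses/b"),
    "https://www.example.com/courses/c": (404, None),
    "https://www.example.com/courses/b": (200, None),
}
keep, removed = clean_sitemap(results)
```

Redirect destinations added this way should themselves be checked for a 200 response before the sitemap is regenerated.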
12. Curate Your Sitemap for Quality, Not Quantity
Your sitemap is not a complete inventory of every URL on your site. It’s a curated list of the pages you want search engines to index and rank. There’s a meaningful difference.
Pages that should typically be excluded from your sitemap:
- Thin content pages with fewer than 200 words and no unique value
- Parameter-based URLs (e.g., ?sort=price&page=3)
- Tag and author archive pages (unless they have substantial unique content)
- Utility pages like privacy policies and terms of service (these get indexed through internal links anyway)
- Outdated promotional or campaign landing pages
A focused sitemap concentrates your crawl budget on the pages that drive business results. For a Singapore law firm, that means your practice area pages, key blog articles, and lawyer profile pages. Not your 15 paginated blog archive pages or your cookie consent policy.
One practical test: for every URL in your sitemap, ask yourself, “Would I be happy if this page ranked #1 for its target keyword?” If the answer is no, it probably doesn’t belong in your sitemap.
Action step: Export your sitemap URLs. Review each one against your analytics data. Any page with zero organic sessions over the past 6 months, thin content, or no clear search intent should be evaluated for removal.
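A first pass over the export can be automated. The sketch below flags candidates for review using three of the criteria above. The field names (`words`, `organic_sessions_6m`) are hypothetical, standing in for whatever your crawler and analytics exports provide, and a flag is a prompt for human judgment rather than an automatic removal.

```python
from urllib.parse import urlsplit


def review_candidates(pages, min_words=200):
    """Return (url, reason) pairs for pages worth reviewing for removal."""
    flagged = []
    for page in pages:
        if urlsplit(page["url"]).query:
            flagged.append((page["url"], "parameter-based URL"))
        elif page["words"] < min_words:
            flagged.append((page["url"], "thin content"))
        elif page["organic_sessions_6m"] == 0:
            flagged.append((page["url"], "no organic sessions in 6 months"))
    return flagged


pages = [
    {"url": "https://www.example.com/practice-areas/family-law",
     "words": 1200, "organic_sessions_6m": 340},
    {"url": "https://www.example.com/blog?sort=price&page=3",
     "words": 800, "organic_sessions_6m": 12},
    {"url": "https://www.example.com/promo/old-campaign",
     "words": 150, "organic_sessions_6m": 0},
    {"url": "https://www.example.com/blog/old-news",
     "words": 600, "organic_sessions_6m": 0},
]
flagged = review_candidates(pages)
```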
13. Ensure Mobile URLs Are Properly Represented
Google has been using mobile-first indexing for all sites since 2023. This means Google predominantly uses the mobile version of your content for indexing and ranking. Your sitemap needs to reflect this reality.
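If your site uses responsive design (the same URLs serve both desktop and mobile users), you’re largely fine. Your existing sitemap covers both versions. But if you’re running a separate mobile site on a subdomain like m.yoursite.com, the sitemap needs more care. Google’s guidance for separate mobile URLs is to keep the desktop URL as the canonical and declare the m. version as its mobile alternate, either on the page or as a rel="alternate" annotation on the desktop URL’s sitemap entry. Listing the m. URLs as primary entries would conflict with the canonical-only rule in practice 1.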
Even with responsive design, verify that the pages in your sitemap render correctly on mobile. A page that’s technically accessible on mobile but has overlapping elements, tiny tap targets, or content hidden behind JavaScript may not get indexed properly despite being in your sitemap.
In Singapore, mobile accounts for roughly 72% of web traffic. If your sitemap is optimised but your mobile experience is broken, you’re building on a shaky foundation.
Action step: Test 10 random pages from your sitemap on mobile, using Lighthouse in Chrome DevTools or the URL Inspection tool in Search Console (Google retired its standalone Mobile-Friendly Test tool in December 2023). If any fail, fix the mobile rendering issues before worrying about further sitemap optimisation. The sitemap can only help pages that Google can actually process.
14. Use Priority Tags Strategically, Not Universally
The <priority> tag assigns a value between 0.0 and 1.0 to each URL, suggesting its relative importance compared to other pages on your site. Google has publicly stated that they largely ignore this tag. So why include it?
Because “largely ignore” isn’t the same as “completely ignore.” And other search engines, including Bing, may still factor it in. More importantly, the exercise of assigning priority values forces you to think critically about your site’s page hierarchy.
The mistake most people make is assigning every page a priority of 1.0. If everything is top priority, nothing is. Instead, use a tiered approach:
- 1.0: Homepage only
- 0.8: Core service or product category pages
- 0.6: Individual product pages or key blog articles
- 0.4: Supporting content, FAQs, about pages
- 0.2: Archive pages, older blog posts
For a Singapore e-commerce site running a Chinese New Year promotion, you might temporarily bump your CNY collection page to 0.9 and your seasonal landing page to 0.8. After the campaign ends, adjust them back down. This kind of intentional management, even if the direct SEO impact is modest, keeps your sitemap purposeful.
Action step: Review your sitemap’s priority values. If every page is set to the same number (or if priority tags are missing entirely), implement the tiered structure above. Revisit quarterly or during major campaigns.
15. Use Hreflang Sitemaps for Multilingual Sites
This one is especially relevant for Singapore businesses. With our multilingual population and regional market reach, many sites serve content in English, Chinese, Malay, or Tamil. If you’re targeting audiences across Southeast Asia, you might also have pages in Bahasa Indonesia, Thai, or Vietnamese.
The hreflang attribute tells Google which language and regional version of a page to show to which users. While you can implement hreflang through HTML tags or HTTP headers, declaring them in your sitemap is often the cleanest approach for large multilingual sites.
Here’s what an hreflang sitemap entry looks like:
<url>
<loc>https://www.example.com/en/services</loc>
<xhtml:link rel="alternate" hreflang="en-sg" href="https://www.example.com/en/services"/>
<xhtml:link rel="alternate" hreflang="zh-sg" href="https://www.example.com/zh/services"/>
<xhtml:link rel="alternate" hreflang="ms-sg" href="https://www.example.com/ms/services"/>
</url>
Without proper hreflang implementation, Google might show your Chinese-language page to English-speaking searchers, or your Singapore-targeted page to users in Malaysia. Both scenarios lead to poor user experience and wasted ranking potential.
I’ve seen this issue cost a Singapore fintech company an estimated 15% of their organic traffic from Chinese-language searches. Their content existed, but Google didn’t know which version to serve to whom. Fixing the hreflang declarations in their sitemap resolved the issue within six weeks.
Action step: If your site serves content in more than one language, audit your hreflang implementation. Use the hreflang Tags Testing Tool by Merkle to check for errors. If you have more than 50 multilingual pages, implement hreflang through your sitemap rather than individual page tags for easier maintenance.
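One detail that is easy to miss when writing these entries by hand: each language version needs its own <url> entry, and each entry should list the full set of alternates, including itself. Generating the entries avoids mismatches. The following is a rough sketch using Python’s standard library; the function name and input shape are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialise with a default namespace plus the xhtml: prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def hreflang_urlset(groups):
    """Build a <urlset> where every language version lists all its alternates.

    groups: list of dicts mapping hreflang code -> URL, one dict per page.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for alternates in groups:
        for url in alternates.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Repeat the complete set of alternates inside each entry.
            for lang, href in alternates.items():
                ET.SubElement(entry, f"{{{XHTML_NS}}}link",
                              rel="alternate", hreflang=lang, href=href)
    return ET.tostring(urlset, encoding="unicode")


services = {
    "en-sg": "https://www.example.com/en/services",
    "zh-sg": "https://www.example.com/zh/services",
    "ms-sg": "https://www.example.com/ms/services",
}
xml_out = hreflang_urlset([services])
```

For the three-language page above, this produces three <url> entries with three xhtml:link elements each.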
A Quick Sitemap Audit Checklist
Before you close this tab, run through this checklist against your own site. It takes about 30 minutes and will surface the most impactful issues:
- Visit yoursite.com/sitemap.xml. Does it load? Is it valid XML?
- Check your robots.txt for the Sitemap directive.
- Open Google Search Console. Compare submitted URLs versus indexed URLs. Is the gap larger than 20%?
- Run your sitemap URLs through a bulk HTTP status checker. Flag anything that isn’t a 200.
- Spot-check 10 URLs for canonical tag consistency. Does the canonical match the sitemap URL?
- Check for noindex pages in the sitemap using Screaming Frog’s list mode.
- Verify <lastmod> dates are accurate, not all set to the same date.
- Test 5 random sitemap URLs on mobile. Do they render correctly?
- If your site is multilingual, verify hreflang declarations are present and error-free.
If you find issues in three or more of these areas, your sitemap needs serious attention. The good news is that most fixes are straightforward and can be implemented within a day.
Your Sitemap Is a Conversation with Google
Every sitemap you submit is essentially a message to search engines: “Here’s what matters on my site, here’s what’s changed, and here’s how it’s all organised.” The clearer and more accurate that message is, the better Google can do its job, and the better your pages will perform in search results.
The 15 practices above aren’t theoretical. They’re the exact checks I run during every technical SEO audit at bestseo.sg. Some are quick wins you can implement this afternoon. Others require ongoing maintenance. All of them compound over time.
If you’ve gone through this guide and realised your sitemap needs more than a quick fix, or if you want a professional audit of your site’s technical SEO health, reach out to us. We’ll take a look at your sitemap, crawl data, and indexing patterns, then give you a clear picture of what needs fixing and what it’ll take to get there. No fluff, just the technical specifics your site needs.
