Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sitemap sometimes not generated correctly #10171

Closed
1 task done
ktym4a opened this issue Feb 20, 2024 · 12 comments · Fixed by #10179
Closed
1 task done

Sitemap sometimes not generated correctly #10171

ktym4a opened this issue Feb 20, 2024 · 12 comments · Fixed by #10179
Labels
- P4: important Violate documented behavior or significantly impacts performance (priority) regression

Comments

@ktym4a
Copy link
Contributor

ktym4a commented Feb 20, 2024

Astro Info

Astro                    v4.4.1
Node                     v20.10.0
System                   macOS (arm64)
Package Manager          npm
Output                   static
Adapter                  none
Integrations             @astrojs/sitemap

If this issue only occurs in one browser, which browser is a problem?

No response

Describe the Bug

After #9846, sometimes the structure is incorrect when generating the sitemap. (even not set prefix)

but ${pefix}-0.xml is always the correct structure.

sitemap.mov

What's the expected result?

Always generated with the correct structure

Link to Minimal Reproducible Example

N/A

Participation

  • I am willing to submit a pull request for this issue. (If I can, maybe need support)
@github-actions github-actions bot added the needs triage Issue needs to be triaged label Feb 20, 2024
@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 20, 2024

cc @ematipico
I currently have no idea how to fix this.

@ematipico
Copy link
Member

It's definitely a regression of the latest changes, we'll have to debug them

@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 20, 2024

Do you know if there is any difference between await fixture.build(); and npx astro build?

I have run the test about 30 times locally and this does not happen.

but by npx astro build, there is about a like 50% of this happen.

I have also checked that if I hardcode an incorrect structure and run the test, the test will fail. Like:

it('Index file load correct sitemap', async () => {
	const data = await readXML('<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>http://example.com/sitemap-0.xml</loc></sitemap>');
	const sitemapUrl = data.sitemapindex.sitemap[0].loc[0];
	assert.strictEqual(sitemapUrl, 'http://example.com/sitemap-0.xml');

	const prefixData = await readXML(fixture.readFile(`/${prefix}index.xml`));
	const prefixSitemapUrl = prefixData.sitemapindex.sitemap[0].loc[0];
	assert.strictEqual(prefixSitemapUrl, `http://example.com/${prefix}0.xml`);
});

then,

▶ Prefix support
  ▶ Static
    ✔ Content is same (1.386125ms)
    ✖ Index file load correct sitemap (0.387125ms)
      Error: Unclosed root tag
      Line: 0
      Column: 166
      Char:
          at error (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/sax/lib/sax.js:652:10)
          at strictFail (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/sax/lib/sax.js:678:7)
          at end (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/sax/lib/sax.js:659:47)
          at SAXParser.write (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/sax/lib/sax.js:976:14)
          at SAXParser.close (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/sax/lib/sax.js:157:38)
          at exports.Parser.Parser.parseString (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/xml2js/lib/parser.js:337:42)
          at Parser.parseString (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/xml2js/lib/parser.js:5:59)
          at exports.parseString (/Users/ktym4a/astro/node_modules/.pnpm/[email protected]/node_modules/xml2js/lib/parser.js:383:19)
          at file:///Users/ktym4a/astro/packages/integrations/sitemap/test/test-utils.js:23:4
          at new Promise (<anonymous>)

@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 21, 2024

Is there a function that is called after cli function?

After some debugging, I think It is possibly due to something other than sitemap integration. (Not sure)

Throwing an error after runCommand for debugging, It seems that the sitemap structure is always correct.

export async function cli(args: string[]) {
	const flags = yargs(args, { boolean: ['global'], alias: { g: 'global' } });
	const cmd = resolveCommand(flags);
	try {
		await runCommand(cmd, flags);
+		throw new Error('Just testing...');
	} catch (err) {
		const { throwAndExit } = await import('./throw-and-exit.js');
		await throwAndExit(cmd, err);
	}
}

@fisherfrontier
Copy link

Also seeing this problem as of @astrojs/sitemap 3.1.0. I'm getting this output in sitemap-index.xml (missing closing tag </sitemapindex>):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>https://www.fisherfrontier.com/sitemap-0.xml</loc>
    </sitemap>

Per Google Search Central, should include closing tag </sitemapindex>, like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>https://www.fisherfrontier.com/sitemap-0.xml</loc>
    </sitemap>
</sitemapindex>

@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 21, 2024

This is the same thing I am saying.

@ematipico ematipico added - P4: important Violate documented behavior or significantly impacts performance (priority) regression and removed needs triage Issue needs to be triaged labels Feb 21, 2024
@Princesseuh
Copy link
Member

I'd suggest reverting the PR if we cannot figure out a fix quickly.

@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 21, 2024

Currently I have no idea how to fix so I agree with that. sorry

@Princesseuh
Copy link
Member

Currently I have no idea how to fix so I agree with that. sorry

Absolutely no worries! It happens, you couldn't have known.

@ematipico
Copy link
Member

I'll have a look in a few hours

@ematipico
Copy link
Member

It seems to be an issue related to stream, and I'll be honest, I don't know how to fix it. It's really weird that this library doesn't allow to change the file name. Only way possible is by using streams...

We should revert the PR.

@ktym4a
Copy link
Contributor Author

ktym4a commented Feb 21, 2024

Thank you.
I actually believe that Google Search Console says Couldn't fetch sitemap is pretty much an edge case.

Fortunately for me, I am deploying manually now using wrangler, so there are things I can do, like rename the sitemap after the build.

but I will try again when I have more time to see if there is a way to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
- P4: important Violate documented behavior or significantly impacts performance (priority) regression
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants