The ability to scan sitemap.xml files in Purple A11y is powerful. However, the sitemaps are generally too big to be crawled effectively by the tool.
I've created a script that I think can help make existing sitemap.xml files more powerful.
Scanning just a handful of pages is a problem. Scanning every page in a site also brings challenges, particularly for larger sites. A random sample of pages should give us statistical confidence about the accessibility of a whole site.
The trouble is that there are often too many URLs, and sites frequently use sitemap indexes (sitemaps of sitemaps).
My script aggregates the XML files into a single file and removes the file types we don’t want to analyze (.doc, .pdf, .zip, etc.), then takes a random sample of the remaining URLs. The resulting XML file can be reused later to test whether the exact same URLs have improved over time (or not). I’m capping the sitemap at 2,000 URLs, which is a reasonably large sample for the sites we work with.
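Here is a minimal sketch of that approach in Python. It is not the actual script; the sitemap index URL, the extension list, the fixed random seed, and the 2,000-URL cap are all placeholder assumptions you would adjust for your own sites.

```python
import random
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder settings; replace with your own sitemap index and limits.
SITEMAP_INDEX = "https://example.gov/sitemap.xml"
MAX_URLS = 2000
SKIP_EXTENSIONS = (".doc", ".docx", ".pdf", ".zip")

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def fetch_xml(url):
    """Download and parse a sitemap or sitemap index."""
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())


def collect_urls(url):
    """Recursively walk a sitemap index and return every page URL it lists."""
    root = fetch_xml(url)
    urls = []
    # A sitemap index lists child sitemaps in <sitemap><loc> elements.
    for loc in root.findall("sm:sitemap/sm:loc", NS):
        urls.extend(collect_urls(loc.text.strip()))
    # A plain sitemap lists pages in <url><loc> elements.
    for loc in root.findall("sm:url/sm:loc", NS):
        urls.append(loc.text.strip())
    return urls


def filter_and_sample(urls, cap=MAX_URLS):
    """Drop non-HTML assets, then take a reproducible random sample."""
    pages = [u for u in urls if not u.lower().endswith(SKIP_EXTENSIONS)]
    random.seed(42)  # fixed seed so the same sample can be rescanned later
    return random.sample(pages, min(cap, len(pages)))


def write_sitemap(urls, path="sampled-sitemap.xml"):
    """Write the sampled URLs back out as a single sitemap.xml."""
    root = ET.Element("urlset",
                      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for u in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = u
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    write_sitemap(filter_and_sample(collect_urls(SITEMAP_INDEX)))
```

The fixed seed is what makes the sample repeatable, so a later scan can compare the exact same URLs rather than a fresh random draw.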
It could be enhanced in the future to ensure that key pages like the home page, the search page, representative landing pages, and any unusual pages are included in the scan. These could simply be appended to the sample.
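That enhancement might look something like this hypothetical helper, which appends a hand-picked list of must-scan URLs to the sampled set (the URLs shown are placeholders, not part of the existing script):

```python
# Hypothetical list of pages that should always be scanned.
MUST_INCLUDE = [
    "https://example.gov/",
    "https://example.gov/search",
]


def append_key_pages(sampled, must_include=MUST_INCLUDE):
    """Add the must-scan pages to the sample, skipping any already present."""
    return sampled + [u for u in must_include if u not in sampled]
```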
Are there lists of sitemap.xml tools that folks find useful?