Documentation subdomain returning HTTP 403 errors #14773
-
In our own project's documentation we link to some pages in the GitHub documentation (docs.github.com) for more information. Our project's CI build verifies that links are actually live, to prevent dead or incorrect links in our own documentation. Recently, however, GitHub has started returning HTTP 403 errors: if we make the request with Postman or Python requests we get an HTTP 403, while in a real browser things continue working. Is this a conscious decision? Is there a way we can keep doing link checks? We could add the subdomain to an ignore list, but the quality of our documentation may obviously suffer from this if GitHub documentation pages move at some point.
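For context, a link-check step like the one described can be sketched in Python. This is an illustrative sketch, not any particular tool: `check_links` and the injected `fetch` function are hypothetical names, and the fetcher is stubbed here so the example runs without network access.

```python
def check_links(urls, fetch):
    """Return (url, status) pairs for every URL whose fetch reports a
    non-2xx status code.

    `fetch` is injected so the checker can be exercised without network
    access; in a real CI job it would perform the actual HTTP request.
    """
    broken = []
    for url in urls:
        status = fetch(url)
        if not 200 <= status < 300:
            broken.append((url, status))
    return broken

# Fake fetcher simulating docs.github.com rejecting non-browser clients
# with a 403, as described above.
def fake_fetch(url):
    return 403 if url.startswith("https://docs.github.com/") else 200

print(check_links(
    ["https://example.com/", "https://docs.github.com/en"],
    fake_fetch,
))  # -> [('https://docs.github.com/en', 403)]
```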
Replies: 5 comments
-
Same problem here.
-
Same here.
-
In our case, we were using markdown-link-check with MegaLinter. @ivanramosnet found out that we can fix the problem by adding this configuration:

```json
{
  "httpHeaders": [
    {
      "urls": ["https://docs.github.com/"],
      "headers": {
        "Accept-Encoding": "zstd, br, gzip, deflate"
      }
    }
  ]
}
```

Maybe adding those headers to the request might work in other cases too. At least using curl I do not get the 403 response:

```console
$ curl -i https://docs.github.com/
HTTP/2 403
x-azure-ref: 0wCJxYgAAAABtM0NN7R2MT6gHO4Zl4GGcTUFEMzBFREdFMDUwNQA1OTZkNzhhMi1jYTVmLTQ3OWQtYmNkYy0wODM1ODMzMTc0YjI=
accept-ranges: bytes
date: Tue, 03 May 2022 12:40:32 GMT
via: 1.1 varnish
x-served-by: cache-mad22047-MAD
x-cache: MISS
x-cache-hits: 0
x-timer: S1651581633.903885,VS0,VE8
strict-transport-security: max-age=31557600

<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'><html xmlns='http://www.w3.org/1999/xhtml'><head><meta content='text/html; charset=utf-8' http-equiv='content-type'/><style type='text/css'>body { font-family:Arial; margin-left:40px; }img { border:0 none; }#content { margin-left: auto; margin-right: auto }#message h2 { font-size: 20px; font-weight: normal; color: #000000; margin: 34px 0px 0px 0px }#message p { font-size: 13px; color: #000000; margin: 7px 0px 0px 0px }#errorref { font-size: 11px; color: #737373; margin-top: 41px }</style><title>Microsoft</title></head><body><div id='content'><div id='message'><h2>The request is blocked.</h2></div><div id='errorref'><span>0wCJxYgAAAABtM0NN7R2MT6gHO4Zl4GGcTUFEMzBFREdFMDUwNQA1OTZkNzhhMi1jYTVmLTQ3OWQtYmNkYy0wODM1ODMzMTc0YjI=</span></div></div></body></html>
```

With the `Accept-Encoding` header the redirect to `/en` comes through instead:

```console
$ curl -i -H "Accept-Encoding: zstd, br, gzip, deflate" https://docs.github.com/
HTTP/2 302
cache-control: private, no-store
content-type: text/plain; charset=utf-8
location: /en
access-control-allow-origin: *
content-security-policy: default-src 'none';prefetch-src 'self';connect-src 'self';font-src 'self' data: githubdocs.azureedge.net;img-src 'self' data: github.githubassets.com githubdocs.azureedge.net placehold.it *.githubusercontent.com github.com;object-src 'self';script-src 'self';frame-src https://graphql.github.com/ https://www.youtube-nocookie.com;style-src 'self' 'unsafe-inline';child-src 'self'
x-dns-prefetch-control: off
expect-ct: max-age=0
x-frame-options: SAMEORIGIN
x-download-options: noopen
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
referrer-policy: strict-origin-when-cross-origin
x-xss-protection: 0
x-azure-ref: 07CJxYgAAAABShnaFpqe2SK/Whs+MsQV3TUFEMzBFREdFMDUxNwA1OTZkNzhhMi1jYTVmLTQ3OWQtYmNkYy0wODM1ODMzMTc0YjI=
accept-ranges: bytes
date: Tue, 03 May 2022 12:41:16 GMT
via: 1.1 varnish
x-served-by: cache-mad22060-MAD
x-cache: CONFIG_NOCACHE, MISS
x-cache-hits: 0
x-timer: S1651581676.005065,VS0,VE253
vary: Accept
strict-transport-security: max-age=31557600
content-length: 25

Found. Redirecting to /en
```
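The same header fix can be applied from a Python-based checker. This is a minimal sketch using the standard library's `urllib` (the URL and header value are taken from the curl example above; `docs_request` is an illustrative helper name, not part of any tool):

```python
from urllib.request import Request

def docs_request(url):
    """Build a request that mimics the curl invocation above: sending an
    Accept-Encoding header so the docs.github.com edge does not reject
    the request with a 403."""
    return Request(url, headers={
        "Accept-Encoding": "zstd, br, gzip, deflate",
    })

req = docs_request("https://docs.github.com/")
# urllib normalizes header names to Capitalized-lowercase form.
print(req.get_header("Accept-encoding"))  # -> zstd, br, gzip, deflate
```

One caveat: when you set `Accept-Encoding` yourself, `urllib` will not transparently decompress a compressed response body, so a checker that only inspects status codes is the safer use of this trick.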
-
That is a great suggestion, @josecelano. Using htmlproofer, this works for me:
-
Nice find! Using Sphinx, these options are also configurable: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-linkcheck_request_headers

Edit: Sphinx link checks actually seem to be passing again now without specifying any request headers.
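For reference, per the Sphinx documentation linked above, `linkcheck_request_headers` in `conf.py` maps URL prefixes to header dicts; a configuration mirroring the markdown-link-check fix earlier in the thread might look like:

```python
# conf.py -- extra headers sent by `make linkcheck` for matching URL prefixes
linkcheck_request_headers = {
    "https://docs.github.com/": {
        "Accept-Encoding": "zstd, br, gzip, deflate",
    },
}
```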