(3)(+0486231): POST /web?single=1 from 127.0.0.1 "node-fetch/1.0 (+https://github.com/bitinn/node-fetch)"
(3)(+0000019): HTTP GET https://www.reuters.com/investigates/special-report/usa-riteaid-software/
(1)(+0000172): Error: HTTP request to https://www.reuters.com/investigates/special-report/usa-riteaid-software/ rejected with status 401
InternalServerError: An error occurred retrieving the document
    at Object.throw (/Users/hochsten/github.com/zotero/translation-server/node_modules/koa/lib/context.js:97:11)
    at WebSession.handleURL (/Users/hochsten/github.com/zotero/translation-server/src/webSession.js:219:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.handle (/Users/hochsten/github.com/zotero/translation-server/src/webEndpoint.js:85:3)
    at async bodyParser (/Users/hochsten/github.com/zotero/translation-server/node_modules/koa-bodyparser/index.js:95:5)
    at async module.exports (/Users/hochsten/github.com/zotero/translation-server/src/cors.js:22:3)
When I inspect the page with curl:
curl https://www.reuters.com/investigates/special-report/usa-riteaid-software/
I indeed get a 401 with the following content:
<html><head><title>reuters.com</title><style>#cmsg{animation: A 1.5s;}@keyframes A{0%{opacity:0;}99%{opacity:0;}100%{opacity:1;}}</style></head><body style="margin:0"><p id="cmsg">Please enable JS and disable any ad blocker</p><script data-cfasync="false">var dd={'rt':'c','cid':'AHrlqAAAAAMA663TFE6abY4AncHwgw==','hsh':'2013457ADA70C67D6A4123E0A76873','t':'bv','s':46356,'e':'f821f4289bd2422ec97e8f6d79d54fd4a41b74ccbb9abab0df58bc633802e6e3','host':'geo.captcha-delivery.com','cookie':'l0gxS3PlDjwTQb88t6e_Izbuj0VFzjmcr95~8KEkQ_fCvBakUNEPGMlKibBTGkRwsFcm1sa4EKb3~5_mmnt6cImKDxRdqBdiOSY8sQ50dxQECrruBobI44~BBVuKdLky'}</script><script data-cfasync="false" src="https://ct.captcha-delivery.com/c.js"></script></body></html>
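A response like this can at least be recognized programmatically, so the failure can be reported as a bot challenge rather than a generic error. The following is a minimal sketch and not part of translation-server: the helper name looksLikeBotChallenge and the marker strings are assumptions based only on the single response shown above, and accepting status 403 alongside 401 is a guess.

```javascript
// Hypothetical helper, not part of translation-server.
// The markers are assumptions taken from the one response shown above.
const CHALLENGE_MARKERS = [
  "captcha-delivery.com",
  "Please enable JS and disable any ad blocker",
];

function looksLikeBotChallenge(status, body) {
  // The challenge above arrived with HTTP 401; 403 is included as a guess.
  if (status !== 401 && status !== 403) {
    return false;
  }
  // Treat the response as a challenge if any known marker appears in the body.
  return CHALLENGE_MARKERS.some((marker) => body.includes(marker));
}

// Example with a shortened version of the response body above.
const sampleBody =
  '<html><head><title>reuters.com</title></head><body>' +
  '<p id="cmsg">Please enable JS and disable any ad blocker</p>' +
  '<script src="https://ct.captcha-delivery.com/c.js"></script></body></html>';

console.log(looksLikeBotChallenge(401, sampleBody)); // true
console.log(looksLikeBotChallenge(200, sampleBody)); // false
```

This only detects the challenge; it does not get past it. Since the page asks for JavaScript to be enabled, a plain HTTP client would presumably keep receiving the same response.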
What strategies can I use for web pages that use "anti-ad blocker" tactics? For example, when I try to retrieve the metadata for https://www.reuters.com/investigates/special-report/usa-riteaid-software/, the request fails with the 401 error and the server log shown above.