I am trying to crawl subpages of a main page, selected by an XPath expression.
I can't navigate with window.location.href to crawl the additional pages, because that throws "Execution context was destroyed", so I am trying to use ctx.Lib.addLink instead.
After reading the browsertrix-crawler code, it seems the addLink callback is not set in my case. It also seems that, when addLink is set, it is restricted by the scopeType.
URL to crawl: https://group.bnpparibas/toutes-actualites/communique-de-presse
Behavior to crawl the additional pages (the first 8 articles):
`
class BnpCommuniquesdePresseBehavior {
  static id = "BnpCommuniquesdePresse";
}
`
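For illustration, here is a minimal sketch of what a fuller version of such a behavior could look like. The XPath expression is a hypothetical placeholder (the real one is not shown here), the `collectLinks` helper is made up for this example, and the `isMatch`/`init`/`run` layout is assumed from the usual shape of a custom behavior class. `ctx.Lib.addLink` is the call discussed above; whether the queued URLs are then actually crawled presumably still depends on the crawl scope.

```javascript
class BnpCommuniquesdePresseBehavior {
  static id = "BnpCommuniquesdePresse";

  // Hypothetical XPath: replace with the expression that matches the article links.
  static articleXPath = "//article//a[@href]";
  // "The first 8 articles", as described above.
  static maxArticles = 8;

  // Assumed hook: run this behavior only on the press release listing page.
  static isMatch() {
    return window.location.href.startsWith(
      "https://group.bnpparibas/toutes-actualites/communique-de-presse"
    );
  }

  // Assumed hook: initial behavior state (none needed here).
  static init() {
    return {};
  }

  // Hypothetical helper: collect up to `limit` unique absolute URLs
  // matched by `xpath` in `doc`, resolved against `baseUrl`.
  static collectLinks(doc, xpath, baseUrl, limit) {
    // 7 === XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    const snapshot = doc.evaluate(xpath, doc, null, 7, null);
    const urls = [];
    for (let i = 0; i < snapshot.snapshotLength && urls.length < limit; i++) {
      const href = snapshot.snapshotItem(i).getAttribute("href");
      if (!href) continue;
      const url = new URL(href, baseUrl).href;
      if (!urls.includes(url)) urls.push(url);
    }
    return urls;
  }

  async *run(ctx) {
    const cls = BnpCommuniquesdePresseBehavior;
    const urls = cls.collectLinks(
      document,
      cls.articleXPath,
      window.location.href,
      cls.maxArticles
    );
    for (const url of urls) {
      // Ask the crawler to queue the URL instead of navigating away,
      // which would destroy the execution context of the behavior.
      await ctx.Lib.addLink(url);
      yield { msg: "Queued article link", url };
    }
  }
}
```

The link collection is kept in a static method so the file stays a single class and the XPath logic can be tried on its own with a stub document.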
The Docker command line:
docker run -p 6080:6080 -p 9223:9223 -v c:\tmp\crawls\:/crawls/ -v c:\tmp\custom-behaviors\:/custom-behaviors/ -it webrecorder/browsertrix-crawler:latest crawl --url https://group.bnpparibas/toutes-actualites/communique-de-presse --generateWACZ final-to-warc --text --wait-until domcontentloaded --screenshot thumbnail,view,fullPage --scopeType page --customBehaviors /custom-behaviors/ --pageLimit 10 --screencastPort 9223 --profile "/crawls/profiles/group.bnpparibas.tar.gz" --behaviors siteSpecific