Hey, I'm trying to run the scraper against my local setup. When the URL in my config points to the live site, the scraper indexes my documentation, but when I change it to a local URL nothing gets scraped. I exposed the local port through an ngrok URL and it still doesn't scrape or index anything, while the production URL keeps working fine (although some searches return the same hit more than once).

Against the local/ngrok URL it indexes only the plain HTML plus some script and CSS, but not the content of my documentation, which is built by JavaScript.

Here is my config file:
```json
{
  "index_name": "payment-page",
  "js_render": true,
  "js_wait": 10,
  "use_anchors": false,
  "user_agent": "Custom Bot",
  "start_urls": [
    "https://a65e-103-159-11-202.in.ngrok.io/payment-page/android/overview/pre-requisites",
    "https://a65e-103-159-11-202.in.ngrok.io/payment-page/android/base-sdk-integration/session",
    "https://a65e-103-159-11-202.in.ngrok.io/payment-page/android/base-sdk-integration/order-status-api"
  ]
}
```
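To illustrate what I mean by "only normal HTML content and some script and css": without a real browser, the scraper only ever receives the initial HTML payload, and the documentation text that JavaScript builds at runtime is simply not in it. A small offline sketch (the HTML string below is a made-up stand-in for what a single-page-app server typically returns, not my actual page):

```python
from html.parser import HTMLParser

# Stand-in for the initial payload of a JS-built docs page:
# the body carries only a mount point and a script tag.
SPA_SHELL = """
<html><head><link rel="stylesheet" href="app.css"></head>
<body><div id="root"></div><script src="bundle.js"></script></body></html>
"""

class TextExtractor(HTMLParser):
    """Collects every non-empty text node, i.e. what a naive scraper could index."""
    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

parser = TextExtractor()
parser.feed(SPA_SHELL)
print(parser.text)  # [] -- no documentation text to index, only tags and assets
```

So when JS rendering isn't actually happening, the only things left to index are the script/CSS references, which matches what I see locally.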
The command I used to run the scraper:

```shell
docker run -it --env-file=/my/clone/scraper/located/path/.env -e "CONFIG=$(cat config.json | jq -r tostring)" d2ebdc22bee2a9f6513e68457d9a3825850f325449a225bc6cde1a1f7339e1e4
```
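For context, the `jq -r tostring` part only collapses `config.json` into a single line so the whole file fits into the `CONFIG` environment variable. The same serialization in Python (a sketch with an inline stand-in for my config):

```python
import json

# Stand-in for config.json; jq's `tostring` emits the same compact encoding.
config = {"index_name": "payment-page", "js_render": True, "js_wait": 10}
single_line = json.dumps(config, separators=(",", ":"))
print(single_line)  # {"index_name":"payment-page","js_render":true,"js_wait":10}
```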
My changes to `browser_handler.py` (I had to make these changes to get the JS content of my documentation rendered; earlier I faced the same issue with the live URL as well):
```python
import re
import os

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

from ..custom_downloader_middleware import CustomDownloaderMiddleware
from ..js_executor import JsExecutor


class BrowserHandler:
    @staticmethod
    def conf_need_browser(config_original_content, js_render):
        # A browser is needed if the raw config contains named-group
        # patterns like (?P<name>...), or if js_render is enabled.
        group_regex = re.compile(r'\(\?P<(.+?)>.+?\)')
        results = re.findall(group_regex, config_original_content)
        return len(results) > 0 or js_render
```
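To sanity-check that my config should trigger the browser path at all, the helper can be exercised standalone (a sketch: the function body is copied out of the class, and an inline config string stands in for my config.json):

```python
import json
import re

def conf_need_browser(config_original_content, js_render):
    # A browser is needed if start_urls use (?P<name>...) groups,
    # or if js_render is explicitly enabled.
    group_regex = re.compile(r'\(\?P<(.+?)>.+?\)')
    results = re.findall(group_regex, config_original_content)
    return len(results) > 0 or js_render

raw_config = '{"index_name": "payment-page", "js_render": true}'
js_render = json.loads(raw_config)["js_render"]

print(conf_need_browser(raw_config, js_render))  # True: js_render forces the browser
print(conf_need_browser(raw_config, False))      # False: no named groups in this config
```

So with `"js_render": true` the browser path should be taken, which is why I expected the JS-built content to be indexed.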
Output after running the scraper: