Hi. Does anybody have an idea why I get data for some keywords but not for others?
For example, with the keyword "dog food":
import serpscrap

keywords = ['dog food']

config = serpscrap.Config()
config.set('scrape_urls', True)

scrap = serpscrap.SerpScrap()
scrap.init(config=config.get(), keywords=keywords)
scrap.as_csv('/tmp/output')
2019-09-22 11:55:14,988 - root - INFO - Going to scrape 2 keywords with 1 proxies by using 1 threads.
2019-09-22 11:55:14,990 - scrapcore.scraping - INFO - [+] SelScrape[localhost][search-type:normal][https://www.google.com/search?] using search engine "google". Num keywords=1, num pages for keyword=[1]
2019-09-22 11:55:24,286 - scrapcore.scraper.selenium - INFO - https://www.google.com/search?
2019-09-22 11:55:55,364 - scrapcore.scraping - INFO - [google]SelScrape localhost - Keyword: "dog food" with [1, 2] pages, slept 22 seconds before scraping. 1/1 already scraped
2019-09-22 11:55:56,767 - scrapcore.scraper.selenium - INFO - Requesting the next page
2/2 keywords processed.
2019-09-22 11:56:01,961 - root - INFO - Scraping URL: https://www.mypetneedsthat.com/best-dry-dog-foods-guide/
2019-09-22 11:56:02,681 - root - INFO - Scraping URL: https://www.businessinsider.com/best-dog-food
2019-09-22 11:56:02,686 - root - INFO - Scraping URL: https://www.akc.org/expert-advice/nutrition/best-dog-food-choosing-whats-right-for-your-dog/
2019-09-22 11:56:02,689 - root - INFO - Scraping URL: https://www.amazon.com/Best-Sellers-Pet-Supplies-Dry-Dog-Food/zgbs/pet-supplies/2975360011
2019-09-22 11:56:02,690 - root - INFO - Scraping URL: https://www.chewy.com/b/food-332
2019-09-22 11:56:26,122 - root - INFO - Scraping URL: https://www.petco.com/shop/en/petcostore/category/dog/dog-food
2019-09-22 11:56:26,123 - root - INFO - Scraping URL: https://www.petflow.com/dog/food
2019-09-22 11:56:26,843 - root - INFO - Scraping URL: https://www.dogfoodadvisor.com/
2019-09-22 11:56:27,735 - root - INFO - Scraping URL: https://www.petsmart.com/dog/food/dry-food/
2019-09-22 11:56:27,737 - root - INFO - Scraping URL: https://www.petsmart.com/dog/food/
2019-09-22 11:56:27,738 - root - INFO - Scraping URL: https://www.purina.com/dogs/dog-food
2019-09-22 11:56:28,635 - root - INFO - Scraping URL: https://www.youtube.com/watch?v=fBABfWqSN2I
2019-09-22 11:56:31,757 - root - INFO - Scraping URL: https://www.youtube.com/watch?v=7P85BMCCboI
2019-09-22 11:56:36,807 - root - INFO - Scraping URL: https://www.youtube.com/watch?v=az0ktsWYydw
2019-09-22 11:56:39,645 - root - INFO - Scraping URL: https://www.youtube.com/watch?v=njJ99wPByy4
2019-09-22 11:56:42,571 - root - INFO - Scraping URL: https://nypost.com/video/homeless-man-and-his-dog-reuniting-is-pure-joy/
2019-09-22 11:56:45,156 - root - INFO - Scraping URL: /aclk?sa=l&ai=DChcSEwjRyYG5h-TkAhUM1WQKHSiFASYYABAAGgJwag&sig=AOD64_2IRYpCakgEzR3BK1oqeuLCVa3mjA&adurl=&rct=j&q=
2019-09-22 11:56:45,157 - root - INFO - Scraping URL: https://www.purina.com/dogs/dog-food
2019-09-22 11:56:45,867 - root - INFO - Scraping URL: https://en.wikipedia.org/wiki/Dog_food
2019-09-22 11:56:45,872 - root - INFO - Scraping URL: https://www.hillspet.com/dog-food
2019-09-22 11:56:45,876 - root - INFO - Scraping URL: https://www.smithsfoodanddrug.com/pl/dog-food/11103
2019-09-22 11:57:10,321 - root - INFO - Scraping URL: https://www.canidae.com/dog-food/
2019-09-22 11:57:10,325 - root - INFO - Scraping URL: https://www.petcarerx.com/dog/food-nutrition
2019-09-22 11:57:11,222 - root - INFO - Scraping URL: https://www.businessinsider.com/best-dog-food
2019-09-22 11:57:11,223 - root - INFO - Scraping URL: https://www.tractorsupply.com/tsc/catalog/dog-food
2019-09-22 11:57:12,249 - root - INFO - Scraping URL: https://www.thehonestkitchen.com/dog-food
2019-09-22 11:57:12,253 - root - INFO - Scraping URL: https://www.boxed.com/products/category/418/dog-food
2019-09-22 11:57:13,171 - root - INFO - Scraping URL: https://lifesabundance.com/category/dogfood.aspx
2019-09-22 11:57:13,174 - root - INFO - Scraping URL: //www.googleadservices.com/pagead/aclk?sa=L&ai=DChcSEwj5_NHFh-TkAhWTr-wKHSgSDVMYABAAGgJwag&ohost=www.google.com&cid=CAASEuRoai4G0R8MNbToVnZKzozmNA&sig=AOD64_10tA_ESFCwAHTPgPUTDsInBgYwEQ&adurl=&rct=j&q=
2019-09-22 11:57:13,178 - root - INFO - Scraping URL: https://freshpet.com/why-freshpet/
2019-09-22 11:57:13,901 - root - INFO - Scraping URL: https://pet-food.thecomparizone.com/?var1=82002114870&var2=381760664839&var4&var5=b&var7=1234567890&utm_source=google&utm_medium=cpc
None

Traceback (most recent call last):
  File "C:\Users\rot\Anaconda3\lib\site-packages\serpscrap\csv_writer.py", line 14, in write
    w.writerow(row)
  File "C:\Users\rot\Anaconda3\lib\csv.py", line 155, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "C:\Users\rot\Anaconda3\lib\csv.py", line 151, in _dict_to_list
    + ", ".join([repr(x) for x in wrong_fields]))
ValueError: dict contains fields not in fieldnames: 'url', 'encoding', 'meta_robots', 'meta_title', 'text_raw', 'last_modified', 'status'

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\Anaconda3\lib\site-packages\serpscrap\csv_writer.py in write(self, file_name, my_dict)
     13                 for row in my_dict[0:]:
---> 14                     w.writerow(row)
     15         except Exception:

~\Anaconda3\lib\csv.py in writerow(self, rowdict)
    154     def writerow(self, rowdict):
--> 155         return self.writer.writerow(self._dict_to_list(rowdict))
    156

~\Anaconda3\lib\csv.py in _dict_to_list(self, rowdict)
    150                 raise ValueError("dict contains fields not in fieldnames: "
--> 151                                  + ", ".join([repr(x) for x in wrong_fields]))
    152         return (rowdict.get(key, self.restval) for key in self.fieldnames)

ValueError: dict contains fields not in fieldnames: 'url', 'encoding', 'meta_robots', 'meta_title', 'text_raw', 'last_modified', 'status'

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-16-3f66e8511348> in <module>
      8 scrap = serpscrap.SerpScrap()
      9 scrap.init(config=config.get(), keywords=keywords)
---> 10 scrap.as_csv('/tmp/output')

~\Anaconda3\lib\site-packages\serpscrap\serpscrap.py in as_csv(self, file_path)
    146         writer = CsvWriter()
    147         self.results = self.run()
--> 148         writer.write(file_path + '.csv', self.results)
    149
    150     def scrap_serps(self):

~\Anaconda3\lib\site-packages\serpscrap\csv_writer.py in write(self, file_name, my_dict)
     15         except Exception:
     16             print(traceback.print_exc())
---> 17             raise Exception

Exception:
Many thanks!
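For anyone reading the traceback: the `ValueError` comes from the standard library's `csv.DictWriter`, which rejects a row containing keys that are not in its `fieldnames`. The sketch below reproduces that with the standard library alone and shows one possible workaround, building the header from the union of all row keys. It assumes (not confirmed by the traceback) that the header is taken from the first result row only; the row contents and the `write_rows` helper are hypothetical and are not part of serpscrap.

```python
import csv
import io


def write_rows(rows, out):
    """Write dicts with differing keys to CSV using the union of all keys."""
    # Collect every key that appears in any row, keeping first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills the columns a row does not have, so no ValueError is raised.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Hypothetical rows: one search-result style row and one scraped-URL style row.
rows = [
    {"query": "dog food", "rank": 1, "link": "https://example.com/a"},
    {"url": "https://example.com/a", "status": 200, "meta_title": "A"},
]

# Reproduce the failure: fieldnames taken from the first row only.
strict = csv.DictWriter(io.StringIO(), fieldnames=list(rows[0]))
try:
    strict.writerow(rows[1])
    failed = False
    error_message = ""
except ValueError as exc:
    failed = True
    error_message = str(exc)

# Workaround: header built from every row, missing cells left empty.
buffer = io.StringIO()
columns = write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

If that assumption holds, the same idea could be applied to the list returned by `scrap.run()` (the call that `as_csv` makes internally, per the traceback) instead of calling `as_csv` directly.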