Tripadvisor Crawler

This page is a placeholder. The data retrieved by the crawler will be presented here later, once it has been anonymised and aggregated.

The crawler started multiple headless Chrome instances via Puppeteer to render the Tripadvisor website. Puppeteer then extracted the relevant data, and the crawler packaged it into a restaurant, review, or user object. Each object was handed to the database handler, implemented with Mongoose, which wrote it to MongoDB.
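The flow described above (extract, package into one of three object kinds, hand over to a database handler) might be sketched roughly as follows. This is only an illustration under assumptions: every name here (`packageRecord`, `createDatabaseHandler`, `createMemoryModel`, `crawlPage`, and the `extract` callback) is hypothetical and not taken from the repository. The Puppeteer step is replaced by an injected function, and the Mongoose models by in-memory stand-ins, so the sketch runs without a browser or MongoDB.

```javascript
// Hypothetical sketch of the pipeline: extract -> package -> database handler.
// None of these names come from the repository.

const KINDS = ['restaurant', 'review', 'user'];

// Wrap raw extracted fields in a tagged object of one of the three kinds.
function packageRecord(kind, fields) {
  if (!KINDS.includes(kind)) {
    throw new Error(`Unknown record kind: ${kind}`);
  }
  return { kind, fields: { ...fields } };
}

// The handler routes each packaged record to the model registered for its kind.
// `models` maps kind -> object with an async create(doc) method. A Mongoose
// model offers a create() method of that shape; here stand-ins are injected.
function createDatabaseHandler(models) {
  return {
    async save(record) {
      const model = models[record.kind];
      if (!model) {
        throw new Error(`No model registered for kind: ${record.kind}`);
      }
      return model.create(record.fields);
    },
  };
}

// In-memory stand-in for a model, so the sketch runs without MongoDB.
function createMemoryModel() {
  const docs = [];
  return {
    docs,
    async create(doc) {
      docs.push(doc);
      return doc;
    },
  };
}

// `extract` stands in for the Puppeteer step that renders a page and reads
// data from it. It is assumed to resolve to [{ kind, fields }, ...].
async function crawlPage(url, extract, handler) {
  const extracted = await extract(url);
  let savedCount = 0;
  for (const { kind, fields } of extracted) {
    await handler.save(packageRecord(kind, fields));
    savedCount += 1;
  }
  return savedCount;
}
```

In a real setup, `extract` would presumably wrap the browser automation and `models` would hold the actual Mongoose models. Keeping both injectable means the packaging and routing logic can be exercised in isolation.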

Folders and files (73 commits):
data
database
images
restaurants
reviews
users
utils
.DS_Store
.dockerignore
.eslintrc.js
.gitignore
Dockerfile
README.md
_config.yml
babel.config.json
cities.js
cityRestaurants.js
crawlRestaurantsInCities.js
deleteWrongs.js
docker-compose.yml
index.js
install_dependencies.sh
package.json
restaurantsListUrls.js
startBrowserless.sh
tourismUrls.js
yarn.lock

paulsp94/TripadvisorCrawler
