Course Crawler

This crawler successfully aggregated Columbia course data and related information as of Sunday, February 20, 2011. Columbia's data formats are subject to change, and I cannot guarantee that this program will be compatible with future formats.

Questions, comments, and concerns should be directed to Ryan Bubinski: ryanbubinski gmail com.

Binary Dependencies

Ruby >= 1.8.7
MySQL >= 5.0

Installation and Setup

Before beginning, make sure you have Ruby 1.8.7 or later and MySQL 5.0 or later installed.

Copy config.yaml.default to config.yaml
Complete the fields in config.yaml
Run gem install bundler
Run bundle install
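
Since a half-filled config.yaml is an easy mistake to make, here is a minimal sketch of how a script could check the file before crawling. The key names (host, username, password, database) and the load_config helper are assumptions for illustration; the real field names are whatever config.yaml.default lists.

```ruby
require 'yaml'

# Hypothetical keys; check config.yaml.default for the real field names.
REQUIRED_KEYS = %w[host username password database]

# Load the YAML config and fail early if a required field is missing or blank.
def load_config(path = 'config.yaml')
  config = YAML.load_file(path)
  raise ArgumentError, "#{path} is empty or not a mapping" unless config.is_a?(Hash)

  missing = REQUIRED_KEYS.select { |key| config[key].to_s.strip.empty? }
  raise ArgumentError, "#{path} is missing: #{missing.join(', ')}" unless missing.empty?

  config
end
```

Calling load_config with no arguments reads config.yaml from the current directory and raises an ArgumentError that names any blank fields.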

Crawling

Once you've set up the application, run the app.rb file in the root directory to begin the crawling process.

HTTP requests are made in parallel using the Typhoeus gem.
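
For readers unfamiliar with Typhoeus, the following is a rough sketch of parallel fetching with its Hydra queue. It is not the code in app.rb: fetch_all and department_urls are hypothetical helpers, the URL pattern is made up, and the option names assume a recent Typhoeus release rather than the 2011 version this project was written against.

```ruby
# Fetch several pages concurrently with Typhoeus's Hydra queue.
# Assumes a recent Typhoeus release (gem install typhoeus).
def fetch_all(urls, max_concurrency = 10)
  require 'typhoeus' # loaded lazily so the helpers below work without the gem

  hydra = Typhoeus::Hydra.new(max_concurrency: max_concurrency)
  bodies = {}

  urls.each do |url|
    request = Typhoeus::Request.new(url, followlocation: true)
    # The callback fires as each response arrives, in no guaranteed order.
    request.on_complete do |response|
      bodies[url] = response.success? ? response.body : nil
    end
    hydra.queue(request)
  end

  hydra.run # blocks until every queued request has completed
  bodies
end

# Hypothetical URL pattern, for illustration only.
def department_urls(base, departments)
  departments.map { |dept| "#{base}/#{dept}" }
end
```

The max_concurrency setting caps how many requests are in flight at once, which helps avoid overloading the server being crawled.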

Exporting data

Data is stored in a local database, which can be exported to a text file in SQL format using the command:

rake db:export

The result is written to a file named "data.sql" in the local directory.
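
The Rakefile itself is not reproduced here, so the following is only a guess at the kind of work such an export task might do: shelling out to mysqldump. The helper names and config keys are hypothetical; the mysqldump flags are standard ones.

```ruby
# Build the mysqldump argument list for a database described by a config hash.
# The config keys are hypothetical; the mysqldump flags are standard.
def export_command(config, output = 'data.sql')
  [
    'mysqldump',
    "--host=#{config['host']}",
    "--user=#{config['username']}",
    "--password=#{config['password']}",
    "--result-file=#{output}",
    config['database']
  ]
end

# Passing an array to system avoids shell interpolation of the values.
def export!(config, output = 'data.sql')
  system(*export_command(config, output)) or raise 'mysqldump failed'
end
```

Note that a password passed on the command line can be visible to other users on the same machine, so a MySQL option file may be a better choice on shared hosts.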

adi-archive/Course-Crawler
