This Git repository contains (almost) all of the code samples available on http://rosettacode.org, organized by Language and Task.
All of the data is in this repository, so you can just run:
git clone https://github.com/acmeism/RosettaCodeData
However...
It's a lot of data!
If you just want the latest data, the quickest thing to do is:
git clone https://github.com/acmeism/RosettaCodeData --single-branch --depth=1
This repository's data content is created by a Perl program called rosettacode.
You can install it with this command:
cpanm RosettaCode
You can rebuild the data with:
make build
This repository has a bin directory with various tools for working with the data.
- rcd-api-list-all-langs: List all the programming language names directly from rosettacode.org
- rcd-api-list-all-tasks: List all the programming task names directly from rosettacode.org
- rcd-new-langs: List the RosettaCode languages not yet added to Conf
- rcd-new-tasks: List the RosettaCode tasks not yet added to Conf
- rcd-samples-per-lang: Show the number of code samples per language
- rcd-samples-per-task: Show the number of code samples per task
- rcd-tasks-per-lang: Show the number of tasks with code samples per language
- rcd-langs-per-task: Show the number of languages with code samples per task
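To give a feel for what a per-language sample count involves, here is a minimal Python sketch that walks a local clone. It is not the implementation of rcd-samples-per-lang. The function name samples_per_lang is hypothetical, and the directory layout it relies on (Task/<task name>/<language name>/<sample files>) is an assumption that this README does not confirm, so check it against your clone first.

```python
from collections import Counter
from pathlib import Path


def samples_per_lang(repo_root):
    """Count sample files per language in a local clone.

    Assumption (not confirmed by the README): samples live under
    Task/<task name>/<language name>/<sample files>.
    """
    counts = Counter()
    task_root = Path(repo_root) / "Task"
    if not task_root.is_dir():
        # Layout differs from the assumption, or the path is wrong.
        return counts
    for task_dir in task_root.iterdir():
        if not task_dir.is_dir():
            continue
        for lang_dir in task_dir.iterdir():
            if not lang_dir.is_dir():
                # Skip plain files that sit next to the language directories.
                continue
            # Each regular file directly inside a language directory is one sample.
            n_files = sum(1 for f in lang_dir.iterdir() if f.is_file())
            if n_files:
                counts[lang_dir.name] += n_files
    return counts
```

Calling samples_per_lang("RosettaCodeData").most_common(10) on a clone would then list the ten languages with the most sample files, provided the assumed layout holds.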
Pull requests welcome!
This project is not a perfect representation of RosettaCode yet. It has a few Unicode issues. It also has to deal with various formatting mistakes in the MediaWiki source pages.
- Fix bugs
- Correct the 100s of guessed file extensions in Conf/lang.yaml
- Add the ability to fetch only cache pages changed since the last pushed data update
- Support names with non-ASCII characters
- Add more bin tools
- Address errors reported in rosettacode.log after running make build