This brief document explains how to deploy the Grimoire Open Development Analytics platform for The Document Foundation. These are the main steps included in the following sections:
- Installation of the docker container
- Data retrieval
- Data enrichment
- Publication on a Kibana dashboard customized by Bitergia
The requirements are:
- The host machine should be a GNU/Linux box with at least git, ssh, docker (>= 1.5) and docker-compose (>= 1.5).
- You must have a Gerrit account at gerrit.libreoffice.org with your SSH public key added to it.
- 50 GB of free disk space for the git repositories.
- Chrome/Chromium is the recommended web browser, for performance reasons.
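The requirements above can be checked before going on. The following is only a sketch: the helper names `missing_tools`, `version_ge` and `free_gb` are made up for this example, `version_ge` relies on the `-V` option of GNU sort, and checking `$HOME` for free space is an assumption about where the repositories will end up.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed when version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Free space, in whole gigabytes, on the file system holding directory $1.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Report on this host; nothing is installed or changed.
missing_tools git ssh docker docker-compose
if [ "$(free_gb "$HOME")" -lt 50 ]; then
    echo "less than 50 GB free under $HOME"
fi
```

A version string obtained from docker or docker-compose can then be compared with, for instance, `version_ge 1.10.3 1.5`.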
The installation of the docker container is pretty straightforward:
- Install docker and docker-compose in your host machine
- Create a user with login gelk and add it to the docker group
root@dellx:~# adduser --disabled-password gelk
root@dellx:~# adduser gelk docker
- Log in as the gelk user and check that it belongs to the docker group
gelk@dellx:~$ id
uid=1002(gelk) gid=1002(gelk) grupos=1002(gelk),132(docker)
- Create a directory "/home/gelk/devel"
gelk@dellx:~$ mkdir ~/devel
- Clone GrimoireELK
gelk@dellx:~$ git clone https://github.com/grimoirelab/GrimoireELK.git devel/GrimoireELK
- Create the Elasticsearch data dir for persistence:
gelk@dellx:~$ mkdir -p ~/Docker/data/elasticsearch/
- Start BiDEK docker compose:
gelk@dellx:~$ DATA_DOCKER=~/Docker/data docker-compose -f devel/GrimoireELK/docker/compose/bidek.yml up
Starting compose_redis_1
Starting compose_elasticsearch_1
Starting compose_kibiter_1
Starting compose_kibiter-edit_1
Starting compose_mariadbdata_1
Starting compose_mariadb_1
Starting compose_gelk_1
All the logs from all the containers will be printed in this console, so open a new one to start working with the platform.
Enter the docker gelk container:
gelk@dellx:~$ docker exec -t -i compose_gelk_1 env TERM=xterm /bin/bash
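The containers may need a moment before they answer requests, so it can be useful to wait for Elasticsearch before launching anything. The sketch below is a generic retry helper, not part of GrimoireELK: the name `retry`, the 30 attempts and the 2 second pause are arbitrary choices for this example.

```shell
#!/bin/sh
# Run a command until it succeeds: retry ATTEMPTS DELAY_SECONDS COMMAND...
# Returns 0 on the first success, 1 when every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Sleep only when another attempt is still to come.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Inside the gelk container one could then wait for Elasticsearch with:
#   retry 30 2 curl -sf -o /dev/null http://elasticsearch:9200
```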
Now we will start with the data retrieval. One tip before going on: for testing purposes you can add the "--from-date" parameter to the "p2o.py" command, which lets you download just a slice of the data. For instance, to retrieve the data from January 1st 2016 onwards, include: --from-date "2016-01-01"
The full Bugzilla retrieval shown below takes around 11 hours to finish. It is recommended to run it inside screen or tmux.
bitergia@a27769af72e3:~$ cd GrimoireELK/utils/
bitergia@89a865f18523:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g bugzilla https://bugs.documentfoundation.org > ~/TDF-bugzilla.log 2>&1
To test the process, you can get the bugs since March 2016:
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g bugzilla https://bugs.documentfoundation.org --from-date "2016-03-01"
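Since a mistyped date is easy to miss in a run that takes hours, the value given to "--from-date" can be checked first. This is a small sketch, assuming GNU date (its `-d` option) is available; `is_iso_date` is a name invented for this example.

```shell
#!/bin/sh
# Succeed only when $1 is a real calendar date written as YYYY-MM-DD.
# Uses the -d option of GNU date: the date is parsed and printed back,
# and must come out identical to the input.
is_iso_date() {
    [ "$(date -u -d "$1" +%Y-%m-%d 2>/dev/null)" = "$1" ]
}

from="2016-03-01"
if is_iso_date "$from"; then
    echo "ok: --from-date \"$from\""
else
    echo "not a valid YYYY-MM-DD date: $from"
fi
```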
Once the process is finished you can check the number of bugs downloaded in the log file or by querying Elasticsearch with the following command:
bitergia@a27769af72e3:~/GrimoireELK/utils$ curl -s http://elasticsearch:9200/bugzilla_https:__bugs.documentfoundation.org/_search | python -m json.tool | grep '"total": '
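The index name used in the query follows a visible pattern. The sketch below reproduces it under an assumption drawn only from the examples in this document (backend name, an underscore, then the origin with every "/" replaced by "_"); `raw_index` is a hypothetical helper, and an index set explicitly with "--index" does not follow this pattern.

```shell
#!/bin/sh
# Guess the raw index name for a backend and its origin (URL or host name).
# Assumption taken from the examples in this document, not from the tool's
# documentation: BACKEND_ORIGIN, with "/" in the origin replaced by "_".
raw_index() {
    printf '%s_%s\n' "$1" "$(printf '%s' "$2" | tr '/' '_')"
}

raw_index bugzilla https://bugs.documentfoundation.org
raw_index gerrit gerrit.libreoffice.org
```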
First you need an account on the code review platform at gerrit.libreoffice.org. Ensure your account has a public SSH key and a user name configured. Before copying and pasting the commands below, replace GERRIT_USER with your user name.
The next step is to copy your SSH keys to the docker container. In our case, they are on the host machine (which is 172.17.42.1) under the account "acs".
bitergia@a27769af72e3:~$ scp -r acs@172.17.42.1:.ssh ~/
bitergia@a27769af72e3:~$ eval `ssh-agent -s`
bitergia@a27769af72e3:~$ ssh-add ~/.ssh/id_rsa
At this point, your public key is both copied to the container and set up in gerrit.libreoffice.org.
In order to check that the gerrit user config is working, execute the command (remember to replace GERRIT_USER with your username):
bitergia@a27769af72e3:~/GrimoireELK/utils$ ssh -p 29418 GERRIT_USER@gerrit.libreoffice.org gerrit version
gerrit version 2.11.7
Time to start retrieving code reviews. Use the command below (again, with your Gerrit account):
bitergia@89a865f18523:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g gerrit --user GERRIT_USER --url gerrit.libreoffice.org > ~/TDF-gerrit-all.log 2>&1
Once the process finishes you can check the number of reviews downloaded in the log file and again by querying Elasticsearch with this command:
bitergia@a27769af72e3:~/GrimoireELK/utils$ curl -s http://elasticsearch:9200/gerrit_gerrit.libreoffice.org/_search | python -m json.tool | grep '"total": '
You will need your Gerrit user account configured as explained in the previous Gerrit section. Don't underestimate the space: around 50 GB of free disk space is needed to clone all the git repositories to be analyzed.
Start the Git data retrieval with the following command (remember to replace GERRIT_USER with your Gerrit user name):
bitergia@89a865f18523:~/GrimoireELK/utils$ ssh -p 29418 GERRIT_USER@gerrit.libreoffice.org gerrit ls-projects | awk '{print "./p2o.py -e http://elasticsearch:9200 --index git_TDF -g git git://gerrit.libreoffice.org/"$1}' | sh > ~/TDF-git-all.log 2>&1
Tip: For testing purposes, you may want to get only the commits produced in 2016 with a command like this one:
ssh -p 29418 GERRIT_USER@gerrit.libreoffice.org gerrit ls-projects | awk '{print "./p2o.py -e http://elasticsearch:9200 --index git_TDF -g git git://gerrit.libreoffice.org/"$1" --from-date \"2016-01-01\""}' | sh
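The awk stage in the pipelines above turns each project name into one "p2o.py" command line. To make that step easier to read and to preview, here is an equivalent sketch as a shell function that only prints the commands. `git_commands` is a name chosen for this example, and "core" and "help" merely stand in for the output of "gerrit ls-projects".

```shell
#!/bin/sh
# Read project names from standard input, one per line, and print one
# p2o.py command per project. An optional first argument is passed on
# as --from-date. Nothing is executed here.
git_commands() {
    from_date=${1:-}
    while IFS= read -r project; do
        [ -n "$project" ] || continue
        cmd="./p2o.py -e http://elasticsearch:9200 --index git_TDF -g git git://gerrit.libreoffice.org/$project"
        if [ -n "$from_date" ]; then
            cmd="$cmd --from-date \"$from_date\""
        fi
        printf '%s\n' "$cmd"
    done
}

# "core" and "help" stand in for the output of "gerrit ls-projects".
printf 'core\nhelp\n' | git_commands 2016-01-01
```

Reviewing the printed lines first, and piping them to sh only afterwards, avoids starting a long retrieval with a wrong option.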
Once the process finishes you can check the number of commits downloaded in the log file and also by querying Elasticsearch, with the command:
bitergia@a27769af72e3:~/GrimoireELK/utils$ curl -s http://elasticsearch:9200/git_tdf/_search | python -m json.tool | grep '"total": '
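The grep used in these checks also matches the "total" counter of the shards section of the response. As an alternative, the sketch below prints only the number of hits. It assumes python3 is installed where it runs and that "hits.total" is a plain number in the response; `hits_total` is a hypothetical helper and the sample response is invented for illustration.

```shell
#!/bin/sh
# Read an Elasticsearch _search response from standard input and print
# only hits.total, leaving out the shard counters.
# Assumes python3 is available and hits.total is a plain number.
hits_total() {
    python3 -c 'import json, sys; print(json.load(sys.stdin)["hits"]["total"])'
}

# Sample response, trimmed and invented for illustration.
echo '{"_shards": {"total": 5}, "hits": {"total": 42, "hits": []}}' | hits_total
```

With a live instance the input would come from curl, for example `curl -s http://elasticsearch:9200/git_tdf/_search | hits_total`.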
The data sources are updated with the same commands shown above. We recommend changing the output log file names on each run so you can analyze the logs later. A first approach to periodic updates is to schedule them with the Linux crontab. Future versions of the product will include a scheduler.
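One way to keep a separate log per update is to put a date stamp in the file name. The following is a sketch only: `log_path` is a hypothetical helper, the "TDF-SOURCE-DATE.log" naming simply extends the log names used above, and the crontab line in the comments assumes a cron daemon is available where the commands run.

```shell
#!/bin/sh
# Build a log file path that carries a date stamp, so that each update
# run keeps its own log: log_path SOURCE [YYYYMMDD]
# Without a second argument, today's date is used.
log_path() {
    stamp=${2:-$(date +%Y%m%d)}
    printf '%s/TDF-%s-%s.log\n' "$HOME" "$1" "$stamp"
}

log_path bugzilla 20160301

# Hypothetical crontab entry (02:30 is an arbitrary time; note that "%"
# has to be written as "\%" inside a crontab line):
#   30 2 * * * cd ~/GrimoireELK/utils && ./p2o.py -e http://elasticsearch:9200 -g bugzilla https://bugs.documentfoundation.org > ~/TDF-bugzilla-$(date +\%Y\%m\%d).log 2>&1
```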
Before showing the data on the Kibana dashboards, the raw index created above should be enriched.
The commands to enrich bugzilla and gerrit are the same but with the option "--enrich_only". For git the command is also the same, but only applied to the global "git_TDF" index.
bitergia@89a865f18523:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g --enrich_only bugzilla https://bugs.documentfoundation.org > ~/TDF-bugzilla-enrich.log 2>&1
bitergia@89a865f18523:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g --enrich_only gerrit --user GERRIT_USER --url gerrit.libreoffice.org > ~/TDF-gerrit-all-enrich.log 2>&1
bitergia@89a865f18523:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 -g --enrich_only --index git_TDF git '' > ~/TDF-git-all-enrich.log 2>&1
Each data source has its own dashboard template. To import them into Kibana, execute the following commands:
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./kidash.py -e http://elasticsearch:9200 --import ../dashboards/git-activity.json
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./kidash.py -e http://elasticsearch:9200 --import ../dashboards/gerrit-activity.json
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./kidash.py -e http://elasticsearch:9200 --import ../dashboards/bugzilla-testing.json
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./kidash.py -e http://elasticsearch:9200 --list
http://elasticsearch:9200/.kibana/dashboard/_search?size=10000
Git-Activity
Gerrit-Activity
BugsMozilla
We are getting closer. Now we will use "e2k.py" to let Kibana know about the data:
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./e2k.py -e http://elasticsearch:9200 -i git_tdf_enrich -d Git-Activity
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./e2k.py -e http://elasticsearch:9200 -i gerrit_gerrit.libreoffice.org_enrich -d Gerrit-Activity
bitergia@a27769af72e3:~/GrimoireELK/utils$ ./e2k.py -e http://elasticsearch:9200 -i bugzilla_https:__bugs.documentfoundation.org_enrich -d BugsMozilla
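The three commands above share one shape: each raw index gets the "_enrich" suffix and is paired with a dashboard. The sketch below prints them from a small table, which may help when more data sources are added. `e2k_commands` is a name made up for this example, and the "_enrich" suffix is read off the commands above rather than taken from the tool's documentation.

```shell
#!/bin/sh
# Print the e2k.py command for each "raw index, dashboard" pair read from
# standard input. Assumption taken from the commands above: the enriched
# index is the raw index name followed by "_enrich". Nothing is executed.
e2k_commands() {
    while read -r index dashboard; do
        [ -n "$index" ] || continue
        printf './e2k.py -e http://elasticsearch:9200 -i %s_enrich -d %s\n' \
            "$index" "$dashboard"
    done
}

e2k_commands <<'EOF'
git_tdf Git-Activity
gerrit_gerrit.libreoffice.org Gerrit-Activity
bugzilla_https:__bugs.documentfoundation.org BugsMozilla
EOF
```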
Now the information is almost ready. We only need to set up a default index for Kibana.
- Visit this link: http://localhost:5601/app/kibana#/dashboard/BugsMozilla__bugzilla_https:__bugs.documentfoundation.org_enrich
- Click on the left on the Index named gerrit_gerrit.libreoffice.org_enrich (you could use any of the three we just created)
- Click on the green star to make it the default index
- And now we are ready to start playing with Kibana.
Tip: The first time you open a dashboard, pay attention to the time frame displayed. Sometimes it is set to the last 15 minutes, in which case there may be nothing to show.
Enjoy the dashboards:
- http://localhost:5601/app/kibana#/dashboard/BugsMozilla__bugzilla_https:__bugs.documentfoundation.org_enrich
- http://localhost:5601/app/kibana#/dashboard/Gerrit-Activity__gerrit_gerrit.libreoffice.org_enrich
- http://localhost:5601/app/kibana#/dashboard/Git-Activity__git_tdf_enrich
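The three links share a pattern: the dashboard name, two underscores, and the enriched index name. As a sketch, a helper like the one below rebuilds them. `dashboard_url` is a hypothetical name, the pattern is inferred only from the links above, and the base URL defaults to the localhost address used in this document.

```shell
#!/bin/sh
# Build a dashboard URL: dashboard_url DASHBOARD ENRICHED_INDEX [BASE_URL]
# The "DASHBOARD__INDEX" pattern is read off the three links above.
dashboard_url() {
    base=${3:-http://localhost:5601}
    printf '%s/app/kibana#/dashboard/%s__%s\n' "$base" "$1" "$2"
}

dashboard_url Git-Activity git_tdf_enrich
dashboard_url Gerrit-Activity gerrit_gerrit.libreoffice.org_enrich
```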
The different dashboards can also be linked from a Kibana instance. To get that, execute the command below, which adds the links:
bitergia@a27769af72e3:~/GrimoireELK/utils$ curl -XPOST "http://elasticsearch:9200/.kibana/metadashboard/main" -d'
{
"Bugzilla":"BugsMozilla__bugzilla_https:__bugs.documentfoundation.org_enrich",
"Git":"Git-Activity__git_tdf_enrich",
"Gerrit":"Gerrit-Activity__gerrit_gerrit.libreoffice.org_enrich"
}'
and now visit http://localhost:5602/ and click on the links above.