Repository synchronization
OpenGrok does not synchronize repositories by itself, but it ships with a Python script, opengrok-mirror, that makes synchronization easy.
The script synchronizes the repositories of projects by running the appropriate commands (e.g. git pull for Git). While it runs perfectly well standalone, it is meant to be run from within opengrok-sync (see above).
The script accepts the configuration either in JSON or YAML.
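As a rough illustration of the JSON form, the sketch below builds a small configuration as a Python dictionary and serializes it. The keys are taken from the YAML example on this page; the particular subset chosen here is only an assumption for demonstration.

```python
import json

# Hypothetical subset of a mirror configuration, using keys from this page.
config = {
    "hookdir": "/tmp/hooks",
    "logdir": "/tmp/logs",
    "command_timeout": 300,
    "projects": {
        "userland": {"proxy": True, "hook_timeout": 3600},
        "opengrok-stable": {"disabled": True},
    },
}

# The JSON text carries the same structure as the YAML form.
json_text = json.dumps(config, indent=2)
print(json_text)
```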
The script assumes that OpenGrok is set up with projects (i.e. the indexer is run with the -P option).
When run in batch mode, the script logs the output to a separate file for each project and rotates the logs.
It can be used within the opengrok-sync script; see https://github.com/OpenGrok/OpenGrok/wiki/Per-project-management-and-workflow for more details.
In YAML, the configuration file can look like this, for example:
#
# Commands (or paths - for specific repository types only)
#
commands:
  hg: /usr/bin/hg
  svn: /usr/bin/svn
  teamware: /ontools/onnv-tools-i386/teamware/bin
#
# The proxy environment variables will be set for a project's repositories
# if the 'proxy' property is True.
#
proxy:
  http_proxy: proxy.example.com:80
  https_proxy: proxy.example.com:80
  ftp_proxy: proxy.example.com:80
  no_proxy: example.com,foo.example.com
hookdir: /tmp/hooks
# per-project hooks relative to 'hookdir' above
logdir: /tmp/logs
command_timeout: 300
hook_timeout: 1200
# as if opengrok-mirror was run with -I
incoming_check: true
#
# Per project configuration.
#
projects:
  http:
    proxy: true
  opengrok-stable:
    disabled: true
  foo:
    # override the incoming check for this project
    incoming_check: false
  userland:
    proxy: true
    hook_timeout: 3600
    hooks:
      pre: userland-pre.ksh
      post: userland-post.ksh
  opengrok-master:
    ignored_repos:
      - testdata/repositories/*
  jdk.*:
    proxy: true
    hooks:
      post: jdk_post.sh
  dpdk-next-net:
    strip_outgoing: true
  special:
    ignore: true
In the above config, the userland project will be run with the environment variables from the proxy section set, and the scripts specified in its hooks section will be run before and after all its repositories are synchronized. The hook scripts are run with the current working directory set to that of the project.
The opengrok-master project contains an RCS repository that would make the mirroring fail (since opengrok-mirror does not support RCS yet), so it is listed in ignored_repos.
Repository commands use an extended syntax. In general, the tools use two commands:
- incoming check
- repository synchronization
The tools implement the logic for these tasks internally, using the basic repository commands. The repository commands can be overridden with:
commands:
  git: /usr/local/bin/git
  hg: /usr/bin/hg
  svn: /usr/bin/svn
  # Note: unlike other repository types, Teamware needs a path to the binaries, i.e. directory.
  teamware: /ontools/onnv-tools-i386/teamware/bin
When this basic configuration is not enough, you can override the logic by providing a custom command for each task:
commands:
  git:
    incoming: ['/bin/echo', 'some new changes!']
    sync: ['git', 'pull']
If you override only one of the commands, the tools will use the default internal logic for the other one. For the special case where you want to override one of the commands while the default routine for the other uses a different repository command, use the following syntax:
commands:
  git:
    # override repository command
    command: /my/custom/git
    # override incoming check with custom command (/my/custom/git is not called for incoming check)
    incoming: ['/bin/echo', 'some new changes!']
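The three shapes of a commands entry shown above (a plain path, per-task overrides, and a mix of both) could be normalized along these lines. This is a minimal sketch, not the tools' actual code: the function name is hypothetical, and using None to mean "fall back to the default internal logic" is an assumption of this example.

```python
def normalize_command_entry(entry):
    """Return a dict with 'command', 'incoming' and 'sync' keys.

    None means "use the tool's default" (an assumption of this sketch).
    """
    if isinstance(entry, str):
        # Simple form: only the repository command is overridden.
        return {"command": entry, "incoming": None, "sync": None}
    # Extended form: any of the three keys may be present.
    return {
        "command": entry.get("command"),
        "incoming": entry.get("incoming"),
        "sync": entry.get("sync"),
    }


simple = normalize_command_entry("/usr/local/bin/git")
mixed = normalize_command_entry(
    {"command": "/my/custom/git", "incoming": ["/bin/echo", "some new changes!"]}
)
print(simple)
print(mixed)
```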
The sync command is run in the repository directory as the cwd and is expected to return:
- 0 - for successful synchronization
- non-zero status - for failed synchronization (with possible error output)
The incoming command is run in the repository directory as the cwd and is expected to return:
- 0 - for successful incoming check, and
  - empty stdout for no incoming changes
  - non-empty stdout for incoming changes
- non-zero status - for failed incoming check (with possible error output)
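The two contracts above can be sketched with the standard subprocess module. The function names are hypothetical, and treating whitespace-only output as "empty stdout" is an assumption; the usage lines substitute the Python interpreter for a real repository command so that the example runs anywhere.

```python
import subprocess
import sys


def run_sync(command, repo_dir):
    """Return True if the sync command exits with status 0."""
    result = subprocess.run(command, cwd=repo_dir, capture_output=True, text=True)
    return result.returncode == 0


def run_incoming(command, repo_dir):
    """Return True if there are incoming changes; raise if the check fails."""
    result = subprocess.run(command, cwd=repo_dir, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("incoming check failed: " + result.stderr.strip())
    # Assumption: output consisting only of whitespace counts as empty.
    return len(result.stdout.strip()) > 0


# Stand-in commands (the Python interpreter instead of a real SCM binary).
has_changes = run_incoming([sys.executable, "-c", "print('some new changes!')"], ".")
no_changes = run_incoming([sys.executable, "-c", "pass"], ".")
sync_ok = run_sync([sys.executable, "-c", "raise SystemExit(1)"], ".")
print(has_changes, no_changes, sync_ok)
```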
Just like opengrok-sync, opengrok-mirror also queries the web application for various properties, so if the web application is not listening on the default host/port, its URI has to be specified using the -U option.
Multiple projects can share the same configuration using regular expressions, as demonstrated by the jdk.* pattern in the above configuration. The patterns are matched from the top to the bottom of the configuration file; the first match wins.
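The first-match-wins lookup can be sketched as follows. The function name is hypothetical, and the use of re.fullmatch is an assumption: the text above does not say whether a pattern has to match the whole project name.

```python
import re


def match_project(project_name, projects_config):
    """Return (pattern, settings) for the first matching pattern."""
    # Dictionaries keep insertion order, which stands in for file order here.
    for pattern, settings in projects_config.items():
        # Assumption: a pattern must match the entire project name.
        if re.fullmatch(pattern, project_name):
            return pattern, settings
    return None, {}


projects = {
    "jdk.*": {"proxy": True},
    "jdk8": {"proxy": False},  # never reached: the earlier pattern wins
    "userland": {"hook_timeout": 3600},
}

matched_pattern, matched_settings = match_project("jdk8", projects)
print(matched_pattern, matched_settings)
```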
The opengrok-stable project is marked as disabled. This means that the opengrok-mirror script will exit with the special value of 2, which the opengrok-sync script interprets as a signal to skip the reindex. It is not treated as an error.
Some repositories under a project are not meant to be synchronized (e.g. the remote does not exist anymore, or it is a repository used by the project's tests). opengrok-mirror can ignore them if you list them in ignored_repos. This is a list of paths relative to the matched project (see the project matching described above); it supports filename glob expansion (see the example).
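Glob matching of repository paths against such a list can be illustrated with the standard fnmatch module. This is only a sketch under the assumption that shell-style globbing is meant; the helper name and the sample repository paths are made up for the example.

```python
from fnmatch import fnmatch


def is_ignored(repo_path, ignored_repos):
    """Return True if the project-relative path matches any glob in the list."""
    return any(fnmatch(repo_path, pattern) for pattern in ignored_repos)


ignored = ["testdata/repositories/*"]
ignored_hit = is_ignored("testdata/repositories/rcs_test", ignored)
ignored_miss = is_ignored("opengrok-indexer", ignored)
print(ignored_hit, ignored_miss)
```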
opengrok-mirror returns distinct codes that are interpreted by opengrok-sync. When a repository fails to sync, e.g. because there are uncommitted changes, opengrok-mirror returns 1, which signifies an error, and opengrok-sync terminates the execution. To make it always return 0, set the ignore_errors configuration property, either per project or at the global configuration level. This setting is handy when using opengrok-sync with a project under development, where uncommitted files are a common occurrence.
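The return codes described above can be summarized in a small sketch. The constant and function names are hypothetical, and the action labels are only shorthand for what opengrok-sync is described as doing.

```python
# Exit codes as described on this page (the names are hypothetical).
SUCCESS, FAILURE, SKIP = 0, 1, 2


def mirror_exit_code(repo_results, ignore_errors=False):
    """Combine per-repository results (True = synced) into one exit code."""
    if all(repo_results) or ignore_errors:
        return SUCCESS
    return FAILURE


def sync_action(exit_code):
    """Describe how the caller reacts to a mirror exit code."""
    if exit_code == SUCCESS:
        return "continue"
    if exit_code == SKIP:
        return "skip reindex"
    return "terminate"


failed = mirror_exit_code([True, False])
tolerated = mirror_exit_code([True, False], ignore_errors=True)
print(sync_action(failed), sync_action(tolerated), sync_action(SKIP))
```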
Sometimes, running opengrok-mirror on a project is undesirable. In that case, set the project property ignore to true; opengrok-mirror will then skip the project and return success.
In batch mode, log messages are written to a log file under the logdir directory specified in the configuration. The log is rotated on each run, up to the default count (8) or the count specified using the --backupcount option.
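One way to get this kind of per-run rotation in Python is logging.handlers.RotatingFileHandler with size-based rollover disabled and an explicit rollover at the start of each run. The sketch below illustrates that technique and is not necessarily how opengrok-mirror implements it; the function name and the project-named .log file are assumptions.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def open_run_log(logdir, project, backupcount=8):
    """Rotate the previous log of the project (if any) and return a handler."""
    path = os.path.join(logdir, project + ".log")
    had_previous_log = os.path.isfile(path)
    # maxBytes=0 disables size-based rollover; delay=True defers file creation.
    handler = RotatingFileHandler(path, maxBytes=0, backupCount=backupcount, delay=True)
    if had_previous_log:
        handler.doRollover()
    return handler


# Simulate 12 runs; only the current log and 8 backups should remain.
with tempfile.TemporaryDirectory() as logdir:
    logger = logging.getLogger("mirror-sketch")
    for run in range(12):
        handler = open_run_log(logdir, "userland")
        logger.addHandler(handler)
        logger.warning("run %d", run)
        logger.removeHandler(handler)
        handler.close()
    log_files = sorted(os.listdir(logdir))

print(log_files)
```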
If pre and post mirroring hooks are specified, they are run before and after project synchronization. If any of the hooks fails, the program terminates immediately. However, if the synchronization (which runs between the hook scripts) fails, the post hook is executed anyway. This is done so that the project is left in a sane state: the post hook is usually used to extract source archives and apply patches. If the pre hook cleans up the extracted work and the synchronization then fails, skipping the post hook would leave the project bare.
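The ordering rules for hooks can be sketched with plain callables. This is a simplified model, assuming each step reports success as a boolean; the function name is hypothetical.

```python
def mirror_project(pre_hook, sync, post_hook):
    """Run pre hook, synchronization and post hook; return overall success.

    Each argument is a callable returning True on success (hooks may be None).
    """
    if pre_hook is not None and not pre_hook():
        return False  # a failed pre hook terminates immediately
    sync_ok = sync()
    # The post hook runs even if the synchronization failed.
    post_ok = post_hook() if post_hook is not None else True
    return sync_ok and post_ok


calls = []


def step(name, ok):
    """Build a stand-in step that records its name and returns a fixed result."""
    def run():
        calls.append(name)
        return ok
    return run


result = mirror_project(step("pre", True), step("sync", False), step("post", True))
print(result, calls)
```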
Both repository synchronization commands and hooks can have a timeout. By default there is no timeout, unless one is specified in the configuration file. There are global and per-project timeouts, the latter overriding the former. For instance, in the above configuration file, the userland project overrides the global hook timeout with 1 hour while inheriting the command timeout.
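The override rule for timeouts amounts to a two-level lookup. A minimal sketch, with a hypothetical helper name and None standing for "no timeout":

```python
def effective_timeout(name, global_config, project_config):
    """Per-project value if set, else the global one, else None (no timeout)."""
    return project_config.get(name, global_config.get(name))


global_config = {"command_timeout": 300, "hook_timeout": 1200}
userland = {"proxy": True, "hook_timeout": 3600}

hook_timeout = effective_timeout("hook_timeout", global_config, userland)
command_timeout = effective_timeout("command_timeout", global_config, userland)
print(hook_timeout, command_timeout)
```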
opengrok-mirror can be run with the -I option to check whether there are any incoming changes from the parent repository.
If there are no incoming changes, opengrok-mirror exits with a return code of 2. The opengrok-sync program interprets this code in a special way: it skips subsequent processing for the given project, which avoids running the indexer unnecessarily.
The incoming_check configuration property has the same effect as the -I option and can be used to override the check. It can be set at the global and per-project level.
There is a special case: if the project being mirrored has not been indexed yet, the incoming check is overridden. This is useful when adding a new project and running opengrok-sync with opengrok-mirror -I in its configuration.
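The decision of whether to perform the incoming check can be sketched as below. The function name is hypothetical, and how the -I command line option combines with the configuration property is not covered here; only the configuration values and the not-yet-indexed special case are modeled.

```python
def should_check_incoming(global_config, project_config, already_indexed):
    """Decide whether the incoming check applies to a project."""
    enabled = project_config.get(
        "incoming_check", global_config.get("incoming_check", False)
    )
    # A project that has not been indexed yet is always synchronized.
    return bool(enabled) and already_indexed


global_config = {"incoming_check": True}
check_foo = should_check_incoming(global_config, {"incoming_check": False}, True)
check_http = should_check_incoming(global_config, {"proxy": True}, True)
check_new = should_check_incoming(global_config, {}, False)
print(check_foo, check_http, check_new)
```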
opengrok-mirror can be run with the --strip-outgoing option to check whether there are any outgoing changesets in the repositories of the given project(s) and to strip them before the repositories are synchronized. If such changes are found and stripped, the project data (not the source code) is deleted so that the project can be reindexed from scratch.
This is handy when synchronizing a repository whose history is often rewritten.
The strip_outgoing configuration property can be used to override this. It can be set at the global and per-project level.
It is possible to configure a command to be executed for disabled projects. As with opengrok-sync, this supports both RESTful API calls and command execution. This makes it possible, for instance, to tag the disabled projects with Messages so that they are annotated in the UI (set the duration to be less than the mirroring/syncing period to avoid duplicate messages).
The disabled command is configured globally and varies per project thanks to pattern substitution/append.
Any failures in disabled command processing are logged and do not change the overall result of the mirroring command.
Command examples:
disabled_command:
  call:
    uri: '%URL%/api/v1/messages'
    method: POST
    data:
      messageLevel: warning
      duration: PT1H
      tags: ['%PROJECT%']
      text: resync + reindex in progress
projects:
  foo:
    disabled: true
  bar:
    disabled: true
    disabled-reason: "bar is not active anymore"
With the above config, a Message will be sent to the OpenGrok web application and will in turn be visible in the user interface for the particular project. For project foo, a simple disabled project message will appear. For the bar project, the message disabled project: bar is not active anymore will appear. For more documentation on the RESTful API of the web application, see https://github.com/oracle/opengrok/wiki/Web-services
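The %URL% and %PROJECT% placeholders in the call definition suggest a recursive substitution over the configured structure. The sketch below shows one way to do that; the function name and the web application URL are assumptions, and the actual substitution rules of the tools may differ.

```python
def substitute(value, url, project):
    """Replace %URL% and %PROJECT% in strings nested inside lists and dicts."""
    if isinstance(value, str):
        return value.replace("%URL%", url).replace("%PROJECT%", project)
    if isinstance(value, list):
        return [substitute(item, url, project) for item in value]
    if isinstance(value, dict):
        return {key: substitute(item, url, project) for key, item in value.items()}
    return value


call = {
    "uri": "%URL%/api/v1/messages",
    "method": "POST",
    "data": {"messageLevel": "warning", "duration": "PT1H", "tags": ["%PROJECT%"]},
}

# Hypothetical web application location.
resolved = substitute(call, "http://localhost:8080/source", "foo")
print(resolved)
```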
disabled_command:
  command: [cat]
projects:
  foo:
    disabled: true