klinker

Installation

Clone the repo and change into the directory:

git clone https://github.com/dobraczka/klinker.git
cd klinker

For GPU usage, create a micromamba environment:

micromamba env create -n klinker-conda --file=klinker-conda.yaml

Activate it and install the remaining dependencies:

micromamba activate klinker-conda
pip install -e .

Alternatively, if you don't intend to use a GPU, you can install it in a virtual environment:

python -m venv klinker-env
source klinker-env/bin/activate
pip install -e ".[all]"

or via poetry:

poetry install
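
Whichever route you choose, a quick sanity check can confirm that the package is importable from the active environment. The following is a minimal sketch using only the standard library; the helper `is_installed` is hypothetical, and `torch` is only an assumed example of a GPU-related dependency, not something this README lists.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found on the current import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # "torch" is an assumed example; replace it with whatever you expect to be present.
    for name in ("klinker", "torch"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If `klinker` is reported as missing, the environment you activated is probably not the one you installed into.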

Usage

Load a dataset:

from sylloge import MovieGraphBenchmark
from klinker.data import KlinkerDataset

ds = KlinkerDataset.from_sylloge(MovieGraphBenchmark(graph_pair="tmdb-tvdb"))

Create blocks and write to parquet:

from klinker.blockers import SimpleRelationalTokenBlocker

blocker = SimpleRelationalTokenBlocker()
blocks = blocker.assign(left=ds.left, right=ds.right, left_rel=ds.left_rel, right_rel=ds.right_rel)
blocks.to_parquet("tmdb-tvdb-tokenblocked")
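
For intuition about what token blocking does, here is a minimal plain-Python sketch that groups entities from two sources by the tokens they share. It is not klinker's implementation: the functions `tokenize` and `token_blocks` and the sample records are hypothetical, and the sketch ignores the relation tables (`left_rel`, `right_rel`) that the blocker above receives.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return set(text.lower().split())


def token_blocks(left, right):
    """Map each token that occurs in both sources to the entity ids containing it."""
    left_index = defaultdict(set)
    right_index = defaultdict(set)
    for entity_id, text in left.items():
        for token in tokenize(text):
            left_index[token].add(entity_id)
    for entity_id, text in right.items():
        for token in tokenize(text):
            right_index[token].add(entity_id)
    # A token only forms a block if it appears on both sides.
    shared = left_index.keys() & right_index.keys()
    return {token: (left_index[token], right_index[token]) for token in shared}


# Hypothetical toy data, not taken from any benchmark.
left = {"l1": "The Matrix", "l2": "Blade Runner"}
right = {"r1": "Matrix Reloaded", "r2": "Runner Runner"}
toy_blocks = token_blocks(left, right)
```

Each block pairs a set of left ids with a set of right ids, so only entities within the same block need to be compared later.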

Read blocks from parquet and evaluate:

from klinker import KlinkerBlockManager
from klinker.eval_metrics import Evaluation

kbm = KlinkerBlockManager.read_parquet("tmdb-tvdb-tokenblocked")
ev = Evaluation.from_dataset(blocks=kbm, dataset=ds)
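
Blocking quality is commonly summarized by recall (the share of true matches that survive blocking), precision (the share of candidate pairs that are true matches) and the reduction ratio (how many comparisons blocking saves). The sketch below computes these on toy data. It is an illustration under assumptions, not the API of `Evaluation`: the functions `candidate_pairs` and `evaluate`, the block layout and the gold pairs are all hypothetical.

```python
from itertools import product


def candidate_pairs(blocks):
    """Collect all (left_id, right_id) pairs that share at least one block."""
    pairs = set()
    for left_ids, right_ids in blocks.values():
        pairs.update(product(left_ids, right_ids))
    return pairs


def evaluate(blocks, gold, n_left, n_right):
    """Compute recall, precision and reduction ratio for a blocking result."""
    candidates = candidate_pairs(blocks)
    true_positives = len(candidates & gold)
    return {
        "recall": true_positives / len(gold),
        "precision": true_positives / len(candidates) if candidates else 0.0,
        # Fraction of the full cross product that no longer needs comparing.
        "reduction_ratio": 1 - len(candidates) / (n_left * n_right),
    }


# Hypothetical toy blocks and gold standard.
toy_blocks = {"matrix": ({"l1"}, {"r1"}), "runner": ({"l2"}, {"r2"})}
gold = {("l1", "r1"), ("l2", "r3")}
metrics = evaluate(toy_blocks, gold, n_left=2, n_right=3)
```

Recall and reduction ratio pull in opposite directions: larger blocks find more matches but leave more pairs to compare.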

Reproduce Experiments

The experiment.py script provides commands for datasets and blockers. Run python experiment.py --help to list the available commands. Subcommands also offer help, e.g. python experiment.py gcn-blocker --help.

You have to use a dataset command before a blocker command.

For example, if you used micromamba for installation:

micromamba run -n klinker-conda python experiment.py movie-graph-benchmark-dataset --graph-pair "tmdb-tvdb" relational-token-blocker

This is similar to the steps described in the Usage section above.
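
The ordering rule (dataset command first, blocker command second) can be pictured as a small pipeline in which the first step produces the data that the second consumes. The sketch below only illustrates that constraint and does not reflect how experiment.py is actually implemented: the functions `split_commands` and `validate_order` are hypothetical, and the command sets contain only the names mentioned in this README.

```python
# Command names taken from this README; the real script may offer more.
DATASET_COMMANDS = {"movie-graph-benchmark-dataset"}
BLOCKER_COMMANDS = {"relational-token-blocker", "gcn-blocker"}


def split_commands(argv):
    """Split a flat argument list into (command, options) steps."""
    steps = []
    for arg in argv:
        if arg in DATASET_COMMANDS | BLOCKER_COMMANDS:
            steps.append((arg, []))
        elif steps:
            # Anything that is not a command belongs to the preceding command.
            steps[-1][1].append(arg)
        else:
            raise ValueError(f"unexpected argument before any command: {arg}")
    return steps


def validate_order(steps):
    """Require exactly one dataset command followed by one blocker command."""
    names = [name for name, _ in steps]
    if (
        len(names) != 2
        or names[0] not in DATASET_COMMANDS
        or names[1] not in BLOCKER_COMMANDS
    ):
        raise ValueError(
            "expected: <dataset command> [options] <blocker command> [options]"
        )
    return steps


args = ["movie-graph-benchmark-dataset", "--graph-pair", "tmdb-tvdb",
        "relational-token-blocker"]
steps = validate_order(split_commands(args))
```

Swapping the two commands in `args` makes `validate_order` raise a `ValueError`, mirroring the rule that the dataset must be selected before a blocker can run on it.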

To reproduce the results from the paper precisely, we provide run scripts adapted from our SLURM batch scripts in the run_scripts folder. Please consult run_scripts/README.md for further information. For archival purposes, the experiment artifacts and the source code are stored on Zenodo.
