kutkopy/versteisch-bahnhof

versteisch-bahnhof

Find the full description of the hands-on task here: https://tiny.cc/versteisch-bahnhof

versteisch-bahnhof is a Swiss German dialect predictor using TF-IDF vector representations and a Random Forest classifier.

The evaluation is based on a publicly available Swiss German Kaggle competition. The dataset covers four dialects:

BE Bernese
LU Lucerne
ZH Zurich
BS Basel

The training set consists of 15573 example sentences, whereas the test set consists of 2499 example sentences.
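
The TF-IDF plus Random Forest approach described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code of this repository: the use of scikit-learn, the character n-gram settings, the number of trees, and the toy sentences are all assumptions made for the example. The sentences are invented placeholders, not samples from the Kaggle dataset.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Invented placeholder sentences (NOT from the Kaggle dataset and not
# verified dialect samples); two per dialect label.
sentences = [
    "i gange hüt uf bärn",
    "mir gö morn i d stadt",
    "ech gange hött uf lozärn",
    "mer gönd morn id stadt",
    "ich gang hüt uf züri",
    "mir gönd morn i d stadt",
    "ych gang hit uf basel",
    "mir gehn morn in d stadt",
]
labels = ["BE", "BE", "LU", "LU", "ZH", "ZH", "BS", "BS"]

# Assumption: character n-grams (1 to 3 characters, within word boundaries)
# as TF-IDF features. The repository may use different settings.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# Fit the vectorizer and the classifier in one step.
model.fit(sentences, labels)

# Predict the dialect label of a sentence.
prediction = model.predict(["ich gang hüt uf züri"])
print(prediction[0])
```

Wrapping the vectorizer and the classifier in one pipeline keeps the TF-IDF vocabulary tied to the training data, so the same transformation is applied automatically at prediction time.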

Requirements

Python 3 is required.

First, install pipenv using pip:

pip install --user pipenv

Installation

To install all dependencies into a dedicated virtual environment:

pipenv install

Next, you can import the created virtual environment into your preferred IDE and activate it in your shell:

pipenv shell

Usage

You can train the model with either train_dialect (fixed parameter setting) or train_dialect_hyperparameter (grid search over different parameter settings). In both cases, the best parameters are logged to the console.
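
To illustrate what a grid search over parameter settings looks like, here is a minimal sketch using scikit-learn's GridSearchCV. It is not the implementation of train_dialect_hyperparameter: the parameter grid, the step names `tfidf` and `clf`, the two-fold cross-validation, and the toy sentences are assumptions chosen to keep the example small. The sentences are invented placeholders, not samples from the Kaggle dataset.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Invented placeholder sentences (NOT from the Kaggle dataset);
# two per dialect label so that 2-fold stratified CV is possible.
sentences = [
    "i gange hüt uf bärn",
    "mir gö morn i d stadt",
    "ech gange hött uf lozärn",
    "mer gönd morn id stadt",
    "ich gang hüt uf züri",
    "mir gönd morn i d stadt",
    "ych gang hit uf basel",
    "mir gehn morn in d stadt",
]
labels = ["BE", "BE", "LU", "LU", "ZH", "ZH", "BS", "BS"]

# Hypothetical step names; parameters are addressed as "<step>__<param>".
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char_wb")),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Assumed grid for illustration only; the real grid is not documented here.
param_grid = {
    "tfidf__ngram_range": [(1, 2), (1, 3)],
    "clf__n_estimators": [10, 50],
}

# Evaluate every parameter combination with 2-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(sentences, labels)

# Log the best parameter combination to the console.
print("Best parameters:", search.best_params_)
```

With a realistic dataset, a larger number of folds and a wider grid would normally be used; the tiny values here only keep the sketch fast to run.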
