
kota7/mecabwrap-py


mecabwrap


mecabwrap is yet another Python interface to the MeCab morphological analyzer.

Its goal is to provide intuitive APIs that work seamlessly on both Unix and Windows machines.

Requirements

  • Python 2.7+ or 3.4+ (may also work on older versions)
  • MeCab 0.996

Installation

1. Install MeCab

Ubuntu

$ sudo apt-get install mecab libmecab-dev mecab-ipadic-utf8

macOS

$ brew install mecab mecab-ipadic

Windows

Download and run the installer.

See also: official website

2. Install this Package

Install from PyPI

$ pip install mecabwrap

or, from GitHub

$ git clone --depth 1 https://github.com/kota7/mecabwrap-py.git
$ cd mecabwrap-py
$ pip install -U .

Quick Check

The following command should print the MeCab version. If it does not, either MeCab is not installed or it is not on the search path.

$ mecab -v
# should print `mecab of 0.996` or similar.
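
The same check can be sketched from Python. This is a minimal illustration, not part of mecabwrap: the helper name `mecab_version` is hypothetical, and it assumes Python 3.5+ (for `shutil.which` and `subprocess.run`) and that the executable is named `mecab`.

```python
import shutil
import subprocess


def mecab_version(executable="mecab"):
    """Return the output of `<executable> -v`, or None if it is not on the search path.

    Hypothetical helper for illustration; mecabwrap does not provide it.
    """
    path = shutil.which(executable)
    if path is None:
        # The executable could not be found on the search path.
        return None
    result = subprocess.run(
        [path, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = mecab_version()
    if version is None:
        print("mecab was not found on the search path")
    else:
        print(version)
```

If this reports that `mecab` was not found, revisit step 1 of the installation or adjust the search path before testing the package.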

To verify that the package is successfully installed, try the following:

$ python
>>> from mecabwrap import tokenize, print_token
>>> for token in tokenize(u"すもももももももものうち"):
...     print_token(token)
...
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
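
Each output line pairs a surface form with a comma-separated feature string. As a rough sketch of how to read such a line, the snippet below splits it into named fields. It assumes Python 3, the IPADIC dictionary installed above (nine feature fields), and whitespace between the surface form and the features. The function `parse_line` and the English field names are labels chosen for this illustration, not part of mecabwrap.

```python
# English labels for the nine IPADIC feature fields.
# These names are chosen for this sketch only.
IPADIC_FIELDS = (
    "pos",            # part of speech
    "pos_detail1",    # part-of-speech subcategory 1
    "pos_detail2",    # part-of-speech subcategory 2
    "pos_detail3",    # part-of-speech subcategory 3
    "conj_type",      # conjugation type
    "conj_form",      # conjugation form
    "base_form",      # dictionary form
    "reading",        # reading in katakana
    "pronunciation",  # pronunciation in katakana
)


def parse_line(line):
    """Split one 'surface<whitespace>features' line into (surface, feature dict)."""
    surface, feature = line.rstrip("\n").split(None, 1)
    values = feature.split(",")
    # zip stops at the shorter sequence, so entries with fewer fields still parse.
    return surface, dict(zip(IPADIC_FIELDS, values))


if __name__ == "__main__":
    surface, fields = parse_line("うち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ")
    print(surface, fields["pos"], fields["reading"])
```

Other dictionaries may use a different number or order of feature fields, so the field tuple would need adjusting for them.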

Usage

See the example notebook (or a cleaner version on nbviewer) for more detail.