C++ implementation of Alessandro Moschitti's Tree Kernel algorithm, from "Making Tree Kernels Practical for Natural Language Learning"

JDonner/TreeKernel

A C++ implementation of tree kernels per Alessandro Moschitti's 2006 paper, "Making Tree Kernels Practical for Natural Language Learning"

This repository consists of two directories, tree-parser and tree-kernel. tree-parser parses sentences in Penn Treebank bracketed form, as produced by the Stanford Parser, among others. For example:

 (ROOT
   (S
     (NP (PRP$ My) (NN dog))
     (ADVP (RB also))
     (VP (VBZ likes)
       (S
         (VP (VBG eating)
           (NP (NN sausage)))))
     (. .)))
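
To make the bracketed format concrete, here is a minimal recursive-descent sketch that reads such a string into a tree. It is not the code in tree-parser; `Node`, `parse_tree`, and `count_nodes` are hypothetical names chosen for illustration, and the sketch assumes well-formed input in which tokens contain no parentheses or whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical node type for illustration; not the repository's own class.
struct Node {
    std::string label;                            // a tag such as "NP", or a word
    std::vector<std::unique_ptr<Node>> children;  // empty for a word (leaf)
};

namespace detail {
inline void skip_ws(const std::string& s, std::size_t& i) {
    while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
}

// Reads characters up to the next parenthesis or whitespace.
inline std::string read_token(const std::string& s, std::size_t& i) {
    const std::size_t start = i;
    while (i < s.size() && s[i] != '(' && s[i] != ')' &&
           !std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    if (i == start) throw std::runtime_error("expected a token");
    return s.substr(start, i - start);
}

inline std::unique_ptr<Node> parse_node(const std::string& s, std::size_t& i) {
    skip_ws(s, i);
    if (i >= s.size()) throw std::runtime_error("unexpected end of input");
    auto node = std::make_unique<Node>();
    if (s[i] != '(') {                 // a bare token is a leaf word
        node->label = read_token(s, i);
        return node;
    }
    ++i;                               // consume '('
    skip_ws(s, i);
    node->label = read_token(s, i);    // the tag that follows '('
    for (;;) {
        skip_ws(s, i);
        if (i >= s.size()) throw std::runtime_error("missing ')'");
        if (s[i] == ')') { ++i; break; }
        node->children.push_back(parse_node(s, i));
    }
    return node;
}
}  // namespace detail

// Parses one bracketed tree; throws std::runtime_error on malformed input.
inline std::unique_ptr<Node> parse_tree(const std::string& s) {
    std::size_t i = 0;
    auto root = detail::parse_node(s, i);
    detail::skip_ws(s, i);
    if (i != s.size()) throw std::runtime_error("trailing characters");
    return root;
}

// Counts every node in the tree, words included.
inline std::size_t count_nodes(const Node& n) {
    std::size_t total = 1;
    for (const auto& c : n.children) total += count_nodes(*c);
    return total;
}
```

Calling `parse_tree("(NP (PRP$ My) (NN dog))")` yields a root labeled `NP` with two children, `PRP$` and `NN`, each holding a single word leaf.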

This TreeKernel source is MIT licensed (see LICENSE) and may be used commercially. If you need a different license, just ask.

One person has used this without an SVM, but my intent was to integrate it into an SVM library; I never got around to it. The two libraries I had in mind were libSVM and Dlib.

Feel free to integrate it yourself; I'd love to hear about it if you do.

Building

The build uses GNU Make and GCC, but nothing is OS-dependent, so it should be easy to adapt to other systems.

cd tree-parser
make

cd ../tree-kernel
make

TODO

Several optimizations are possible: templating out sigma, making NodePairsDeltaTable an unordered_map, and, most of all, storing the Nodes of a Sentence in a vector and using indices in place of pointers to them (to save memory and improve cache behavior). C bindings would be nice, too.
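
As a rough illustration of how those three ideas could fit together, the sketch below stores nodes in a vector addressed by index, memoizes delta values in an `unordered_map`, and makes sigma a template parameter (0 for the subtree kernel, 1 for the subset-tree kernel). It is not this repository's code: `FlatTree`, `delta`, and `tree_kernel` are hypothetical names, it sums over all node pairs instead of using the paper's fast node-pair selection, and it assumes words appear only as childless leaves.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using NodeIndex = std::uint32_t;

struct FlatNode {
    std::string label;
    std::vector<NodeIndex> children;   // indices into FlatTree::nodes
};

struct FlatTree {
    std::vector<FlatNode> nodes;       // contiguous storage, no pointers
    // Children must be added before their parent.
    NodeIndex add(std::string label, std::vector<NodeIndex> children = {}) {
        nodes.push_back(FlatNode{std::move(label), std::move(children)});
        return static_cast<NodeIndex>(nodes.size() - 1);
    }
};

// True when both nodes have the same label and the same child labels in order.
inline bool same_production(const FlatTree& a, NodeIndex i,
                            const FlatTree& b, NodeIndex j) {
    const FlatNode& x = a.nodes[i];
    const FlatNode& y = b.nodes[j];
    if (x.label != y.label || x.children.size() != y.children.size()) return false;
    for (std::size_t k = 0; k < x.children.size(); ++k) {
        if (a.nodes[x.children[k]].label != b.nodes[y.children[k]].label) return false;
    }
    return true;
}

// A pre-terminal has at least one child and all of its children are leaves.
inline bool is_preterminal(const FlatTree& t, NodeIndex i) {
    for (NodeIndex c : t.nodes[i].children) {
        if (!t.nodes[c].children.empty()) return false;
    }
    return !t.nodes[i].children.empty();
}

using DeltaTable = std::unordered_map<std::uint64_t, double>;

// Delta recursion with decay factor lambda; Sigma selects the kernel variant.
template <int Sigma>
double delta(const FlatTree& a, NodeIndex i, const FlatTree& b, NodeIndex j,
             double lambda, DeltaTable& memo) {
    if (a.nodes[i].children.empty() || b.nodes[j].children.empty()) return 0.0;
    if (!same_production(a, i, b, j)) return 0.0;
    if (is_preterminal(a, i) && is_preterminal(b, j)) return lambda;
    const std::uint64_t key = (static_cast<std::uint64_t>(i) << 32) | j;
    const auto hit = memo.find(key);
    if (hit != memo.end()) return hit->second;
    double value = lambda;
    for (std::size_t k = 0; k < a.nodes[i].children.size(); ++k) {
        value *= Sigma + delta<Sigma>(a, a.nodes[i].children[k],
                                      b, b.nodes[j].children[k], lambda, memo);
    }
    memo.emplace(key, value);
    return value;
}

// Sums delta over every pair of nodes from the two trees.
template <int Sigma>
double tree_kernel(const FlatTree& a, const FlatTree& b, double lambda) {
    DeltaTable memo;
    double sum = 0.0;
    for (NodeIndex i = 0; i < a.nodes.size(); ++i) {
        for (NodeIndex j = 0; j < b.nodes.size(); ++j) {
            sum += delta<Sigma>(a, i, b, j, lambda, memo);
        }
    }
    return sum;
}
```

For `(NP (D a) (N dog))` against `(NP (D a) (N cat))` with lambda set to 1, `tree_kernel<1>` counts the three shared fragments `[D a]`, `[NP D N]`, and `[NP [D a] N]`, while `tree_kernel<0>` counts only the complete subtree `[D a]`.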

If you make any improvements, I'll be happy to take them. Of course, you can just do what you want with it.
