Word Similarity Search (Java 17)

Word Similarity Search is a Java 17 program that ranks words or phrases by similarity using a dataset of 50-dimensional (50d) word embeddings. You provide the 50d embeddings file and a word or phrase to search for, and the program returns the highest-ranked words according to the similarity algorithm you choose.

State-of-the-Art Features

Specify your own 50d dataset to use for similarity searching
Parse a single word or an entire sentence, and the system will try to provide the best result for you
Save your results to an output file of your choice and share them with your friends

The available algorithms include:

Dot Product
Euclidean Distance
Cosine Distance
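To make the three measures concrete, here is a minimal sketch of how each could be computed for two embedding vectors. The class and method names (`SimilarityMeasures`, `dotProduct`, `euclideanDistance`, `cosineDistance`) are hypothetical and are not taken from the project's source; cosine distance is assumed here to mean 1 minus cosine similarity.

```java
/** Hypothetical helper for illustration; not the project's actual implementation. */
final class SimilarityMeasures {

    /** Dot product: sum of element-wise products. Larger values mean more similar. */
    static double dotProduct(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Euclidean distance: straight-line distance. Smaller values mean more similar. */
    static double euclideanDistance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Cosine distance, assumed to be 1 - cosine similarity. Smaller values mean more similar. */
    static double cosineDistance(double[] a, double[] b) {
        double norms = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("Cosine distance is undefined for a zero vector");
        }
        return 1.0 - dotProduct(a, b) / norms;
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        // Toy 3d vectors; real embeddings in this project would have 50 values each.
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {4.0, 5.0, 6.0};
        System.out.println(dotProduct(a, b)); // prints 32.0
        System.out.println(euclideanDistance(a, b));
        System.out.println(cosineDistance(a, b));
    }
}
```

Note that the ranking direction differs between measures: a higher dot product indicates a closer match, whereas for the two distances a lower value does.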

How To Run

Ensure that you have Java SDK 17 or higher installed

First compile the src/ directory using the following command

javac src/ie/atu/sw/*.java -d out/

Then run the program from the out/ directory with

cd out/
java ie.atu.sw.Runner

You will then be presented with options as follows:

Provide file path for 50d word embeddings dataset: Choose the 50d .txt file to act as your model
Print total count of words in model: View the total number of words in the model
Provide file path for output: Specify the output file for your results
Cycle similarity search algorithm: Cycle through the available algorithms mentioned above
Change number of words to show in similarity ranking: Change the number of words shown in the output file
Enable/Disable weight details (false): Toggle word similarity score visibility
Begin word similarity search: Input word sequence (e.g. apple banana cheese) and start word similarity search
Quit: Exit the application
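The flow behind these options (load a model, count its words, then rank the top matches for a query) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the class `EmbeddingRanker` and its methods are hypothetical, each dataset line is assumed to hold a word followed by its vector values separated by commas or whitespace, and cosine similarity is used as the example measure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of loading embeddings and ranking similar words. */
final class EmbeddingRanker {

    /** Parses lines such as "apple, 0.1, 0.2" (assumed format) into a word -> vector map. */
    static Map<String, double[]> parse(List<String> lines) {
        Map<String, double[]> model = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // skip empty lines
            }
            String[] parts = line.trim().split("[,\\s]+");
            double[] vector = new double[parts.length - 1];
            for (int i = 1; i < parts.length; i++) {
                vector[i - 1] = Double.parseDouble(parts[i]);
            }
            model.put(parts[0], vector);
        }
        return model;
    }

    /** Cosine similarity: larger values mean more similar. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to topN words most similar to the query, excluding the query itself. */
    static List<String> topSimilar(Map<String, double[]> model, String query, int topN) {
        double[] target = model.get(query);
        if (target == null) {
            throw new IllegalArgumentException("Unknown word: " + query);
        }
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, double[]> e : model.entrySet()) {
            if (!e.getKey().equals(query)) {
                scored.add(Map.entry(e.getKey(), cosineSimilarity(target, e.getValue())));
            }
        }
        // Sort by score, highest first.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, scored.size()); i++) {
            result.add(scored.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy 3d model; a real dataset line would carry 50 values per word.
        Map<String, double[]> model = parse(List.of(
                "apple, 1.0, 0.0, 0.0",
                "banana, 0.9, 0.1, 0.0",
                "cheese, 0.0, 1.0, 0.0"));
        System.out.println(model.size());                   // prints 3
        System.out.println(topSimilar(model, "apple", 2));  // prints [banana, cheese]
    }
}
```

With a real dataset, the lines could be read with `Files.readAllLines(Path.of(...))` before being passed to `parse`. How the actual program combines several input words into one query is not described here, so this sketch handles a single query word only.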
