A Kotlin implementation of the classic Mark V. Shaney. The model learns, for each pair of words (the prefix), which words usually follow that pair (the suffix).
For example, the sentence "Mark V. Shaney wrote on the Usenet." results in:
Prefix | Suffix |
---|---|
Mark V. | Shaney |
V. Shaney | wrote |
Shaney wrote | on |
wrote on | the |
on the | Usenet. |
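The learning step above can be sketched in Kotlin as follows. This is a minimal illustration, not the repository's `shaney.kt`; the function name `learn` and its signature are assumptions.

```kotlin
// Minimal sketch of the learning step: map each two-word prefix to
// the list of words that follow it (names are illustrative only).
fun learn(text: String): Map<Pair<String, String>, List<String>> {
    val words = text.split(Regex("\\s+")).filter { it.isNotEmpty() }
    val table = mutableMapOf<Pair<String, String>, MutableList<String>>()
    // Slide a window of three words: the first two are the prefix,
    // the third is the suffix recorded for that prefix.
    for (i in 0..words.size - 3) {
        val prefix = Pair(words[i], words[i + 1])
        table.getOrPut(prefix) { mutableListOf() }.add(words[i + 2])
    }
    return table
}

fun main() {
    val table = learn("Mark V. Shaney wrote on the Usenet.")
    println(table[Pair("Mark", "V.")])  // prints [Shaney]
}
```

Running this on the example sentence reproduces the table above: five prefixes, each with a single suffix.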
To generate text, a random prefix is chosen and the learned suffixes are then appended one at a time. When the training text is long enough, most prefixes have several suffixes, which adds diversity to the constructed sentences.
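The generation step can be sketched like this. Again a minimal illustration under assumed names (`generate` and the hand-built table are not from the actual program):

```kotlin
// Sketch of the generation step: start from a random prefix, then
// repeatedly append a random learned suffix and slide the prefix window.
fun generate(table: Map<Pair<String, String>, List<String>>, length: Int): String {
    var prefix = table.keys.random()                  // random starting prefix
    val out = mutableListOf(prefix.first, prefix.second)
    repeat(length - 2) {
        // Stop early if this prefix was never followed by anything.
        val next = table[prefix]?.random() ?: return out.joinToString(" ")
        out.add(next)
        prefix = Pair(prefix.second, next)            // slide the window
    }
    return out.joinToString(" ")
}

fun main() {
    // The table learned from "Mark V. Shaney wrote on the Usenet."
    val table = mapOf(
        Pair("Mark", "V.") to listOf("Shaney"),
        Pair("V.", "Shaney") to listOf("wrote"),
        Pair("Shaney", "wrote") to listOf("on"),
        Pair("wrote", "on") to listOf("the"),
        Pair("on", "the") to listOf("Usenet."),
    )
    println(generate(table, 7))
}
```

With only one suffix per prefix the output is a fragment of the original sentence; on a larger text such as `chaos.txt`, the random choice among several suffixes is what produces new combinations.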
The table and the sentence construction are similar to a Markov chain (hence the name 🙃).
The program is an example of an early and extremely simple language model, yet it is similar in spirit to Yoshua Bengio et al.'s neural language model and to current transformer LLMs such as GPT.
- Install the Kotlin command-line compiler
- Compile

  ```shell
  kotlinc shaney.kt -include-runtime -d shaney.jar
  ```

- Run

  ```shell
  java -jar shaney.jar 100 chaos.txt
  ```

Note: Replace `100` with another length for the generated text and `chaos.txt` with another file to learn Markov chains from.
Sample output:

> science was heading for a crisis of increasing specialization,” a Navy official in charge of research money for experiments, has often been called a strange attractor.
The text in `chaos.txt` is from James Gleick's excellent *Chaos: Making a New Science*.