Generating horoscopes with a dataset pulled from NYPost

dsnam/markovscope

Markov Horoscopes

An attempt to generate convincingly fake horoscopes (aka horoscopes) using Markov chains and a bunch of data from the NY Post. The data covers about 3 years of horoscopes for each sign and can be found in the CSV file.

A bit about how these are generated: each "state" in the chain consists of two words. Longer states lead to better-sounding results, but you also risk just reprinting what was in the data. The distributions of following words are built on a per-sign basis, since I think certain planets and themes are associated with each sign. At each step the next word is chosen uniformly at random from the list of words that followed the current state in the data. Duplicates are kept in those lists, so a word that follows a state more often is proportionally more likely to be picked.
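
The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in markovscope.py: the function names `build_chain` and `generate` and the toy corpus are assumptions made up for the example, and the real script may tokenize and pick its starting state differently.

```python
import random
from collections import defaultdict


def build_chain(texts, order=2):
    """Map each `order`-word state to the list of words seen right after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            # Keep duplicates: a follower that appears more often in the data
            # is proportionally more likely under a uniform random.choice.
            chain[state].append(words[i + order])
    return chain


def generate(chain, start, max_words=50, rng=random):
    """Walk the chain from `start` until a dead end or `max_words` is reached."""
    state = tuple(start)
    out = list(state)
    while len(out) < max_words and chain.get(state):
        next_word = rng.choice(chain[state])
        out.append(next_word)
        # Slide the two-word window forward by one word.
        state = state[1:] + (next_word,)
    return " ".join(out)


# Toy per-sign corpus, invented for illustration (not taken from the dataset).
corpus = {
    "taurus": [
        "the moon is bright today",
        "the moon is bright tonight",
        "the moon is full today",
    ]
}

# One chain per sign, so each sign gets its own word distributions.
chains = {sign: build_chain(texts) for sign, texts in corpus.items()}
sample = generate(chains["taurus"], ("the", "moon"))
```

In this toy corpus the state ("moon", "is") maps to the list bright, bright, full, so "bright" is twice as likely as "full" to come next. Raising `order` makes the output read more smoothly but leaves fewer distinct followers per state, which is the reprinting risk mentioned above.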

If you want to generate a horoscope, make sure the states are in your data folder alongside markovscope.py, then run: python markovscope.py <sign>

Results are generally pretty nonsensical, but some of the generated text does have a horoscope tone to it. Here's an example that was actually the first one generated.

python markovscope.py taurus

What happens over the next seven days is a very small risk. Before you proceed, ask yourself if you have noticed the difference. Friends and family will never happen. Remember too that quality is more to life in remarkable ways, so stop dithering and start thinking that you see it.

RNN

Since the Markov chains didn't fly so well, I figured training an RNN on the same dataset might produce something interesting.

I preprocessed the data by turning each horoscope into a sequence of words and inserting sentence start and end tags. Each word is then mapped to an integer, and that integer is divided by the total number of words, which produces sequences of small-valued inputs to feed into the network. Currently it'll produce a decent sentence or two but then just repeat itself.
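
As a rough sketch of that preprocessing, under several assumptions: the tag strings `<s>` and `</s>`, the naive split on periods, and the helper names `tag_sentences`, `build_vocab`, and `encode` are all hypothetical, and "total number of words" is read here as the vocabulary size, which may not match what the repository actually does.

```python
START, END = "<s>", "</s>"  # hypothetical tag strings, chosen for this example


def tag_sentences(horoscope):
    """Split a horoscope into sentences and wrap each one in start/end tags."""
    # Naive split on "." for illustration; real text needs better handling.
    sentences = [s.strip() for s in horoscope.split(".") if s.strip()]
    tokens = []
    for sentence in sentences:
        tokens += [START] + sentence.lower().split() + [END]
    return tokens


def build_vocab(token_sequences):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for sequence in token_sequences:
        for token in sequence:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Scale each token id into [0, 1) by dividing by the vocabulary size."""
    # Assumption: "total number of words" means the number of distinct tokens.
    size = len(vocab)
    return [vocab[token] / size for token in tokens]


# Toy input, invented for illustration (not taken from the dataset).
horoscopes = ["Be bold. Be kind."]
sequences = [tag_sentences(h) for h in horoscopes]
vocab = build_vocab(sequences)
encoded = [encode(seq, vocab) for seq in sequences]
```

One thing this makes visible: scaling ids into a single float puts words on an arbitrary numeric line, so words with neighboring ids look similar to the network even when they are unrelated.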
