Simple code for trying out TensorFlow with simulated datasets

LuizDevitte/try-tf

 
 

Try TensorFlow

Example code to try out TensorFlow. See the blog post Simple end-to-end TensorFlow examples for more discussion and context.

You need to have TensorFlow installed.

Instructions for simulated data

The subdirectory try-tf/simdata contains train and evaluation data sets for three simulated data set types: linear, moon, and saturn. It also contains some simple R and Python scripts for generating and viewing the data.

Linearly separable data

The data:

  • try-tf/simdata/linear_data_train.csv
  • try-tf/simdata/linear_data_eval.csv

The training data set looks like this.

Softmax regression is perfectly capable of handling this data. If you run the command below, you should see output similar to that provided here.

$ python softmax.py --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv --num_epochs 5 --verbose True
Initialized!

Training.
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49

Weight matrix.
[[-1.87038445  1.87038457]
 [-2.23716712  2.23716712]]

Bias vector.
[ 1.57296884 -1.57296848]

Applying model to first test instance.
Point = [[ 0.14756215  0.24351828]]
Wx+b =  [[ 0.7521798  -0.75217938]]
softmax(Wx+b) =  [[ 0.81822371  0.18177626]]

Accuracy: 1.0
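
The last three values in the verbose output can be checked by hand. Below is a minimal sketch in plain Python, assuming the model computes xW + b for a row vector x and then applies a softmax (this reading is consistent with the printed numbers). The helper names `affine` and `softmax` are illustrative and are not taken from softmax.py.

```python
import math

# Values copied from the verbose output above.
W = [[-1.87038445, 1.87038457],
     [-2.23716712, 2.23716712]]
b = [1.57296884, -1.57296848]
x = [0.14756215, 0.24351828]


def affine(x, W, b):
    """Compute xW + b for a single row vector x."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


logits = affine(x, W, b)
probs = softmax(logits)
print("Wx+b =", logits)
print("softmax(Wx+b) =", probs)
```

The printed values should agree with the output above to several decimal places.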

The plot of the decision boundary:

Moon data

The data:

  • try-tf/simdata/moon_data_train.csv
  • try-tf/simdata/moon_data_eval.csv

The training data set looks like this.

Softmax regression can only draw a linear decision boundary, so it reaches about 86% accuracy on this data, while a network with a five-node hidden layer reaches about 97%.

$ python softmax.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 100
Accuracy: 0.861

$ python hidden.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 100 --num_hidden 5
Accuracy: 0.971
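
For intuition about what the extra layer adds: a hidden layer places a nonlinearity between two affine maps, so the decision boundary is no longer restricted to a straight line. The sketch below shows a forward pass only, in plain Python. The ReLU activation, the random initial weights, and all names are assumptions made for illustration; it does not reproduce hidden.py, whose activation and initialisation may differ.

```python
import math
import random


def affine(x, W, b):
    """Compute xW + b for a single row vector x."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def relu(z):
    """Element-wise rectifier (assumed activation, not confirmed by the source)."""
    return [max(0.0, v) for v in z]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W1, b1, W2, b2):
    """Input -> hidden layer with nonlinearity -> softmax over labels."""
    hidden = relu(affine(x, W1, b1))
    return softmax(affine(hidden, W2, b2))


# Hypothetical, untrained parameters: 2 features, 5 hidden nodes, 2 labels.
random.seed(0)
num_features, num_hidden, num_labels = 2, 5, 2
W1 = [[random.gauss(0.0, 0.1) for _ in range(num_hidden)]
      for _ in range(num_features)]
b1 = [0.1] * num_hidden
W2 = [[random.gauss(0.0, 0.1) for _ in range(num_labels)]
      for _ in range(num_hidden)]
b2 = [0.1] * num_labels

probs = forward([0.5, -0.25], W1, b1, W2, b2)
print(probs)  # two class probabilities that sum to 1
```

With untrained weights the probabilities are close to uniform; training is what shapes the hidden units into a useful boundary.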

The plot of the decision boundaries produced by the above calls:

Saturn data

The data:

  • try-tf/simdata/saturn_data_train.csv
  • try-tf/simdata/saturn_data_eval.csv

The training data set looks like this.

Again, a softmax network performs poorly, but a network with a 15-node hidden layer (the setting used in the command below) works great.

$ python softmax.py --train simdata/saturn_data_train.csv --test simdata/saturn_data_eval.csv --num_epochs 100
Accuracy: 0.43

$ python hidden.py --train simdata/saturn_data_train.csv --test simdata/saturn_data_eval.csv --num_epochs 100 --num_hidden 15
Accuracy: 1.0

The plot of the decision boundaries produced by the above calls:

Generating simulated data

Feel free to modify the data-generation code to make the problems harder, add more dimensions, and so on. You can then generate new data as follows (from within the simdata directory):

$ Rscript generate_linear_data.R
$ python generate_moon_data.py
$ Rscript generate_saturn_data.R

The R scripts generate both train and test sets. For the moon data, you'll need to split the output into train and eval files using the Unix head and tail commands.
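
If you would rather do the split in Python than with head and tail, a minimal sketch follows. It assumes the generated file has one example per line and no header row. The function name `split_csv`, the input file name, and the row count in the usage comment are all hypothetical.

```python
def split_csv(source_path, train_path, eval_path, num_train):
    """Write the first num_train lines to train_path and the rest to eval_path.

    Roughly equivalent to `head -n N` for the training file and
    `tail -n +(N+1)` for the evaluation file.
    Returns the number of lines written to each file.
    """
    with open(source_path) as source:
        lines = source.readlines()
    train_lines = lines[:num_train]
    eval_lines = lines[num_train:]
    with open(train_path, "w") as train_file:
        train_file.writelines(train_lines)
    with open(eval_path, "w") as eval_file:
        eval_file.writelines(eval_lines)
    return len(train_lines), len(eval_lines)


# Example usage (hypothetical input file name and split size):
# split_csv("moon_data.csv", "moon_data_train.csv", "moon_data_eval.csv", 1000)
```

If the generator does not already emit the rows in random order, shuffle them before splitting so both files contain a mix of the classes.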

Creating plots of the data

To prepare the blog post for this repository, I wrote a few R scripts to plot the data. They are simple, but I have included them as starting points for anyone who wants to adapt them or plot related data.

Go into the simdata directory.

Open plot_data.R in an editor and uncomment the data set you'd like to plot, save it, and then run:

$ Rscript plot_data.R

To plot the image with the hyperplane, start R and run source("plot_hyperplane.R").

To plot the graph relating the number of hidden nodes to accuracy, start R and run source("plot_hidden_curve.R").

Visualising the model and training using TensorBoard

TensorFlow includes a visualisation tool called TensorBoard that allows you to visualise the directed graph of your model.

The script annotated_softmax.py reproduces the simple softmax regression, but also includes annotations that will write logging information to a folder called try_tf_logs.

After running annotated_softmax.py, you can point TensorBoard at the try_tf_logs directory to launch a browser-based visualisation of the model and training.

$ python annotated_softmax.py --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv --num_epochs 5 --verbose False
Accuracy: 1.0

$ tensorboard --logdir=try_tf_logs/
Starting TensorBoard  on port 6006
(You can navigate to http://0.0.0.0:6006)
