In this chapter, we provide a very brief description of TensorFlow's basic building blocks and concepts that are used in building sophisticated models. Here we also give an overview on how the models should be structured.
TensorFlow is an open-source software library from Google for numerical computation using data flow graphs. It allows us to
- use a variety of programming languages to build deep learning models, as it has Python, C++, Java, and Go APIs,
- deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API,
- visualize learning, embeddings, graphs, and histograms using TensorBoard,
- forget about computing derivatives by hand, as it has auto-differentiation (a short sketch follows below).
In addition, it has a large community and many projects are already using TensorFlow.
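To give a flavor of the auto-differentiation mentioned in the last point, here is a minimal sketch of our own (a toy example, not from the official tutorials; it uses a Session, which is explained later in this chapter):
import tensorflow as tf
# Toy example: let TensorFlow differentiate y = x^2 for us.
x = tf.constant(3.0)
y = x * x
grad = tf.gradients(ys=y, xs=x)  # symbolic gradient dy/dx = 2x
with tf.Session() as sess:
    print(sess.run(grad))  # [6.0]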
As TensorFlow matures, it simplifies its interfaces. For example, TensorFlow Estimator (tf.estimator) and TensorFlow Learn (tf.contrib.learn) provide readily available models that users can simply call. They were purposely created to mimic scikit-learn for deep learning, or "to smooth the transition" from the scikit-learn world of one-liner machine learning into the more open world of building different shapes of machine learning models.
Note: TensorFlow Learn was originally an independent project called Scikit Flow (SKFlow).
Both TensorFlow Estimator and Learn allow you to load in data, construct a model, fit the model using the training data, and evaluate its accuracy, each using a single line of code. Some models that you can call with one line are LinearClassifier, LinearRegressor, and DNNClassifier.
Google has good tutorials on how to build models using TensorFlow Estimator.
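For illustration, here is a minimal sketch of the Estimator interface on toy random data (the feature name 'x', the layer sizes, and the toy data are our own illustrative assumptions):
import numpy as np
import tensorflow as tf
# Define the model in essentially one line: a feed-forward classifier.
feature_columns = [tf.feature_column.numeric_column('x', shape=[4])]
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[10, 10],
                                        n_classes=3)
# Feed toy random data through the built-in NumPy input function.
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': np.random.rand(100, 4).astype(np.float32)},
    y=np.random.randint(0, 3, size=100),
    num_epochs=None, shuffle=True)
classifier.train(input_fn=train_input_fn, steps=100)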
Note: the TensorFlow Contrib module, which TensorFlow Learn is part of, contains volatile or experimental code.
Another notable addition to TensorFlow is the implementation of the Keras API, which significantly reduces the number of lines of code and improves its readability. See TensorFlow Keras.
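As a minimal sketch of what the Keras API looks like (the layer sizes here are our own illustrative assumptions):
import tensorflow as tf
# A two-layer feed-forward network in a few readable lines.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()  # prints a readable description of the architecture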
However, the primary purpose of TensorFlow is not to provide out-of-the-box machine learning solutions. Instead, TensorFlow provides an extensive suite of functions and classes that allow users to define models from scratch. This is more complicated but offers much more flexibility: you can build almost any architecture you can think of in TensorFlow. For that reason, we will not use the TensorFlow Estimator, Learn, or Keras interfaces, but stick with basic TensorFlow. In addition, we will avoid using Python classes and functions where possible, as they might confuse readers who are new to Python. Note, however, that it is often better to define models as classes.
The most notable difference between TensorFlow and other libraries is that TensorFlow does all its computation in graphs. A TensorFlow graph is a description of the operands and operations required to perform a task. This means that TensorFlow programs separate the definition of computations from their execution.
A computational graph is a series of TensorFlow operations arranged in a graph of nodes. In the graph, nodes are called ops, which is shorthand for operations. An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors. As you might suspect, the Tensor is TensorFlow's basic object: a tensor data structure represents all data, and only tensors are passed between operations in the computational graph. You can think of a Tensor as an n-dimensional array or list. A Tensor has a static type, a rank, and a shape. To learn more about how TensorFlow handles these concepts, see the Rank, Shape, and Type reference.
For example, in the case of patient records, a Tensor could be a three-dimensional array with dimensions [patients, record_length, events], while in the case of images it could be a four-dimensional array with dimensions [images, height, width, colors].
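A small sketch of these three properties, using toy constants of our own:
import tensorflow as tf
scalar = tf.constant(3.)          # rank 0, shape ()
vector = tf.constant([1., 2.])    # rank 1, shape (2,)
matrix = tf.constant([[1., 2.]])  # rank 2, shape (1, 2)
print(matrix.dtype)  # <dtype: 'float32'> -- the static type
print(matrix.shape)  # (1, 2)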
To build a graph, we start with ops that do not need any input, such as tf.constant(), and pass their output to other ops that perform computation. The op constructors in the Python library return objects that stand for the output of the constructed ops. You can pass these to other op constructors to use as inputs. The TensorFlow Python library has a default graph to which op constructors add nodes. The default graph is sufficient for many applications.
Note: tf in all scripts that follow stands for tensorflow.
Thus, we first define the computational graph by adding nodes to the default graph. In Python this can be written as follows:
import tensorflow as tf
# Create a Constant op that produces a 1x2 matrix. The op is
# added as a node to the default graph.
# The value returned by the constructor represents the output
# of the Constant op. It is a matrix of shape 1x2.
matrix1 = tf.constant([[1., 2.]])
# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[3.], [4.]])
# Create a Matmul op that takes "matrix1" and "matrix2" as inputs.
# The returned value, "product", represents the result of the matrix
# multiplication. Output is a matrix of shape 1x1.
product = tf.matmul(matrix1, matrix2)
The default graph now has three nodes: two tf.constant() ops and one tf.matmul() op. If we drew it on paper, it would show the two constant nodes feeding their outputs into the matmul node.
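One way to verify this (a small sketch, assuming the three ops above are the only nodes added so far) is to list the operations registered in the default graph:
# List every node that has been added to the default graph.
for op in tf.get_default_graph().get_operations():
    print(op.name, op.type)  # Const, Const_1 and MatMul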
If we execute the code presented above, we unfortunately will not get a useful answer yet. In TensorFlow, to actually perform the computation and obtain the result, we have to launch the graph in a session. Thus, to multiply the matrices and get the result of the multiplication, we have to create a Session object without arguments, which launches the default graph.
# Launch the default graph.
sess = tf.Session()
Then we call its run() method, which executes all three ops in the graph:
result = sess.run(fetches=product)
print(result) # expected value is [[ 11.]]
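As a side note (a small sketch of ours), run() can also fetch several tensors at once if you pass a list:
# Fetch two tensors in a single run() call.
mat, prod = sess.run(fetches=[matrix1, product])
print(mat)   # [[ 1.  2.]]
print(prod)  # [[ 11.]]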
Afterwards, the Session has to be closed by calling sess.close(). However, it is also possible to enter the Session in a with block. The Session then closes automatically at the end of the with block, so sess.close() is not needed:
with tf.Session() as sess:
    result = sess.run(fetches=product)
    print(result)
You will sometimes see InteractiveSession() instead of Session(). The only difference is that InteractiveSession() makes itself the default Session, so you can call Tensor.eval() and/or Operation.run() without explicitly referencing the Session every time you want to compute something. This is a convenient feature in interactive shells, such as Jupyter notebooks, as it avoids having to pass an explicit Session object to run ops every time. However, it can get complicated when you have multiple Session objects to run. For more information see here.
TensorFlow takes in Python native types such as booleans, numeric values (integers, floats), and strings. Single values are converted to 0-d tensors (scalars), lists of values to 1-d tensors (vectors), lists of lists to 2-d tensors (matrices), and so on. However, TensorFlow also has its own data types, such as tf.int32 and tf.float32; for a more detailed description see here. These types are based on those of NumPy and thus, in most cases, the two can be used interchangeably.
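A small sketch of this interchangeability, using a toy array of our own:
import numpy as np
import tensorflow as tf
# NumPy arrays pass straight into TensorFlow ops:
a = tf.constant(np.array([[1., 2.]], dtype=np.float32))
print(a.dtype)                    # <dtype: 'float32'>
print(tf.float32.as_numpy_dtype)  # <class 'numpy.float32'>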
The next chapter will show how to implement logistic regression in TensorFlow; if you wish to return to the previous chapter, press here.