The number of hidden units in an LSTM is the dimensionality of its ‘hidden state’. The hidden state of a recurrent network is the vector that comes out at time step t and is fed back in at the next time step, t+1.
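To make the shapes concrete, here is a minimal NumPy sketch of a single LSTM step. The names `lstm_step`, `W`, `U`, `b` and the gate ordering (input, forget, candidate, output) are assumptions chosen for illustration, not any particular library's layout. The point is only that the hidden state carried from step t to step t+1 is a vector with one entry per unit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    The four blocks are stacked in the (assumed) order: input, forget,
    candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x + U @ h_prev + b             # pre-activations for all four blocks
    i = sigmoid(z[0 * units:1 * units])    # input gate
    f = sigmoid(z[1 * units:2 * units])    # forget gate
    g = np.tanh(z[2 * units:3 * units])    # candidate cell values
    o = sigmoid(z[3 * units:4 * units])    # output gate
    c = f * c_prev + i * g                 # new cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c

units, input_dim = 8, 3                    # illustrative sizes
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
for t in range(2):
    x_t = rng.normal(size=input_dim)
    h, c = lstm_step(x_t, h, c, W, U, b)   # h from step t is the input at t+1

hidden_shape = h.shape
print(hidden_shape)  # (8,)
```

Changing `units` changes the length of `h` and `c`, and nothing else about the loop.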
How many units should I use in an LSTM?
The number of units in each layer of the stack can vary. For example, in translate.py from TensorFlow it can be configured to 1024, 512, or virtually any number. The best range can be found via cross-validation. I have seen both 1000 and 500 units per layer used in practice.
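One practical consideration when picking a size is how quickly the parameter count grows. The sketch below uses the standard count for a single LSTM layer without peepholes: four weight blocks, each with an input matrix, a recurrent matrix, and a bias. The helper `lstm_param_count` and its `biases_per_gate` argument are hypothetical names. One bias per gate matches the Keras convention, while PyTorch keeps two bias vectors per gate.

```python
def lstm_param_count(units, input_dim, biases_per_gate=1):
    """Trainable parameters in one LSTM layer (no peepholes, no projection)."""
    per_gate = units * input_dim + units * units + biases_per_gate * units
    return 4 * per_gate                    # input, forget, candidate, output

# Sizes mentioned above; input_dim is set equal to units purely for illustration.
for units in (500, 512, 1000, 1024):
    print(units, lstm_param_count(units, input_dim=units))
```

With the input size tied to the unit count like this, doubling the units roughly quadruples the number of parameters, which is worth keeping in mind when cross-validating over sizes.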
Are there hidden cells in an LSTM diagram?
Most LSTM/RNN diagrams show only the hidden cells, never the units inside those cells; hence the confusion. When the network is unrolled in time, each hidden layer has as many hidden cells as there are time steps. Each hidden cell, in turn, is made up of multiple hidden units, like in the diagram below.
How are units and inputs used in an LSTM?
To match your description of the diagrams, let’s define a “unit” as a collection of one of each type of neuron/gate used to make up the cell. In theory, such a collection could be wired together into a working LSTM cell layer with a single scalar cell state and a single scalar output value. These units are independent in that each has its own weight parameters.
What’s the difference between an RNN and an LSTM?
The basic difference between the architectures of plain RNNs and LSTMs is that the hidden layer of an LSTM is a gated unit, or gated cell. It consists of four layers that interact with one another to produce the output of that cell along with the cell state. These two things are then passed on to the cell at the next time step.
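For reference, the four interacting layers are usually written as follows. This is the common formulation without peephole connections. The symbols ($W$, $U$, $b$ for input weights, recurrent weights, and biases; $\sigma$ for the logistic sigmoid; $\odot$ for element-wise multiplication) are notation assumed here, not taken from the text above.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(new cell state)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(new hidden state / output)}
\end{aligned}
```

A plain RNN, by contrast, has only a single layer of the form $h_t = \tanh(W x_t + U h_{t-1} + b)$ and no separate cell state.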
How are LSTM cells dependent on time steps?
The dependence is indirect because there are gates between them. Note also that an LSTM cell shares its weights across the inputs of all time steps. Consequently, each neuron in the LSTM cell depends on the input at the current time step and on the outputs of the adjacent nodes at the previous time step.
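The weight sharing and the time dependence can both be seen by unrolling one layer over a short sequence. In the NumPy sketch below, `run_lstm` is a hypothetical helper, and the sizes, random weights, and gate ordering are assumptions for illustration. The same `W`, `U`, and `b` are reused at every step, and perturbing only the first input changes the hidden states at later steps as well.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_lstm(xs, W, U, b, units):
    """Unroll one LSTM layer over a sequence, reusing W, U, b at every step."""
    h = np.zeros(units)
    c = np.zeros(units)
    hs = []
    for x in xs:                           # one unrolled "cell" per time step
        z = W @ x + U @ h + b              # depends on current input and previous h
        i, f, g, o = np.split(z, 4)        # assumed order: input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)         # gated update of the cell state
        h = o * np.tanh(c)                 # gated output
        hs.append(h)
    return np.stack(hs)                    # shape: (time steps, units)

units, input_dim, steps = 4, 3, 5
rng = np.random.default_rng(1)
W = rng.normal(size=(4 * units, input_dim))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)

xs = rng.normal(size=(steps, input_dim))
hs = run_lstm(xs, W, U, b, units)

xs_changed = xs.copy()
xs_changed[0] += 1.0                       # perturb only the first input
hs_changed = run_lstm(xs_changed, W, U, b, units)

print(hs.shape)
print(np.abs(hs - hs_changed).max(axis=1)) # per-step effect of the perturbation
```

The effect on later steps passes through the forget, input, and output gates at each step, which is the sense in which the dependence is indirect.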