The Keras Functional API lets you build models with non-linear topology, such as multiple inputs, multiple outputs, or shared layers; the Sequential API cannot express these structures.

Let’s look at a simple example. Suppose we want to build a model that takes an image and a sequence of text, and combines them to make a prediction.

from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Flatten, concatenate
from tensorflow.keras.models import Model

# Define the input layers
image_input = Input(shape=(224, 224, 3), name='image_input')
text_input = Input(shape=(50,), name='text_input')

# Process the image input (simplified for illustration; in a real scenario,
# you'd use a pre-trained convolutional base like VGG16 or ResNet)
x = Dense(128, activation='relu')(image_input)
# Flatten the image features for concatenation
x = Flatten()(x)

# Process the text input
# Embedding layer to convert word indices to dense vectors
embedded_text = Embedding(input_dim=10000, output_dim=64)(text_input)
# LSTM layer to process the sequence of word embeddings
text_features = LSTM(128)(embedded_text)

# Combine the processed inputs
combined = concatenate([x, text_features])

# Add a few more layers to the combined features
z = Dense(64, activation='relu')(combined)
output = Dense(1, activation='sigmoid')(z)

# Create the model
model = Model(inputs=[image_input, text_input], outputs=output)

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

model.summary()

This code defines two distinct input layers: image_input and text_input. The Input layer is the starting point for any Keras Functional API model. It specifies the shape of the data that will be fed into the model.

After defining the inputs, we create separate processing paths for each. The image path uses a Dense layer (in a real model, this would be a convolutional base) and then Flatten to prepare it for concatenation. The text path uses an Embedding layer to convert integer word indices into dense vector representations, followed by an LSTM layer to capture sequential information.

The concatenate layer is where the two branches meet. It merges the outputs of the image and text processing paths into a single tensor along the feature axis. This merged tensor then feeds into further Dense layers for final prediction.

Finally, we instantiate the Model by specifying its inputs and outputs. This Model object can then be compiled, trained, and evaluated just like any other Keras model.
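Because the model has two named inputs, the easiest way to feed it during training is a dict keyed by those input names. A minimal, self-contained sketch with random placeholder data (deliberately smaller shapes than above so it runs quickly; the data itself is illustrative only):

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Flatten, Embedding, LSTM, concatenate
from tensorflow.keras.models import Model

# Rebuild a scaled-down version of the two-input model
image_input = Input(shape=(8, 8, 3), name='image_input')
text_input = Input(shape=(10,), name='text_input')
x = Flatten()(Dense(16, activation='relu')(image_input))
t = LSTM(16)(Embedding(input_dim=100, output_dim=8)(text_input))
output = Dense(1, activation='sigmoid')(concatenate([x, t]))
model = Model(inputs=[image_input, text_input], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')

# Placeholder data, keyed by the input layer names
images = np.random.rand(4, 8, 8, 3).astype('float32')
texts = np.random.randint(0, 100, size=(4, 10))
labels = np.random.randint(0, 2, size=(4, 1)).astype('float32')

model.fit({'image_input': images, 'text_input': texts}, labels,
          epochs=1, batch_size=2, verbose=0)
```

A list in input-definition order, `model.fit([images, texts], labels)`, works as well, but the dict form is more robust to reordering.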

The true power of the Functional API lies in its ability to define arbitrary graph structures. You’re not limited to a linear stack of layers. You can have branches, merges, and even shared layers, where the same layer instance is used multiple times in the graph. This is crucial for models like Siamese networks or multi-task learning setups.

Consider a scenario where you have a single feature extraction layer that you want to apply to two different inputs. You can instantiate the layer once and then call it on each input:

from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

shared_dense = Dense(64, activation='relu')

input1 = Input(shape=(128,))
input2 = Input(shape=(128,))

shared_output1 = shared_dense(input1)
shared_output2 = shared_dense(input2)

combined = concatenate([shared_output1, shared_output2])
output = Dense(10, activation='softmax')(combined)

model_with_shared_layer = Model(inputs=[input1, input2], outputs=output)
model_with_shared_layer.compile(optimizer='adam', loss='categorical_crossentropy')
model_with_shared_layer.summary()

Here, the shared_dense layer is applied independently to input1 and input2. During training, gradients flow back through both paths and jointly update the single set of weights in shared_dense. This shares parameters rather than duplicating them, and can improve generalization when the shared features are relevant to both inputs.
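You can verify the sharing directly: no matter how many times a layer instance is called, it owns exactly one kernel and one bias. A quick sketch with small illustrative shapes:

```python
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

shared_dense = Dense(4, activation='relu')

a = Input(shape=(6,))
b = Input(shape=(6,))
out_a = shared_dense(a)   # first call builds the weights
out_b = shared_dense(b)   # second call reuses the same weights

model = Model(inputs=[a, b], outputs=concatenate([out_a, out_b]))

# One 6x4 kernel plus one bias of size 4: 28 parameters total, not 56
print(len(shared_dense.weights))  # 2
print(model.count_params())       # 28
```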

When you use the Functional API, you’re essentially defining a computation graph. Each layer call layer(input) returns a tensor, and you chain these tensor operations together. This explicit graph definition gives you fine-grained control over the model architecture, making it ideal for research and complex, non-standard model designs.
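One practical consequence of this graph view: because every layer call returns a tensor, any intermediate tensor can become the output of a new Model that shares weights with the original. A small sketch (the name feature_extractor is illustrative):

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(32,))
features = Dense(16, activation='relu', name='features')(inputs)
outputs = Dense(1, activation='sigmoid')(features)
model = Model(inputs, outputs)

# `features` is just a node in the graph, so a second model can expose it.
# It reuses the same weights as `model` (no copying).
feature_extractor = Model(inputs, features)
print(feature_extractor.predict(np.zeros((2, 32)), verbose=0).shape)  # (2, 16)
```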

A common misconception is worth correcting here: the Functional API is not built on top of the Sequential model's "layer stack" concept. It is the other way around. A Functional model is a directed acyclic graph (DAG) of layers, and a Sequential model is simply the special case of a single linear chain. When you call model.summary(), you see a flattened list of layers, but the connectivity information (the "Connected to" column for multi-input models) reflects the full graph that is actually executed.

The next step after mastering complex model architectures is understanding how to serialize and deserialize these models, especially when they involve custom layers or complex graph structures.
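As a preview, saving and restoring a Functional model is a one-liner each way, assuming a recent TensorFlow that supports the native .keras format; models with custom layers would additionally need custom_objects at load time:

```python
import os
import tempfile
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model, load_model

inputs = Input(shape=(4,))
outputs = Dense(2)(inputs)
model = Model(inputs, outputs)

path = os.path.join(tempfile.mkdtemp(), 'model.keras')
model.save(path)             # architecture + weights in one file
restored = load_model(path)  # custom layers would need custom_objects=... here

# The restored model produces identical predictions
x = np.ones((1, 4), dtype='float32')
np.testing.assert_allclose(model.predict(x, verbose=0),
                           restored.predict(x, verbose=0))
```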
