The example is clear. My question is about how it'd work if a dropout layer were added to the sub-network.
The question arises because dropout behaves differently in training (forward) and prediction (predict). During training, the layer randomly sets input elements to zero according to a dropout mask that is drawn each time the layer is invoked; at prediction, the output of the layer is equal to its input (https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.dropoutlayer.html?s_tid=doc_ta). It therefore stands to reason that the mask would be different for each image in the input image pair. But this is NOT what we want!
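
To make the concern concrete, here is a minimal sketch in plain Python. It is not the MATLAB API: the `dropout` function, its arguments, and the 1/(1-p) rescaling of the surviving elements are my own assumptions for illustration. It only shows that two separate training-mode calls on the same input draw two independent masks, while prediction-mode calls return the input unchanged.

```python
import random

def dropout(x, p, training, rng):
    """Hypothetical stand-in for a dropout layer acting on a list of floats."""
    if not training:
        # Prediction mode: output equals input.
        return list(x)
    # Training mode: a fresh mask is drawn on every call.
    mask = [0.0 if rng.random() < p else 1.0 for _ in x]
    # Assumption: surviving elements are rescaled by 1/(1-p).
    scale = 1.0 / (1.0 - p)
    return [xi * m * scale for xi, m in zip(x, mask)]

rng = random.Random(0)
x = [1.0] * 32  # identical input fed to both branches of the pair

# Two training-mode passes, one per image of the pair.
out1 = dropout(x, 0.5, True, rng)
out2 = dropout(x, 0.5, True, rng)

# Two prediction-mode passes.
pred1 = dropout(x, 0.5, False, rng)
pred2 = dropout(x, 0.5, False, rng)

print("training outputs identical:  ", out1 == out2)
print("prediction outputs identical:", pred1 == pred2)
```

If the sub-network is called once per image in training mode, this is the behaviour I would expect: the two branches see different masks even though they share weights.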