I think I found a relevant MATLAB example (Train Network on Image and Feature Data) that could help me: https://www.mathworks.com/help/deeplearning/ug/train-network-on-image-and-feature-data.html
In the example, the training data are converted into datastores via arrayDatastore and then combined into dsTrain, as seen in the picture below:
[Screenshot: datastore creation code from the example]
It seems that the order of the combined datastores is the same as the order of the inputs the neural network expects, as seen below:
[Screenshot: network inputs from the example]
dsTrain = combine(dsX1Train,dsX2Train,dsTTrain);
That is, dsX1Train supplies the image input, dsX2Train supplies the rotation angle (the feature input), and dsTTrain supplies the output (the target).
Am I correct?
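To check my understanding, here is a minimal sketch using made-up placeholder data instead of the example's digit data. The variable names, array sizes, and number of observations are my own assumptions; only arrayDatastore, combine, and read come from MATLAB itself. It reads one observation from the combined datastore to see which column holds what:

```matlab
% Placeholder data (assumed sizes, not the example's real data)
numObs = 8;
X1 = rand(28,28,1,numObs);               % stand-in images, observations along dim 4
X2 = rand(numObs,1);                     % stand-in rotation angles, one per row
T  = categorical(randi(3,numObs,1));     % stand-in targets, one per row

% Wrap each array in a datastore; images iterate over the 4th dimension
dsX1 = arrayDatastore(X1,"IterationDimension",4);
dsX2 = arrayDatastore(X2);
dsT  = arrayDatastore(T);

% Combine in the order: first input, second input, target
dsTrain = combine(dsX1,dsX2,dsT);

% Read a single observation: one row, one column per underlying datastore
sample = read(dsTrain);
disp(size(sample))        % number of columns equals number of combined datastores
disp(size(sample{1}))     % image from dsX1
disp(class(sample{3}))    % target from dsT
```

If this is right, the columns come out in the same order in which the datastores were passed to combine, so the target has to be passed last and the predictors have to be passed in the order of the network's inputs.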
However, an answer from an experienced user or someone from MathWorks would help a lot :D