Answering textual questions with deep learning techniques remains an open challenge. Although previous work has achieved promising results, existing approaches leave considerable room for improvement. This thesis investigates how to build a system that captures and processes textual content and draws the right conclusions to answer multiple-choice questions. To this end, current techniques such as convolutional neural networks and attention mechanisms are applied and evaluated on the benchmark datasets MovieQA, WikiQA, and InsuranceQA, three corpora of question-answer entries from the domains of movies, Wikipedia, and insurance, respectively, each posing a slightly different task. The system is implemented in the TensorFlow framework, and textual content is represented with pre-trained GloVe word vectors. Beyond improving the system, this work also analyzes and evaluates its learning behavior. This is done with so-called adversarial examples: the textual context is modified to check whether the neural network focuses on the correct content when answering a question, and to determine the degree of manipulation at which the network can no longer answer successfully. The analysis also exposes the limitations of such text comprehension systems, which are often able to compare text sequences but do not develop a deeper understanding of the meaning and content of their inputs. The text comprehension system developed in this work achieves a new state of the art on MovieQA, answering 82.73% of the questions correctly.