The statistical analyses are presented in four main sections. First, descriptive statistics for the collected variables are reported. Second, hypotheses one and two are tested with a MANOVA. Third, the moderation analyses for hypothesis three are presented. Finally, hypothesis four is tested with t-tests for independent samples, correlation analyses, and mediation analyses.
5.1. Descriptive Characteristics of the Sample
Table 1 shows the measurement points, ranges, sample sizes, means, standard deviations, and Cronbach’s alphas as a measure of internal consistency for each scale or item.
Concerning prior technical knowledge, 70% of the students in the HMD-VR experimental conditions reported previous experience with VR, compared with 59% in the DVR experimental conditions.
Regarding the instruments used to operationalize the learning outcomes at the pre- and post-measurement points, knowledge and perspective-taking into Anne Frank’s situation increased from the first to the second measurement point across all experimental conditions (see Figure 5).
In the knowledge test, the students scored on average slightly more than half of the maximum of twenty points. However, the dispersion was very high, so performance in the knowledge test varied considerably between students. For the overall evaluation of the VR application, high mean values and low dispersion were found across all experimental conditions. Only a few outliers were not satisfied and would not recommend the VR application to others. Regarding the instruments used to operationalize learning processes (i.e., presence, flow), the following can be reported: Averaged over all four experimental conditions, the experience of flow can be classified as very high compared with the values reported by Rheinberg et al. for particularly absorbing and smooth activities such as graffiti spraying [67]. Absorbedness in particular reached very high values on average. Accordingly, the students seem to have been captivated by the VR experience across all media and method conditions. Moreover, the experience of presence can be classified as high, averaged over the experimental conditions. This applies to both subscales. Consequently, the students felt present in Anne’s hiding place and experienced themselves as acting persons within the VR environment. The results are comparable to those reported by Volkmann et al. [85].
5.2. Main Effects
To test hypotheses one and two, a two-way MANOVA was used. Technology and instructional method were entered as independent variables. The knowledge test and both items for the overall evaluation were used as dependent variables. In addition, two new variables were calculated to operationalize learning. Both perspective-taking and knowledge were queried at both measurement points with two single items. To measure the increase in perspective-taking and knowledge from the first to the second measurement point, difference scores were calculated that allow a pre-post comparison. As already shown in Table 1 and Figure 5, the values at the second measurement are numerically higher than at the first measurement. Therefore, the pre-value was subtracted from the post-value. Positive values on these two new variables indicate an increase in perspective-taking and knowledge, negative values a decrease. On average, knowledge improved by 2.63 units across all test conditions (SD = 2.25) and perspective-taking by 1.90 units (SD = 2.37). The two new variables were also included as dependent variables in the MANOVA.
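The pre-post difference scores described above can be sketched in a few lines. The ratings below are made-up values for illustration only, not data from the study, and the function name `gain_scores` is a hypothetical helper.

```python
from statistics import mean, stdev

# Hypothetical single-item ratings for five students (not study data).
pre_knowledge = [3, 4, 2, 5, 3]
post_knowledge = [6, 6, 5, 7, 4]


def gain_scores(pre, post):
    """Return post minus pre per student; positive values mean an increase."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    return [after - before for before, after in zip(pre, post)]


knowledge_gain = gain_scores(pre_knowledge, post_knowledge)
mean_gain = mean(knowledge_gain)   # average change across students
sd_gain = stdev(knowledge_gain)    # sample standard deviation (n - 1)
```

Subtracting the pre-value from the post-value keeps the sign convention used in the text: a positive score is an increase, a negative score a decrease.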
The MANOVA showed statistically significant differences between the levels of the variable technology (F(5, 108) = 8.67, p < 0.001, partial η² = 0.39, Wilks’ Λ = 0.61) and between the levels of the variable instructional method (F(5, 108) = 2.68, p < 0.001, partial η² = 0.17, Wilks’ Λ = 0.83) for the combined dependent variables. Thus, technology is the factor in the MANOVA that explains the most variance (39%), followed by instructional method, which explains 17%. The interaction between technology and instructional method was not significant (F(5, 108) = 0.82, p = 0.59, partial η² = 0.06, Wilks’ Λ = 0.94). Since both factors yielded significant results, post hoc multifactorial ANOVAs were calculated for each dependent variable. The Bonferroni correction was used to decrease the risk of a type I error due to multiple statistical tests. Table 2 shows the results for the independent variable technology, Table 3 the results for the independent variable instructional method.
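As a minimal sketch of how a Bonferroni correction works, the following example adjusts a set of p-values for the number of tests. The p-values are hypothetical, and the assumption that the correction is applied across five follow-up tests (one per dependent variable) is an illustration, not a detail confirmed by the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Multiply each p-value by the number of tests (capped at 1.0).

    Returns the adjusted p-values and a flag per test that is True when
    the adjusted p-value is below alpha.
    """
    n_tests = len(p_values)
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant


# Hypothetical raw p-values from five follow-up ANOVAs (not study data).
raw_p = [0.001, 0.012, 0.030, 0.200, 0.600]
adjusted_p, is_significant = bonferroni(raw_p)
```

In this illustration only the first test remains significant after adjustment, although three of the raw p-values are below 0.05.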
Hence, a significant main effect of technology was found across all learning indicators combined. However, HMD-based VR was superior only for the two evaluative indicators and, although not statistically significant, for one affective learning indicator. The results for the knowledge test differ: Here, the students in the DVR conditions performed significantly better. Consequently, subhypothesis a is largely not supported by the data of this study. With respect to the acquisition of knowledge (subhypothesis b), the findings are ambiguous. As expected, no difference between the two forms of technology (DVR vs. HMD-VR) was found for one of the cognitive learning indicators, so the assumed null hypothesis is retained for this indicator. The situation was different for the knowledge test. Here, the students in the DVR conditions had an advantage, and this difference was highly significant. The difference between the two groups is approximately one standard deviation unit. This means that, on average, the students in the DVR conditions solved two more questions correctly than students in the HMD-VR conditions. A possible explanation is that cognitive capacities that were tied up by the many environmental details and the high interactivity of the HMD-VR technology remained free in the DVR conditions, so that the focus could be placed on the acquisition of expertise [64, 87]. As expected according to subhypothesis c, students in the HMD-VR experimental conditions were significantly more satisfied (4% variance explanation) and would be more likely to recommend the application to classmates (6% variance explanation). It can be assumed that this finding is also partly due to the novelty effect [88, 89]. This effect was presumably more pronounced among students in the HMD-VR conditions than in the DVR conditions. Nevertheless, learning with HMDs seems to have been more emotionally engaging and motivating for the students. According to recent assumptions of motivation theories [90, 91], multimedia learning environments that are perceived as appealing trigger a high initial situational interest, which in turn can have a positive effect on subsequent learning processes.
The main effect of instructional method was also significant across all learning indicators combined. For two cognitive learning indicators, performance was significantly better in the exposition conditions than in the exploration conditions: Students in the exposition conditions solved more questions correctly in the knowledge test and rated their knowledge gain from the first to the second measurement point as greater. The instructional method explains 6% and 3% of the variance in these two learning indicators, respectively. The differences in the knowledge test can possibly be explained by the fact that the instructions led the students in the exposition conditions to the content that was later asked in the knowledge test. All of the VR content was explored in the exposition conditions, so students were unlikely to miss relevant content. Students in the exploration conditions, on the other hand, were free to choose which content they wanted to engage with and how intensively. In the exploration conditions, certain elements were often examined at length while others were neglected, frequently depending on personal interests. Consequently, the undirected second hypothesis can only be supported to a limited extent, insofar as the learning outcomes of the two experimental conditions differ for only two indicators. In both significant findings, however, the same tendency can be seen: Students in the exposition group have an advantage over students in the exploration group with respect to knowledge acquisition.
5.3. Moderating Effects
In the current study, it was assumed that prior knowledge about the topic and prior technical knowledge moderate the relationship between the independent variable instructional method and the learning outcomes as dependent variables. To test the assumed moderating effects of the two control factors, linear regressions were performed with the PROCESS macro for SPSS [86]. The instructional method was included as the independent variable, and prior knowledge about Anne Frank and prior technical knowledge were included as moderators. One regression was calculated for each learning indicator (i.e., knowledge test, knowledge pre-post comparison, perspective-taking pre-post comparison, satisfaction, recommendation). Table 4 provides an overview of the results. For each dependent variable, the statistics on the interactions of instructional method and the two control variables are shown.
Most of the statistical analyses did not reveal any moderating effects of the control factors. Only for one learning indicator (i.e., the subjective assessment of knowledge gain) did the interaction between prior knowledge and instructional method significantly predict the learning outcome. The interaction explained 1.6% of the total variance. However, further analysis with scatterplots showed that while the effects were in the expected direction (expertise reversal effect [72]), they were not significant. Descriptively, individuals with much prior knowledge are more likely to benefit in the exploration conditions, while those with little prior knowledge are more likely to benefit in the exposition conditions. However, the majority of the results indicate that prior knowledge does not determine whether someone benefits more from other-directed or self-directed exploration of a VR environment. Hypothesis three cannot be supported.
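A moderation analysis of this kind amounts to a linear regression that contains the product of predictor and moderator. The sketch below is a plain-Python illustration with noise-free, made-up data; the 0/1 coding of the instructional method, the variable names, and the small least-squares helper are assumptions and do not reproduce the PROCESS macro.

```python
def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    coef = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coef[k] for k in range(row + 1, n))
        coef[row] = (aug[row][n] - tail) / aug[row][row]
    return coef


def ols(design, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical coding: 0 = exposition, 1 = exploration; prior knowledge 1..5.
method = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
prior = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
# Made-up outcome that follows y = 10 - 2*method + 0.5*prior + 1.0*method*prior.
outcome = [10 - 2 * x + 0.5 * w + 1.0 * x * w for x, w in zip(method, prior)]

# Design matrix: intercept, predictor, moderator, and their product.
design = [[1.0, x, w, x * w] for x, w in zip(method, prior)]
b0, b1, b2, b3 = ols(design, outcome)

# Conditional effect of the method at low and high prior knowledge.
effect_low = b1 + b3 * 1
effect_high = b1 + b3 * 5
```

The coefficient of the product term (`b3`) is the moderation effect. In this constructed example the effect of exploration is negative at low prior knowledge and positive at high prior knowledge, which is the pattern an expertise reversal effect would produce.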
5.4. Mediating Effects
In the last section of the results report, the focus is on hypothesis four dealing with mediation analyses. A mediator explains the relationship between an independent and a dependent variable. Thus, a mediation analysis is also an analysis of causal effects. In addition to direct effects, indirect effects via third-party variables are examined. According to Baron and Kenny, four assumptions for the existence of a mediation must be fulfilled: First, between the independent variable and the dependent variable, there is a direct relationship (path c). This path is also called the total effect. Second, the independent variable must correlate with the mediator (path a). Third, the mediator and the dependent variable must be connected (path b). Fourth, in full mediation, the direct path between independent and dependent variable loses its significance (path c’). If the fourth step is not fulfilled, it is referred to as partial mediation [92].
Figure 6 represents the principles of mediation analyses.
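The four paths can also be written as three linear regression equations. The notation is an assumption made for illustration: X is the independent variable, M the mediator, Y the dependent variable, i the intercepts, and e the residuals.

```latex
\begin{align}
Y &= i_1 + c\,X + e_1 \\
M &= i_2 + a\,X + e_2 \\
Y &= i_3 + c'\,X + b\,M + e_3
\end{align}
```

When all three equations are estimated by ordinary least squares on the same sample, the total effect decomposes into the direct and the indirect effect, c = c’ + ab.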
In this study, it has been assumed that latent processes (i.e., presence, flow) while learning in VR can significantly explain the relationships between technology and learning indicators.
The first assumption of Baron and Kenny, namely the direct relationship between technology and the various learning indicators (path c), has already been verified by the MANOVA and the subsequent ANOVAs in Section 5.2. The main effect of technology on the combined dependent variables was significant.
To test the second assumption (path a), the relationships between technology and the learning process variables were investigated. For this purpose, multiple t-tests for independent samples were carried out. These test whether the feeling of presence and flow differs between the levels of technology (HMD-VR vs. DVR). The results of the t-tests are shown in Table 5. A positive mean difference indicates that students in the HMD-VR conditions achieved higher values than students in the DVR conditions.
Regarding the learning processes, statistically highly significant differences could be found between the HMD-VR group and the DVR group. Students in the HMD-VR conditions experienced more flow. This applies to both subscales. Regarding the physical presence subscale, the students of the HMD-VR group felt more presence than those of the DVR group, but not regarding self-presence.
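The following sketch shows how a t statistic for two independent samples can be computed. It assumes the classical pooled-variance (Student) version, which the text does not specify, and the flow ratings are made-up values rather than study data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's t for two independent samples with pooled variance.

    Returns (mean difference a - b, t statistic, degrees of freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff, diff / standard_error, n_a + n_b - 2


# Hypothetical flow ratings (not study data).
hmd_flow = [6, 7, 5, 6, 6]
dvr_flow = [4, 5, 3, 4, 4]

mean_diff, t_value, df = independent_t(hmd_flow, dvr_flow)
```

Because the first group is passed as the HMD-VR group, a positive mean difference means higher values under HMD-VR, matching the sign convention of Table 5.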
To test the third assumption of Baron and Kenny (path b), descriptive correlation analyses were used. Because the data were not normally distributed, the nonparametric Spearman rank correlation was used. The results of the correlation analyses are shown in Table 6. Various statistically significant correlations were found. Thus, the experience of flow seems to be related to increased satisfaction, a greater tendency to recommend, and better performance in the knowledge test. The presence experience also correlates significantly positively with satisfaction and recommendation and, depending on the subscale, positively with the performance in the knowledge test.
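The Spearman rank correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal sketch with made-up ratings (not study data) and hypothetical helper names:

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ratings for five students (not study data).
flow = [2, 4, 4, 5, 7]
satisfaction = [1, 3, 2, 5, 4]
rho = spearman(flow, satisfaction)
```

Because only the rank order enters the calculation, the coefficient does not require normally distributed data and is insensitive to monotone transformations of either variable.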
In the following, the fourth assumption of Baron and Kenny, and thus the mediation hypothesis, is tested with inferential statistics. In the mediation model, technology was included as the independent variable, the two subscales of flow and physical presence as mediator variables, and satisfaction, recommendation, and the knowledge test as dependent variables. The self-presence subscale was excluded due to insufficient correlations with the technology and the learning indicators. Based on the results of the MANOVA and ANOVAs, the newly formed variables for pre-post comparison were also excluded, as no significant associations with the technology could be found. As a result, a total of nine mediation models were examined. Like the moderation analyses, the mediation analyses were carried out using Hayes’ PROCESS macro and linear regressions [86].
For example, a statistically significant mediation was detected for the mediator flow (here: smooth automated flow) and the dependent variable satisfaction. The mediation analysis revealed a total effect of technology on satisfaction (c = −0.58 **): Students in the HMD-VR group were significantly more satisfied than those in the DVR group. Technology also significantly predicted the mediator flow (a = −1.19 ***); students in the HMD-VR condition experienced significantly more flow than students in the DVR condition. Flow, in turn, predicted the satisfaction of the students (b = 0.29 ***): The more flow was experienced, the more satisfied the students were. With the mediator included in the model, path c’, i.e., the direct effect of technology on satisfaction, was no longer significant (c’ = −0.23). Consequently, there is no statistically significant direct link between technology and satisfaction, but an indirect connection that can be explained by flow. The relationship between technology and satisfaction is thus fully mediated by the experience of flow (indirect effect ab for smooth automated flow = −0.38; 95% CI [−0.571, −0.169]). An illustration of the causal relationships is shown in Figure 7. Overall, four of the nine analyses revealed a significant full mediation (2x physical presence, 2x smooth automated flow), but only for the evaluative learning indicators.
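Confidence intervals for an indirect effect are commonly obtained by bootstrapping the product ab. The sketch below assumes that approach with a simple percentile interval. The simulated data, the 0/1 coding of technology, and all function names are hypothetical, so signs and magnitudes are not comparable to the values reported above.

```python
import random


def cov(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def paths(x, m, y):
    """Return (a, b, c, c_prime) from the three least-squares regressions."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx                            # M ~ X
    c = sxy / sxx                            # Y ~ X (total effect)
    det = sxx * smm - sxm ** 2
    c_prime = (sxy * smm - smy * sxm) / det  # Y ~ X + M, coefficient of X
    b = (smy * sxx - sxy * sxm) / det        # Y ~ X + M, coefficient of M
    return a, b, c, c_prime


def bootstrap_ci(x, m, y, draws=1000, seed=1):
    """95% percentile bootstrap interval for the indirect effect a * b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _draw in range(draws):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        a_s, b_s, _c, _cp = paths(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
        )
        estimates.append(a_s * b_s)
    estimates.sort()
    return estimates[int(0.025 * draws)], estimates[int(0.975 * draws) - 1]


# Simulated data: X affects Y only through M (full mediation by construction).
rng = random.Random(0)
x = [i % 2 for i in range(100)]                 # hypothetical 0/1 group coding
m = [2.0 * xi + rng.gauss(0, 0.5) for xi in x]  # mediator depends on X
y = [1.0 * mi + rng.gauss(0, 0.5) for mi in m]  # outcome depends on M only

a, b, c, c_prime = paths(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y)
```

An interval for ab that excludes zero indicates a significant indirect effect. Full mediation additionally requires that the direct effect c’ is not significant once the mediator is in the model.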
Mediating effects through learning process variables were found in some of the analyses. Consequently, the experience of flow and presence within the VR environment can significantly explain the cause-effect relationships between VR technology and parameters of learning, although such effects were found only for the evaluative indicators.
Nonetheless, subhypothesis a can be supported. Students experienced more physical presence when using an HMD than when using a desktop, meaning that the HMD group tended to feel more spatially present in the virtual hiding place. HMDs largely block out external stimuli from reality, whereas on the desktop the real environment (e.g., the classroom) is still present. This is consistent with a study by Makransky and colleagues [64]. In addition, in this study, experiencing presence was accompanied by higher satisfaction and a higher likelihood of recommending the application. Again, no mediation effects could be uncovered for other learning indicators. In contrast, Makransky et al. found negative effects of increased presence on learning in their study.
Moreover, subhypothesis b can be accepted. The cause-and-effect relationships are identical to those reported for presence. Significantly more flow was experienced in the HMD-VR conditions than in the DVR conditions. When more flow was experienced, the application was rated significantly better.
It can be concluded that some learning process variables can significantly explain the causal relationships between the VR technology used and indicators of learning. These mediation effects are particularly relevant because presumed significant direct effects of technology on indicators of learning lose their significance when latent learning processes are also considered. Studies that fail to control for such learning processes occurring in VR misinterpret significant results between two variables as causal, when in fact other variables, namely latent learning processes, are responsible for or can systematically explain this relationship. Therefore, the results of the current study should encourage the inclusion of learning process variables such as flow and presence to avoid confounding.