Can you believe it? Interactive narrative in VR documentary

In the current participatory media culture, the development of digital media ecologies has inevitably changed how people experience their surroundings. Digital devices have become extensions of the human body, working as a medium to perceive and replicate reality (Yoh, 2001). Jaron Lanier is widely credited with coining the term “virtual reality” and pioneered its early development. Now, people can wear a virtual reality headset and be ‘present’ at a scene or event from hundreds of years ago or thousands of miles away. For the film industry, such interactive technologies can revolutionise the viewing experience, making the audience feel synchronised with the cinematic space and, through the sense of immediacy, forget the real world. In other words, VR cinema creates an illusion that allows users to enter the space they used to watch on screen (Nash, 2018). The concept of reality in this space is therefore subverted. For the documentary, this transformation has happened in two fundamental areas: ‘witness’ and ‘narrative’.

In traditional documentaries, objectivity lies in the authentic factual content constructed by filmmakers, which is ‘projecting the lived world’ (Winston, 2019). This is central to making the audience believe that the events on screen are happening or have happened in the real world. Witnesses are, therefore, the glue that binds the story to the lived world (Winston, 2008). In contrast to the traditional screen, virtual reality documentaries offer an all-encompassing live reenactment of witnessing, which provides an immediate sense of presence as the non-fictional story unfolds. However, such 360-degree immersion means the absence of several cinematic techniques, such as shot type, camera movement and montage. For example, scene transitions in traditional documentaries can be achieved by editing at moments that punctuate the viewer’s attention, but the ‘see-it-all’ view in VR documentaries makes it nearly impossible to jump into a new scene without the viewer noticing.

Therefore, the question that must be addressed is how to switch between panoramic spaces while keeping the narrative coherent, so that the audience’s sense of presence in the virtual non-fictional space is not diminished. How real stories are narrated in virtual reality is essentially a matter of sustaining the continuity of the viewer’s experience of virtual space.

‘Experience is a cover-all term for the various modes through which a person knows and constructs reality. These modes range from the more direct and passive sense of smell, taste and touch to active visual perception and the indirect mode of symbolisation’ (Tuan, 1977, p. 8).

In 1978, a hypermedia experiment called Aspen MovieMap was conducted in Aspen, Colorado, USA. It combined panoramic and moving stop-motion photography, simultaneous recording systems, computers with laserdisc players and a touch screen playback system to create what is arguably the first VR world (Lippman and Mohl, 1978). The project was a great inspiration for compensating for the absence of an illusionary presence in the narrative space of the documentary.

First, the project used real-life footage and 3D technology to reconstruct the entire VR environment. This process replaced the traditional documentary filmmaker’s subjective tendencies and spatial perceptions in controlling the filming. Second, the audience interacted with the virtual environment through tactile controllers to freely explore their journey on the map. This interaction allowed viewers to intervene in the space not by passively watching the screen but by actively touching and moving with their bodies. Their status shifted from spectators to participants within the main threads of the film’s narrative.

The audience using the Aspen MovieMap (Computer History Museum, 2020)

Although Aspen MovieMap provided an inspiring start for interactive storytelling in VR documentary, current work is still limited by technology when it comes to letting the audience ‘control’ the narrative. For example, Border: A VR Ride (Conway, 2018), directed by Abigail Conway, and The Waiting Room VR (Mapplebeck, 2019), directed by Victoria Mapplebeck, employed panoramic camera technology to create a 3D virtual environment. This allows a viewer to change their position and perspective in observing an actual story, but it does not provide an interactive journey in which the audience can explore the narrative development by themselves. The viewer is therefore prevented from actively intervening in the narrative, which creates a gap in the coherence of the audience’s spatial presence.

Mark Stephen Meadows identifies four progressive processes of interaction. After the phases of observation and exploration, audience engagement in imagining the narration and the deep interactive experience are the two key components that help viewers immerse themselves further in the narrative space (Meadows, 2002). Therefore, enabling VR documentary viewers to engage actively in the narrative is one of the trickiest but most significant challenges in maintaining the immersive sense of presence in the virtual space of a non-fictional story.

‘Presence is defined as the subjective experience of being in one place or environment, even when one is physically situated in another. As described by teleoperators, presence is the sensation of being at the remote worksite rather than at the operator’s control station’ (Witmer and Singer, 1998, p. 225).

Witmer and Singer argue that a sense of presence is a psychological state in which a person projects their body into a mental space. Embodied physical behaviour can influence the feeling of involvement in a virtual locale (Witmer and Singer, 1998). Eyal (2014) uses the Hook Model to explain how interactive visual production can strengthen users’ participation and get them deeply involved. This model comprises four steps: trigger, action, variable reward and investment. Following the Hook Model, audience engagement in a VR documentary can be triggered by the motivation to actively reveal the truth, like a detective. When a player performs a physical behaviour that links to the film’s developing narrative, they obtain a variable reward and an instant response from the virtual space, which reveals some clues to the factual story.

The subject in the VR documentary can react to the audience’s actions in the physical space, which turns the camera into the eyes of the audience’s avatar within the narrative space. Embodied action can lead the narrative and expand fragmented narrative lines, allowing the audience to deepen their involvement in this interactive watching experience and detach themselves from the physical space. During this process, engagement in the narrative enables the player to shift attention to the stimuli of the environment, thereby enhancing the sense of presence.

The Hook Model (Eyal, 2014)

Dream Walk-on is an in-development VR documentary project filmed with Insta360 and made with Unity. The documentary follows three girls from low-income families chasing their dreams of becoming movie stars. The film contains numerous scenes and narrative branches. The audience’s physical interactions in different locations trigger corresponding narrative branches, allowing them to see various aspects of the subjects’ daily lives. The project explores whether the audience’s embodied behaviour affects their overall sense of presence in the interactive narration of a non-fictional story. All interactions designed for the project are based on real-life behavioural experiences, for example, shaking hands, waving to somebody and picking up items. During the postproduction stage, a Leap Motion is attached to the HTC VIVE to capture users’ hand gestures, and Unity serves as the engine that gathers these responses and triggers the next piece of narration in the virtual environment.

Concept drawing

According to Mauss’s (1973) repetition theory of the interrelation between body and technology, people carry basic behaviours, such as waving, picking up and clapping, into new technological environments through repetition, interaction and imitation of existing embodied experiences in a specific context. In the interaction between body and environment, the body is the first sensor to receive a response from the space and create a sense of presence. It is the subject of perception and constantly expands that perception outwards. Thus, the audience intervenes in the virtual space as a narrative participant from the beginning of the factual story. As the subjective participant leads the narration, the audience can recreate their embodied experience of the lived space through their actions in the interaction. The virtual space answers their participation with a sense of presence, thereby enhancing the experience of the interactive narration.

Take a specific scenario as an example. When the viewers arrive at the train station, they are required to wave to the crowd. After the gesture is recognised, the viewers meet one of the subjects for the first time on the train platform. Subsequently, the audience can choose to pick up the subject’s luggage or shake hands with the subject. The interview footage that follows differs according to the choice made, so the viewers receive different background information in each narrative branch of the interview.
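The train station scenario can be read as a small branching graph in which each recognised gesture moves the viewer to the next scene. The following Python sketch is illustrative only: the project itself is built in Unity, and the scene names, gesture labels and the `advance` and `play` helpers are hypothetical, not taken from the production.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    name: str
    # Maps a recognised gesture label to the name of the next scene.
    branches: dict = field(default_factory=dict)


# Hypothetical scene and gesture names modelled on the train station scenario.
SCENES = {
    "station_arrival": Scene("station_arrival", {"wave": "meet_subject"}),
    "meet_subject": Scene(
        "meet_subject",
        {
            "pick_up_luggage": "interview_luggage",
            "shake_hands": "interview_handshake",
        },
    ),
    "interview_luggage": Scene("interview_luggage"),
    "interview_handshake": Scene("interview_handshake"),
}


def advance(current: str, gesture: str) -> str:
    """Return the next scene, or stay put if the gesture opens no branch here."""
    return SCENES[current].branches.get(gesture, current)


def play(start: str, gestures: list) -> list:
    """Follow a sequence of gestures and return the list of scenes visited."""
    path = [start]
    for gesture in gestures:
        nxt = advance(path[-1], gesture)
        if nxt != path[-1]:
            path.append(nxt)
    return path
```

Under these assumptions, `play("station_arrival", ["wave", "shake_hands"])` visits the arrival, the meeting and the handshake interview, while a gesture that has no branch in the current scene leaves the viewer where they are.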

Screenshot of the Dream Walk-on Project

In this VR documentary project, the physical interaction of the embodied experience not only restores the prior understanding of the actual space but also curbs the sense of detachment caused by cinematic cutting between environments. In modern documentaries, spatial scenes often change as the real story is followed over long periods. Documentary editing therefore ensures that the grouping of shots is harmonious and rhythmic, building the narrative of the overall story and shaping its development in terms of theme, drama, psychology and time.

Cutting on movement can be as simple as editing between two shots within a scene where there is a slight hand or body shift at the point of the cut so that the viewer sees the movement and becomes less aware of the edit itself (Mitry, 2000, p. 174).

By matching graphics or movements in the film, editing lets the audience’s eyes naturally follow the focus from shot to shot and sequence to sequence. In a VR documentary, the panoramic visual space can draw the audience away from the focal point of attention, which prevents them from focusing on the dynamic editing moments in the narrative. However, when the physical actions of the audience become a part of the narrative progression, that movement can work as the focus of the scene edit, thus maintaining narrative coherence and the sense of presence. The viewers tend to overlook the scene transitions in the interactive moments because their attention is centred on the action they have performed in the space.

In Dream Walk-on, as the subject’s story progresses to the point where they are ready to leave for home, the audience accompanies them out of the office building and onto the street, where they can interact with the subjects. They can choose to take a taxi by waving one hand, take a bus by waving both hands or walk home by simulating steps with both hands. The three journeys were filmed separately at different times and then edited together. The subject expresses different feelings and perceptions of work and life on each journey. Thus, the audience’s gestures determine which path the narrative follows and help maintain continuity across the scene jumps. In an immersive space with a panoramic view, such interactive gestures can hold the audience’s attention, directing them to follow the development of the narration and enhancing credibility in factual storytelling. This embodied physical interaction with the virtual environment can therefore give the audience a strong sense of presence during sudden changes of scene.
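The mapping from gesture to journey described here can be sketched as a simple decision rule. This is an assumption-laden illustration in Python rather than the project’s Unity code: `HandState` and `choose_journey` are hypothetical names, and each hand is reduced to two flags that a hand-tracking layer is assumed to supply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HandState:
    # Simplified, hypothetical flags assumed to come from a hand-tracking layer.
    waving: bool = False
    stepping: bool = False


def choose_journey(left: HandState, right: HandState) -> Optional[str]:
    """Pick the journey home from the state of both hands, or None if unclear."""
    if left.stepping and right.stepping:
        return "walk"  # both hands simulate stepping
    if left.waving and right.waving:
        return "bus"  # both hands wave
    if left.waving != right.waving:
        return "taxi"  # exactly one hand waves
    return None  # no recognised gesture, so the scene does not change
```

Returning `None` when no gesture is recognised keeps the viewer in the current scene, so the cut to the next journey only happens at the moment of the viewer’s own movement.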

Screenshot of the Dream Walk-on Project

This project suggests that physical bodies and mediated bodies jointly shape immersion in the narrative of a VR documentary. When viewers can actively influence the story’s progression through their physical actions in the virtual space, their status as witnesses in that space can be enhanced. Their behaviours and feedback in the virtual environment correspond to their embodied experiences, shifting them from a ‘partial disembodiment’ perspective to an ‘embodied subjective’ perspective.

This result runs counter to Don Ihde’s binary view of body and technology (Ihde, 2012). Although ‘the thinness of the virtual body makes it impossible for it to reach the thickness of the physical body’ (Ihde, 2012), when the virtual mediated space is close to the physically experienced space, the body within the interactive narrative can grow into an emerging auteur, helping to establish a new audiovisual language for the medium of VR. The behavioural feedback and the spatial cognition of the natural area in the virtual environment of a VR documentary move the phenomenology of embodiment and disembodiment from opposition to unity.


Computer History Museum. (2010). History Google Maps Aspen Movie Map Experimenting. California: Computer History Museum.

Conway, A. (Director). (2018). Border: A VR Ride. [Interactive Film]. London: East City Films.

Eyal, N. (2014). Hooked: How to build habit-forming products. Penguin.

Ihde, D. (2012). Technics and praxis: A philosophy of technology (Vol. 24). London: Springer Science & Business Media.

Lippman, A. and Mohl, R. (1978). The Aspen movie map. Cambridge: MIT ARPA.

Mapplebeck, V. (Director). (2019). The Waiting Room VR. [Interactive Film]. London: East City Films.

Mauss, M. (1973). Techniques of the Body. Economy and society, 2(1), pp.70-88.

Meadows, M.S. (2002). Pause & effect: The art of interactive narrative. Pearson Education.

Mitry, J. (2000). The Aesthetics and Psychology of the Cinema (trans. King, C.). Indiana University Press, USA. pp.168-174.

Nash, K. (2018). Virtual reality witness: exploring the ethics of mediated presence. Studies in documentary film, 12(2), pp.119-131.

Tuan, Y.F. (1977). Space and place: The perspective of experience. University of Minnesota Press.

Winston, B. (2008). Claiming the real II: Documentary: Grierson and beyond (pp. 1-336). BFI.

Witmer, B.G. and Singer, M.J. (1998). Measuring presence in virtual environments: A presence questionnaire. Presence, 7(3), pp.225-240.

Yoh, M.S. (2001). The reality of virtual reality. In Proceedings Seventh International Conference on Virtual Systems and Multimedia (pp. 666-674). IEEE.

Ruohan Tang is a PhD Candidate in Film at the University of Southampton, sponsored by the China Scholarship Council (CSC) Arts Talent Program. He worked as International Documentary Development Specialist at UK-China Film Collab (2020-2021) and as Training Officer at The Media, Communication and Cultural Studies Association PGN (2020-2021). His PhD research focuses on the participatory documentary in China. His research interests include media technology, interactive media, VR documentary and technicalisation in the documentary industry.
