Sound

sound (narration) The audible aspect of narration in a film. Sound is the element of realist film making that can appear least crafted and controlled, while actually needing to be diligently created and designed to match the characters’ perception of sound, and in doing this it directs the audience’s understanding and experience of the story. Sound in film is narrative in three main ways. First, it offers sound as the characters in the film would perceive it, so focalising the story to the characters: the sound offers a close viewpoint and participation in the story for the audience, although, alternatively, sound can be used to distance the audience from the characters and the action in the narrative. Second, sound articulates external action: what is happening visibly on screen, and often off screen, in the events and the environment of the story. Third, and crucially, sound indicates what is happening internally to characters, with a character’s experiences and emotions represented through music and through subjective sound. This is the precise focalisation of sound: the sound heard and felt by a character at specific points in the narrative.

The control of sound in film is necessary for realism and this necessity can be evidenced simply by setting up a mobile phone with a recording app on a table in a room and then speaking when recording and walking around the space while the phone remains static. What will be recorded and heard on playback is the actual sound of the interior room with rustles and knocks from movement, from clothing, from footsteps, from knocking on objects in passing, and there is the noise in the room from air conditioning fans, heaters or other devices, computers, machinery and also external sounds outside the room, corridors, traffic, people passing. By moving around while speaking the recording of the voice will vary between being thin and distant, with an immediate echoing, a resonance from the voice being reflected from the walls of the room, and when the speech is closer to the microphone the voice will be louder and fuller, but the background sounds, the noises in the room will still be clearly audible.

The sound of the room heard in playback and the sound as it was perceived while walking in the space during the recording are very different. Sound recorded in this way, with a static microphone placed without reference to the speaker, could not be used as the soundtrack for a film, unless it was for a scene intended to show that a recording made in this way is discordant, disruptive and confused. The recording does not filter the sound, it simply records the sound in the environment. In contrast, a person constantly filters sounds, giving preference to what they wish to give their attention to: in a room with many different sound sources a person will listen to and hear the person speaking over other sounds. This will be their conscious attention, and other sounds are backgrounded. The sounds in the environment are present in perception but given little conscious attention. It is the human perception of sound that narrative film mimics: it does not reproduce sound as it actually exists.

To be specific, what sound recording for narrative film aims to achieve can be discussed with reference to the three areas of sound: voice, sounds, and music.

The reproduction of voice in a film is dominated by the voice being recorded so that it is distinct and close, as though the person watching the film is hearing the voice as the characters in the film do: a selective perception of voice. As an example, in a scene with the camera outside a car, with a viewpoint through the windscreen showing two characters talking inside their car, it is more likely that the voices will be heard from the sound perspective of within the car, without the exterior sounds of the passing traffic and other noises. This use of sound might be expected to seem false and counter-intuitive, but this is not the case. The sound in the film is not what would be heard in life, because the film is a narration of a story and this narration is made by following the voices, by the speech being audible. When characters in a film are dancing in crowded rooms, fighting battles, riding vehicles, in storms, or working with machinery, the voice will be audible unless the narration needs to indicate that something is not heard by one character, or the narration restricts the audience’s knowledge, in which case the audience does not have access to what is said.

This clarity given to the sound of the voice in film narration is established with the training of actors to speak with clarity, precision and pace, which is not the realism of everyday speech. If one records a stage play set in a cafe with the scene performed in a theatre and then one records a conversation between people in an actual cafe, the delivery of the dialogue is not the same. The dramatic voice-trained delivery is carefully paced, and it has a rhythm, a patterning that makes the meaning coherent, while the actual cafe conversation will stop, start without pattern, it will have little coherent flow, it will change volume, be unclear: this is not simply in the words spoken, but in the sound, the use of speech in actuality is not performed for an audience.

An actor’s voice when speaking a part will have an accent, mannerisms, and emotional emphasis, which is sometimes taken to be what acting is, but the trained actor’s voice will also have a fluency, clarity and pace that are often unnoticed by the audience. A dialogue scene will usually have no gaps: there won’t be significant pauses between one actor speaking and the next, certainly not pauses as in actuality, and this faster, more fluent and controlled speech will be perceived by the audience as natural dialogue, because it is coherent in narrating the story. Neophyte directors and actors sometimes add long pauses and slow the pace for effect, and on set, when filming, this registers as giving the dialogue dramatic emphasis, but in the film on screen this pausing will be perceived as a slow and ponderous delivery. The pace of the film is set by the flow of narrative events, and even when a dialogue scene in a film is felt to be taking place in real time, as it would take place in actuality, this is not the case. A pause in speech is a gap, and if this pause doesn’t have a narrative purpose, to present and add to the narrative, then it will be just a gap, a stop in the story.

Another aspect of voice and sound that is controlled by the actors is that they do not overlap their speeches unless this is specifically directed for the scene. Having no overlap means that each speech is separate and clear, and so the words, their expression and intonation are clearly narrated. This will also be the case for movement interfering with sound: dialogue will not be broken by the sounds of movement, especially specific sounds, except when this is required for dramatic effect. In a scene where characters are drinking from crockery cups the recordings for film narration will not reproduce spoons clinking or cups being put down, only the actors’ voices, with the cups and spoons making no sound or only a minimal sound: the sound of the cups will not interfere with the dialogue unless this is done for specific dramatic effect, as part of the narration.

There will be physical actions in a film scene which for realism will be reproduced as audible sound, even loud sound, and there can still be dialogue in this scene, but how this is reproduced in the film as narration in sound will be carefully considered: keeping the dialogue clear, or letting the dialogue be confused. On set the standard practice is to avoid staging action with dialogue when loud sounds are occurring. The continuity will be for the door to be closed and then the character speaks, or for the character to speak and then start the car engine: speech and physical actions that have specific, disruptive sounds will not happen simultaneously.

The training of actors is to develop the professional habit of never stepping on another actor’s line unless directed to do so, and of not speaking when making movements or doing actions that will render their speech inaudible. One observation made of actors in real life is that they can sometimes seem mannered or over-theatrical, and this is because their voice and movement have been trained and developed over many years, so that they are marked by their performance skills: they speak precisely and clearly, they pace their speech and they speak only when another person has finished speaking. Voice in acting and drama is not voice in actuality.

Achieving what is required for the voice in film narration means that considerable effort and expense are put into sound recording during a film’s production. Sound stages, often built in the countryside rather than in a noisy city, are purpose-built spaces for filming, so that no extraneous noises from outside carry onto the recording, and they are designed and maintained as silent spaces: there is no internal noise from fans, water pipes or other machinery. Then, to ensure that voice is recorded well, there will be close microphone work during filming, with one or two boom mics closely following the actors to record their voices.

If the framing of a shot shows one actor on camera and another off camera and only one boom mic is used, then the voice of the actor on camera will be recorded and the sound of the off-camera actor will be left as a quieter, off-microphone, unusable recording. If there are two booms then one boom will record sound for the on-camera actor and the second for the actor off camera, each microphone closely recording just one actor. In production, the film or digital camera used for filming may be close to the actors in the scene, but the camera will be silent, either because it is digital with any fan cooling turned off during recording or because it is a film camera that is blimped: the camera is kept in a housing that prevents the sound of its mechanical operation being heard. When a shot is being filmed on a studio set the only sound that should be heard is the actors’ voices: nothing else. All other sound will be added later: sound FX for specific actions and atmos for the setting will all be added in post-production.

The boom mic is a microphone on a pole that can be extended or shortened, with the microphone directed and then held so that the boom operator can place it close enough for a recording of the voice of the actor, and there will be a sound recordist to set the sound recording levels. This equipment ensures that the voice is carefully and closely recorded. Microphones have different recording characteristics, with cardioid microphones recording sound in front of the microphone in a heart-like shape, the cardioid, like a bubble in front of the microphone that is about half a metre in diameter. The boom operator will place the microphone out of shot, but close enough to the actor so that the actor’s head and mouth will be within this half metre distance. The framing of shots places the actor’s head close to the top of the frame so that the sound boom can be above the framing of the shot, or alternatively the boom can be placed from below: at chest level for an actor in close up. If the headroom above the actor and the shot size don’t allow for the microphone boom to be above or below, then the microphone can be at the frame edge, to the side, just out of shot, but with the actor’s voice still within the cardioid shape of the microphone’s characteristic. Rifle mics have a narrower recording characteristic, allowing them to be used at a greater distance from the subject than a cardioid microphone, and these can be used if there are external noises nearby that need to be excluded from the recording: clothes rustling, footsteps while walking.
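The directional characteristic described above can be illustrated numerically. The following is a minimal sketch in Python, assuming the idealised textbook cardioid pattern, in which relative sensitivity is 0.5 × (1 + cos θ) for a sound arriving at angle θ from the front of the microphone. The function names are illustrative only, and a real microphone’s pattern varies with frequency and with the model used, so the figures indicate the shape of the pattern rather than the behaviour of any particular microphone.

```python
import math

def cardioid_sensitivity(angle_degrees: float) -> float:
    """Relative sensitivity (0 to 1) of an idealised cardioid microphone.

    0 degrees is directly in front of the microphone (on-axis);
    180 degrees is directly behind it.
    """
    theta = math.radians(angle_degrees)
    return 0.5 * (1.0 + math.cos(theta))

def sensitivity_db(angle_degrees: float) -> float:
    """Level relative to the on-axis level, in decibels (negative = quieter)."""
    s = cardioid_sensitivity(angle_degrees)
    if s <= 0.0:
        # The idealised pattern rejects sound from directly behind completely.
        return float("-inf")
    return 20.0 * math.log10(s)

# Print the pattern at a few angles around the microphone.
for angle in (0, 45, 90, 135, 180):
    print(f"{angle:3d} deg: {cardioid_sensitivity(angle):.2f} "
          f"({sensitivity_db(angle):.1f} dB)")
```

In this idealised model a voice directly in front of the microphone is picked up at full level, sound arriving from the side is picked up at half sensitivity, and sound from directly behind is rejected, which is why the boom operator keeps the microphone pointed at the actor’s mouth.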

Filming on a sound stage will be planned, designed and controlled to record voice, and these are the optimum conditions for sound recording for film during production. In this situation there can be multiple booms, with microphones that record to multiple audio channels via an on-set mixer so that each voice is recorded separately and this sound can then be mixed later. If necessary, on set, radio microphones can be used to give close microphone work for all the actors in a multi-person scene. In the process of filming for drama, lights, camera and sound should be thought of as equal parts: each requires a skill and standard of work to be suitable for a continuity narrative film. What should be achieved on a sound stage is extremely well recorded voice and nothing else in the filming of a shot. All other sounds can be added later.

When filming on location the sound recording will try to reproduce the quality of a film studio space, reducing extraneous noise, using close microphone work and avoiding the noises at the location. All of this is difficult to achieve in an actual location, so there are two choices for recording sound, which are primarily based on budget and the two different production methods this enables. The first is the post-production process of looping, dubbing, or ADR (automatic dialogue replacement), where in a separate sound recording session the actors repeat their speeches from the scene in a sound studio, and this new sound is synced with the shots that were filmed previously. Here poor sound recorded on location can be replaced with ADR, and then there will be a sound recording for the shot which has just the voice.

The second applies to a low budget production that has no access to ADR, where the production sound made during filming has to be the basis for the finished film. Locations then have to be chosen where there is a chance of clear sound recording of voice, and/or the filming is scheduled for when the location is quietest, and in the other shots edited into the scene the sources of sound at the location are excluded from the framing. So, a scene in an airport terminal might show the main character in the busy terminal with lots of noise, but when they stop to talk to someone on their phone this is filmed when the terminal is empty but has the same light, or the filming takes place at another location with a background that can be matched to the terminal: the character stops to talk in front of a wall at the airport, and another location has a matching wall, but is a quiet space that is suitable for recording the sound of speech.

If a noisy location must be used for filming a scene then the shots with dialogue will be closely framed, so that a boom mic can be very close to the speaker, 10 to 20 cm away. The aim here is to ensure that the signal-to-noise ratio of the recording, the sound level of the voice relative to the background, will be such that the voice is dominant and the background sounds are recorded at a very low level, so that the soundtrack with voice and background noise will be suitable for the finished film. In these circumstances, when the production sound has to be recorded at a noisy location, filming has to be simplified, with a single shot for the dialogue scene, or a close up for each character. If this is not done, and a range of different angles is used, then the sound levels on each shot will be different and the background levels will jump when the shots are edited together. Also, at the location a wild track will be recorded, a separate recording of the sound of the noisy location when there is no dialogue, so that this can be used to patch sound between shots, giving a sense in the narration of the film of a continuous scene at a single location. This is not the ideal practice for sound recording and is to be avoided where possible.
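The benefit of placing the microphone so close can be shown with a rough calculation. The following is a minimal sketch in Python under two simplifying assumptions that are not stated in the text: the voice level falls with distance according to the free-field inverse-distance law (about 6 dB for every doubling of distance), and the background noise reaches the microphone at the same level wherever the microphone is placed. The decibel values and function names are hypothetical, chosen only for illustration, and at 10 to 20 cm a real voice does not behave as an ideal point source, so the results are indicative only.

```python
import math

def voice_level_db(level_at_1m_db: float, distance_m: float) -> float:
    """Voice level at the microphone, assuming inverse-distance falloff."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return level_at_1m_db - 20.0 * math.log10(distance_m)

def snr_db(voice_at_1m_db: float, background_db: float, distance_m: float) -> float:
    """Signal-to-noise ratio in dB: voice level minus background level."""
    return voice_level_db(voice_at_1m_db, distance_m) - background_db

# Hypothetical figures for a noisy location (assumptions, not measurements).
VOICE_AT_1M_DB = 65.0   # speaking voice measured one metre from the mouth
BACKGROUND_DB = 60.0    # steady background noise at the microphone

for distance_m in (1.0, 0.5, 0.2, 0.1):
    ratio = snr_db(VOICE_AT_1M_DB, BACKGROUND_DB, distance_m)
    print(f"mic at {distance_m * 100:5.0f} cm: SNR {ratio:5.1f} dB")
```

Under these assumptions each halving of the microphone distance raises the voice by about 6 dB against an unchanged background, which is why a closely framed shot with the boom 10 to 20 cm from the speaker can yield usable dialogue where a wider shot would not.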

The building of sound stages, the use of film sets over location filming, the control of location, the choices of shots, the careful placing of microphones, the use of ADR, the long term training and skills of actors, the direction of actors, the professional practices on set, are all in place so that voice, and just the voice, is recorded, and this is done so that in the narration of the story, what is said is not just clear, but the recording will be the voice as the characters will perceive it and this is how the audience hears the dialogue. This sound recording can be so close that breathing is heard and in a taut drama, such as a horror film, or a tense emotional, psychological drama, then this breathing, the interiority it presents, will give as much drama to the scene as music or camera angle or physical performance. Hearing the voice means that how it is said, the meaning in the sound of the voice is carried to the audience.

When voice is well recorded during production or through ADR this makes possible and supports editing, it allows for continuity editing, and for dramatic editing. This is because with no sound overlap in speeches then any shot can be chosen and used in the editing and the scene can be cut for dramatic purposes: showing the person speaking when this is best for the scene, showing the person listening, showing the person reacting. If a scene is poorly recorded for sound and there is no ADR then editing will need to use shots where the sound can be matched. If actors talked over each other during speeches then this will severely limit the possibilities for editing. The control of the sound in filming is not just for audibility it is directorial control for the drama of a scene: a well framed shot is effectively redundant if the sound is unusable.

Sounds are the second major element of sound in film narration, and sounds in this sense means specific sound effects (SFX), matching or indicating specific things, and atmospheres (atmos), a more general sense of the sounds in the environment of the scene. Both of these support realism, but they are also expressive. The tapping of a finger on a table can be just a background sound, effectively unheard by the audience, or the tapping can indicate a tension in the story action of the character, or even become a tempo, a music-like element of the scene. These are directorial choices about which sound aspects of the scene will be given prominence, and often how the characters will perceive the sound is a key factor: the character hears the creaking of a chair, because they are waiting for a reaction from the person sitting in the chair.

SFX are used to increase realism, so actions will have enhanced sound, sound that is more specific and greater than in actual life. Here foley sound, the SFX recorded separately from production sound, will be used to ensure that the sound is realistic: a door locking will have a sound to indicate a door locking, the rev of a car will have a growl or a purr. In poor sound recording during the production there will be a bustle of sound, different sounds with little control over their volume levels or perspective, but SFX will be layered into the scene and support the story action rather than confuse and undermine it. Atmos can be the sound of the environment, by a river, next to a road, in the city, but sound will now also extend to musical form: notes or tones, non-instrumental sounds, are played in a scene to create a mood, an emotion. The attention of the audience in a scene will be on the visual action, on following the narration, but a tone in the scene will indicate how the scene is meant to be read: it’s a strong moment, it’s uncertain, there is hope, there is confusion. The concept of naturalistic sound is only one type of realist sound. Atmos can change in a scene, grow, diminish, end at a particular point: orchestrating the reading of the scene and the emotion of the drama.

In contemporary sound, there is multi-channel sound, with sound during screenings moving from mono (single speaker) and stereo (two-channel) sound to 5.1 and 7.1 sound. Multiple channel sound allows for the sound to be placed and for the listening experience to be spatial: as in life, sound comes from different places, attracting attention, shifting, giving verisimilitude to sound. Voice will come from the speakers behind the cinema screen so that a character’s voice is connected to their image, unless a character is off screen and calling out, in which case the voice will be placed on a speaker to the side of or behind the audience. This surround sound creates a more dynamic and immersive experience than earlier mono or stereo sound, giving the sound an even greater importance in the narration and drama of the story.

There are occasions when sound is deliberately and significantly shifted, so that silence and only a limited choice of sound FX are used. Atmos is lost, and the sound is reduced to specific objects being foregrounded. This is done to present the perceptual experience of a character in a film: in the moments before a car crash the sound drops and all that remains is the sound of the engines screeching trying to stop. This heightened sound reproduces the change that people experience at moments of shock and extreme stress: attention becomes focused and general noise drops. This attention in sound offers emotions: fear, panic, violence, jealousy, hatred, desire, the representation of a character’s interiority. Often silence on a shot, and on the character reacting will introduce this change to psychological sound and then what the character sees is shown with heightened attention of limited sound.

Sound FX can also control a sense of time period and setting, because any anachronistic sounds recorded during filming are discarded and SFX are used with a sense of period: music playing through a loudspeaker in a 1940s scene will sound different to music heard through contemporary headphones. The way that a room is dressed changes how a sound is carried: hard concrete floors, creaky wooden boards, soft carpet, vinyl tiles. Poor sound in a film, the wrong SFX, can undermine or even completely destroy the realism of the narrative, effectively suspending the story while the audience try to work out what is wrong. Image will convey story, setting and period, but the sound is what coheres and reinforces the realism and the drama. Where sound once had only the production roles of sound engineers operating the sound recording equipment and boom operators controlling microphones, there is now a sound designer as an essential part of narrative film production, and they design sound not just for setting, period and audibility, but for story.

Music is the third element of sound, and because music is social and cultural it sets a place and period. Music from the time can be used, or music can be created, composed and played to suggest the historical setting of the narrative. Stories in alternative worlds, dystopian worlds and future worlds need a soundtrack to present this. Music represents the setting and it enhances drama, not just by having exciting music on an exciting scene and sober music on a sad scene, but by being specific to what the audience feel about the characters in the story. A character or a group in the story will often have a musical theme or motif and this will construct the tone of the story and the emotions of the characters. Often scenes that begin with no speech will set a musical tone by showing the characters with music playing: there may seem to be no story action at this point in the film, but actually the story is progressing because what the character is feeling at that moment in the narrative is being depicted. Music can be, and often is, precisely composed and choreographed for the edited film, and without this score much of the sense of character and events would be reduced or lost.

The practice, starting in the late sixties, of using pre-recorded popular music tracks may have led to a sense that a musical track adds a flavour to the action, setting the period and a style, but found music, that which pre-exists rather than being composed for the film, can be well used or badly used. To be well used it needs to connect to the specific characters and actions in the story. In a music video, which presents little plotting and limited narrative, there will be a song that is enhanced by the visuals, giving imagery to the music, but the narrative progression, the story, will be minimal, and the sense of specific characters and plotted story is not presented in the music: the song plays and the images continue. There are significant differences between the form of the music video and narrative film form using music. In some films the music dominates and defines the drama, making the narrative an operatic experience: the speech and actions of the story will be realistic, but the emotion of the action is set by the music.

In film theory sound is identified as diegetic, sound that is within the story world, part of the narrative environment, or non-diegetic, outside of the story world. The music score is non-diegetic. This seems to separate music from storytelling, music from narration, and music from characters. Music is key to the narration of film narrative, so placing it theoretically outside of the story is somewhat confusing for an understanding of storytelling in film.

The narrative use of sound in a film can seem less significant than it is, perhaps because the moving image seems to be predominant and because film screenplays and film plots don’t appear to plan for or rely on sound for their narration. When watching a film, voice seems natural, sound FX seem real, music seems to match the story action, so the specifics of sound aren’t considered in any detail. This is a situation where the successful use of the sound is simply accepted by the audience. For the film makers sound needs to be controlled, designed and directed, and it is a key facet in the narration: the action, place, time, the emotion, the experience and the understanding of the story.