Hannah Wendling, Shiv Khanna, and Jordan Berke
View from center stage
View from back-right of room
Looking up into ceiling from stage
Wooden Diffusers (view from side/front)
Wooden Diffusers (view looking into bottom)
Barnum 008 is located in the lower level of Barnum Hall. The space is generally used as a lecture hall and a performance hall for spoken word and choral performances. Typical shows consist of 4-10 performers on the stage at a time. Only choral performances use lapel microphones to boost volume. The room is relatively well designed: it has no audible echo and is dry enough that a speaker hears only themselves rather than background noise and side conversations. It has a capacity of ~250 people and 210 audience seats. The room is set up in a fan shape, with a shallow stage and walls angled out toward the audience. From the 12” tall stage, the audience rises 11 levels to a height of 7.5 ft.
Since the room is primarily used for spoken word performances, it needs more absorption than a standard performance space. This prevents the audience from hearing echoes and gives the space higher clarity. It is accomplished by covering the majority of the room in absorptive material. The carpeted floor beneath upholstered chairs ensures that sound is absorbed soon after it is generated rather than persisting longer than desirable. Additionally, the ceiling of the room is relatively acoustically transparent, with an estimated 2-3 ft of airspace above. This feature contributes heavily to the room's overall absorption.
A series of angled wooden slats is attached to the left and right side walls of the room. The slats are set at a 15 degree angle and extend 13.5 inches outward from the side walls. The flat face of each slat faces the stage, so specular reflections off this surface return toward the source location. This helps with spoken word and choral performances, since performers on stage, such as singers in a choir, can hear those around them. In addition, the changing profile of the side wall acts as a sound diffuser: reflections from between the slats are scattered around the room, increasing the amount of sound energy that reaches the back row. This should increase clarity for a listener seated in the back of the space.
Top View
Cross-Sectional View
The table (shown to the left) lists the material of each surface in Barnum 008. It shows that much of the room's surface area is highly absorptive, which contributes to the short reverberation times calculated.
Impulse Response Measurements
The impulse response used for this room characterization was obtained by recording the sound of a balloon popping. Balloons were used because a balloon pop is a relatively omnidirectional sound source, and they were popped from the front-middle of the stage. The sound was recorded using an omnidirectional microphone at two receiver locations: the first in the front-middle of the audience and the second in the back-middle of the audience. The sound recordings were then exported via Audacity to MATLAB, where a series of calculations was completed to determine the reverberation time, clarity, and bass ratio of the space.
Sabine Equation RT60
The RT60 using the Sabine equation was calculated using the room's overall volume, the surface areas of each surface/material, and the absorption coefficients of each material. Manual measurements were taken in the space using a tape measure. An approximate re-creation of the room was then created in SolidWorks. The appropriate surface areas and volumes were then found using SolidWorks' measuring tool. Next, absorption coefficients were found for each kind of material in the room. Some materials' absorption coefficients were approximated based on their similarity to other materials. The reverberation time for each octave band was then calculated using the Sabine equation and tabulated.
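The calculation described above can be sketched as follows. In metric units the Sabine equation is RT60 = 0.161 V / A, where V is the room volume and A is the sum of each surface area multiplied by its absorption coefficient in the band of interest. The volume, surface areas, and coefficients below are placeholder values chosen only for illustration; they are not the measurements or coefficients used for Barnum 008.

```python
# Sabine reverberation time per octave band (metric units).
# All numeric inputs are illustrative placeholders, NOT Barnum 008 data.

SABINE_CONSTANT = 0.161  # s/m, for volume in m^3 and areas in m^2


def sabine_rt60(volume_m3, surfaces, band):
    """Return RT60 in seconds for one octave band.

    surfaces: list of (area_m2, {band_hz: absorption_coefficient}) pairs.
    """
    total_absorption = sum(area * coeffs[band] for area, coeffs in surfaces)
    return SABINE_CONSTANT * volume_m3 / total_absorption


# Hypothetical room with three surface groups and made-up coefficients.
surfaces = [
    (200.0, {125: 0.10, 500: 0.30, 2000: 0.50}),  # e.g. carpeted floor
    (200.0, {125: 0.50, 500: 0.70, 2000: 0.70}),  # e.g. absorptive ceiling
    (150.0, {125: 0.20, 500: 0.10, 2000: 0.05}),  # e.g. side walls
]
volume = 1000.0  # m^3

rt60_by_band = {
    band: sabine_rt60(volume, surfaces, band) for band in (125, 500, 2000)
}
for band, rt60 in rt60_by_band.items():
    print(f"{band} Hz: {rt60:.2f} s")
```

If the room is measured in feet, as the tape-measure dimensions in this report are, the same function applies with the constant changed to 0.049 and the volume and areas given in cubic and square feet.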
Critical Listening
To qualitatively characterize the space, a critical listening exercise was completed to compare the listening experience from different locations in the audience and from a speaker facing different directions on the stage. The source and receivers were located in the same positions as for the impulse response measurements. This exercise best represented the use of the room, as most performances are of spoken word, such as academic lectures and comedy sketches. During such performances, the speaker(s) may not always speak directly to the audience, and may instead face the back or side of the stage, which would create a different listening experience. Additionally, it was hypothesized that a receiver sitting closer to the reflective surfaces surrounding the stage would be able to hear the source “better” than a receiver sitting in the back of the auditorium, where there are many more absorptive surfaces.
Auralization with CATT & TUCT
To create auralizations of this space, 3D models of the room and its interior needed to be constructed. A basic model of Barnum 008 was built in SketchUp, a 3D drawing program. With the SketchUp to CATT-Acoustic converter, detailed geometries of the room can be simplified by assigning absorption and scattering coefficients instead of modeling complex shapes. Each surface was labeled with its corresponding material and the model was transferred into CATT-Acoustic. Once in CATT-Acoustic, the source and receiver type, location, and directivity were defined, as well as the material properties.
A Sabine RT60 was then calculated by CATT-Acoustic. From there, the model was brought into TUCT to run several simulations and develop auralization renderings of the different variations of the room. These auralizations can be created using different algorithms for handling reflections, all of which use ray tracing to simulate how sound behaves in the room. Algorithm 1 involves a short calculation and provides a very basic auralization. Algorithm 2 is more appropriate for auralizations, as it is a more in-depth calculation in which rays are split up for all reflection orders. Because it tends to provide more reliable data, Algorithm 2 was selected, with 100,000 rays/cones and a 2,500 millisecond impulse response length. After processing, 6 auralizations were created, as well as a data table giving every measurable quantity of sound for each source and receiver pair.
The sound sources (balloon popping, voice), represented by 'S', were located in the middle of the stage. The two receivers (omnidirectional microphone, humans), represented by 'R1' and 'R2', were located in the front-middle and back-middle of the audience, respectively.
Receiver 1 (front-middle of audience):
Receiver 2 (back-middle of audience):
The results of the critical listening exercise support the initial hypothesis that the listening experience would vary greatly depending on where the receiver was located, as well as which direction the source was facing. Music was played from a speaker located at the same source location as the oral test.
Receiver 1:
Source facing the audience:
Sound came across clear and crisp. Once a sentence had been spoken and completed, the next came across equally as clear and intelligible. There was no perceived echo. Music was clear and intelligible. Different instruments could be isolated from one another.
Source facing the side of the stage:
The sound was mostly clear but slightly less intelligible. The curtains on the sides of the stage likely made the spoken words harder to distinguish, but they still could be heard clearly. Music was clear and intelligible. Different instruments could be isolated from one another.
Receiver 2:
Source facing the audience:
The sound came across faint but intelligible. The overall volume of the sound was much lower, but the words being spoken could still be made out. Music was clear and intelligible. However, the sound from each instrument blended together, making it impossible to tell one instrument from another.
Source facing the side of the stage:
The sound was faint and unintelligible. Not only was the sound barely audible, but it was nearly impossible to distinguish individual words without further listening effort. Additionally, it became harder to pinpoint the exact location of the source when the receiver looked away from the stage. Music was slightly less clear but still intelligible. The sound from each instrument blended together, making it impossible to tell one instrument from another.
Measured vs Estimated Reverberation Times
The reverberation times measured from the impulse response and those estimated using the Sabine equation compare relatively well. For mid and high frequencies, the two sets of reverberation times are within ~0.05 - 0.1 seconds of each other. In the lower frequency bands, the Sabine equation gave much longer times than the impulse response measurement. This may suggest that the materials in the room are more absorptive at lower frequencies than expected. To better align these reverberation times, the absorption coefficients for lower frequencies have been modified and are shown below.
Quantitative Room Acoustics Parameters vs Qualitative Listening Experience
The calculated room parameters align very well with the qualitative listening experience in the room. From the very first time a speaker spoke, it was obvious the room had small reverberation times. Any sound came across rather deadened by the presence of all the absorptive material. From an initial perception, Barnum 008 sounded very similar to a space like Nelson Auditorium (where class is located). This experience correlates well with the found reverberation times.
All of the C50 and C80 clarity calculations showed positive levels. This suggests there is more early sound energy than late sound energy in the space. This result correlates well with the qualitative listening experience in the room, as speech always came across clearly and coherently. This does not bode well for music, however, as having little to no late reflections takes away from a musical piece. This lines up rather well with the qualitative listening experience: from the second receiver location, the music presented as one mixture of sound rather than as individual instruments playing in unison.
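For reference, clarity is the ratio, in decibels, of the energy arriving before a time limit (50 ms for C50, 80 ms for C80) to the energy arriving after it. The sketch below assumes the impulse response has already been trimmed so that sample 0 is the arrival of the direct sound. It uses a synthetic exponentially decaying response with an assumed 0.5 s reverberation time rather than the balloon-pop recordings, and it is not the MATLAB code used for this project.

```python
import math


def clarity_db(ir, fs, early_ms):
    """Early-to-late energy ratio in dB (C50 for early_ms=50, C80 for 80).

    ir: impulse response samples, starting at the direct sound arrival.
    fs: sample rate in Hz.
    """
    split = int(round(fs * early_ms / 1000.0))
    early = sum(s * s for s in ir[:split])
    late = sum(s * s for s in ir[split:])
    return 10.0 * math.log10(early / late)


# Synthetic impulse response: amplitude falls 60 dB over rt60 seconds.
fs = 8000    # Hz, assumed sample rate
rt60 = 0.5   # s, assumed value for illustration only
ir = [10 ** (-3.0 * n / (fs * rt60)) for n in range(fs)]

c50 = clarity_db(ir, fs, 50)
c80 = clarity_db(ir, fs, 80)
print(f"C50 = {c50:.1f} dB, C80 = {c80:.1f} dB")
```

A positive result means the early energy exceeds the late energy, which is the situation reported for every receiver position in Barnum 008.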
The bass ratios calculated from the impulse response measurement correlate slightly less well with the perceived listening experience. The bass ratio was calculated to be >1, which suggests that lower frequency sounds persist longer than higher frequency sounds. As the room mostly hosts spoken word rather than musical performances, this is an acceptable feature of the space. The lower frequency sounds are likely of lower volume and are therefore less perceptible to listeners. As a result, the room did not have a very noticeable difference in the persistence of different frequency ranges, and instead sounded rather even across the full range.
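The report does not state the formula used, so the sketch below assumes the common definition of bass ratio: the sum of the reverberation times in the 125 Hz and 250 Hz octave bands divided by the sum in the 500 Hz and 1000 Hz bands. The reverberation times are placeholder values, not the measured Barnum 008 results.

```python
def bass_ratio(rt_by_band):
    """Bass ratio from octave-band reverberation times in seconds.

    Assumed definition: (RT125 + RT250) / (RT500 + RT1000).
    """
    low = rt_by_band[125] + rt_by_band[250]
    mid = rt_by_band[500] + rt_by_band[1000]
    return low / mid


# Placeholder reverberation times, for illustration only.
rt_by_band = {125: 0.6, 250: 0.5, 500: 0.5, 1000: 0.5}

br = bass_ratio(rt_by_band)
print(f"Bass ratio = {br:.2f}")
```

A value above 1 indicates that the low bands decay more slowly than the mid bands, which matches the interpretation given above.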
Additional Quantitative Parameter to Better Correlate Perceptual Listening Experience
One aspect of the listening experience that was not well described by the calculated parameters is listener envelopment. Due to the shape of the space and the materials surrounding the stage and the audience (i.e. a fan-shaped room with lots of absorptive materials), there was a noticeable difference in listening experience depending on where the receiver was located as well as which direction the source was facing. Late lateral sound level should be calculated to better correlate with listener envelopment. This metric is the logarithmic ratio between the late lateral energy (arriving after 80 milliseconds) and a reference total energy (in ISO 3382-1, that of the same source measured at 10 m in a free field). The over-absorption of the room effectively diminishes any late reflections. The lack of late lateral sound greatly reduces overall listener envelopment and thus the experience of listening to a performance in this space.
As previously discussed, the most noticeable shortcoming of the space is that it is sometimes difficult to hear a speaker on the stage from the back of the room. This is likely caused by the large amount of absorptive material in the room and, more specifically, the absorptive ceiling above the stage. The previously proposed change was to make the ceiling above the stage reflective rather than absorptive. This would help a speaker project by reflecting more sound out into the audience instead of letting it be absorbed into the ceiling. To represent the effectiveness of this change, the Sabine RT60 has been recalculated with the surface area of the ceiling above the stage changed to a reflective material, such as drywall, shown in the table below.
Using this previously proposed change as a starting point, new changes are proposed for Barnum 008 to improve the listening experience for both speech and music. The listening experience would be considered improved if sound was perceived as louder & fuller and if the room felt brighter & less dry. To achieve this goal, the room's reverberation times should be increased and clarity may be reduced (as it is currently measured to be rather high for the existing space). There are a couple different ways of accomplishing this, some of which include increasing the room's overall volume, adding or changing the geometry to intentionally direct/reflect sound, and replacing absorptive material. The potential changes will be further discussed and analyzed in the following section, 'Acoustical Simulation.'
Three models were created to test how well these proposed changes would work for improving listening experience for both speech and musical performances. Version 1 of the model replicated the original space and is used as a baseline. Version 2 of the model includes the same materials as Version 1, but the ceiling has been vaulted. Version 3 is the same size and shape as Version 1, but absorptive material has been removed above the stage and in the back of the room. The 3D renderings built in SketchUp and imported into CATT-Acoustic are shown in the images below.
Version 1: Original Space
Version 2: Vaulted Ceiling
Version 3: Removed Absorptive Material
The original space has been found to be extremely dry due to the amount of absorptive material in the room. The primary absorptive materials are the ceiling, colored in yellow, and the foam panels, colored in red.
In Version 2, the geometry of the ceiling has been changed. This has increased the overall surface area of the ceiling, but also increased the volume of the room. In Version 3, the back walls have been stripped of their foam panels and the ceiling above the stage has been changed to drywall (colored blue). This significantly reduces the amount of absorptive material in the room and should promote more sound reflection around the room.
The intended result of these changes is to increase reverberation times and create a louder and fuller listening experience in the whole room, but specifically improve the listening experience for those seated in the back of the room.
Each version of the model will be tested by simulating the acoustical response of two different sources at two different receivers. This will create twelve auralizations (three versions, two sources, two receivers) that can be used to determine the success of the design changes.
As Barnum 008 is most commonly used as a lecture hall, Source 1 (A1) is simulated as a single male speaker speaking directly towards the audience from the middle of the stage at about head height. While the room is not frequently used as a music performance space, the proposed design changes aim to improve musical listening experiences. As a result, Source 2 (A2) is simulated as a solo jazz pianist. Both sources are positioned in the same location for each model.
Receiver 1 (01) is positioned in the middle-front of the audience facing the stage. Receiver 2 (02) is positioned in the middle-back of the audience and is also facing the stage. Each receiver is positioned in the same location for each version of the model. The following images show the locations of the sources and each receiver.
Top View of Version 1 (locations apply to all three versions)
Side Views of Version 1 & 3 (top) and Version 2 (bottom)
The following graphs show selected parameters calculated from the original measured room impulse response and those found from the simulated impulse responses for each model. The impulse responses for the models were simulated using an omnidirectional source in the same location as Source 1 and Source 2. The results are analyzed together with the qualitative results of the auralizations that follow to determine how effective Versions 2 & 3 of the model are at achieving the desired room response and listening experience.
EDT was calculated because it represents the perceived reverberation of the room. T20 was calculated as a reference for comparison with other spaces with known T20s. These two decay times are helpful for analyzing the quantitative success of the proposed changes. Longer reverberation times suggest a brighter and potentially louder room. C50 and C80 indicate the perceived clarity of speech and music, respectively. These parameters are important for determining the qualitative success of the proposed changes, as higher positive clarity levels suggest clearer speech and music.
EDT shows perceived reverberation times with early reflections considered.
T20 calculates the reverberation times in each octave band.
C50 indicates perceived clarity for speech.
C80 indicates perceived clarity for music.
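For readers unfamiliar with how these decay times are obtained from an impulse response, the following is a minimal sketch of the usual approach: Schroeder backward integration gives an energy decay curve, a straight line is fitted to part of that curve, and the slope is extrapolated to a 60 dB decay. EDT is fitted from 0 to -10 dB and T20 from -5 to -25 dB. The impulse response here is synthetic, with an assumed 0.5 s decay, and the code is an illustration rather than the TUCT or MATLAB processing used for the graphs above.

```python
import math


def schroeder_db(ir):
    """Energy decay curve in dB via backward integration of squared samples."""
    running = 0.0
    edc = [0.0] * len(ir)
    for i in range(len(ir) - 1, -1, -1):
        running += ir[i] * ir[i]
        edc[i] = running
    # Zero-energy samples can only sit at the tail, so indices stay aligned.
    return [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]


def decay_time(edc_db, fs, start_db, end_db):
    """Least-squares slope between two levels, extrapolated to 60 dB."""
    pts = [(i / fs, lvl) for i, lvl in enumerate(edc_db)
           if end_db <= lvl <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(lvl for _, lvl in pts) / len(pts)
    slope = (sum((t - mean_t) * (lvl - mean_l) for t, lvl in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second
    return -60.0 / slope


# Synthetic impulse response: amplitude falls 60 dB over rt60 seconds.
fs = 8000    # Hz, assumed sample rate
rt60 = 0.5   # s, assumed value for illustration only
ir = [10 ** (-3.0 * n / (fs * rt60)) for n in range(fs)]

edc_db = schroeder_db(ir)
edt = decay_time(edc_db, fs, 0.0, -10.0)
t20 = decay_time(edc_db, fs, -5.0, -25.0)
print(f"EDT = {edt:.2f} s, T20 = {t20:.2f} s")
```

For a purely exponential decay such as this one, EDT and T20 agree. In a real room they differ when the early part of the decay has a different slope from the later part, which is why both are reported per octave band.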
Auralization is the process of rendering audible, by physical or mathematical modeling, the sound field of a source in a space, in such a way as to simulate the binaural listening experience at a given position in the modeled space. Its three key components are an impulse response (measured or simulated), convolution of the impulse response with anechoic audio, and spatial rendering of the convolved signal. These auralizations were created by convolving, in MATLAB, the impulse responses generated by TUCT with anechoic recordings of male speech and solo jazz piano. Below are the auralizations of the models. Each version of the model was rendered with each source and receiver combination.
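The convolution step can be illustrated with a short sketch. The project used MATLAB; the Python below is only a stand-in showing the operation, with tiny made-up sample lists in place of the anechoic recordings and the TUCT impulse responses. The peak normalization helper is an added assumption showing one way to bring different renderings to a comparable level.

```python
def convolve(dry, ir):
    """Direct-form convolution of a dry signal with an impulse response."""
    wet = [0.0] * (len(dry) + len(ir) - 1)
    for i, sample in enumerate(dry):
        if sample == 0.0:
            continue  # silent samples contribute nothing
        for j, tap in enumerate(ir):
            wet[i + j] += sample * tap
    return wet


def normalize_peak(samples, peak=0.9):
    """Scale samples so the largest magnitude equals `peak`."""
    largest = max(abs(s) for s in samples)
    if largest == 0.0:
        return list(samples)
    return [s * peak / largest for s in samples]


# Made-up stand-ins for an anechoic recording and a room impulse response.
dry = [1.0, 0.0, 0.5, 0.0]
ir = [1.0, 0.0, 0.25]  # direct sound plus one weaker reflection

wet = convolve(dry, ir)
wet_normalized = normalize_peak(wet)
print(wet)
```

For a binaural auralization the same operation is applied twice, once with the left-ear impulse response and once with the right-ear impulse response. Recordings of realistic length would normally be convolved with an FFT-based routine instead of this direct loop, which is written for clarity rather than speed.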
Version 1
Version 2
Version 3
Version 1
Version 2
Version 3
Version 1
Version 2
Version 3
Version 1
Version 2
Version 3
Impact of Design Changes
The calculated quantitative parameters suggest the design changes have the desired impact. The reverberation times (both EDT and T20) have increased from the existing space and original version of the model (Version 1). Additionally, C50 and C80 have decreased from the existing space and original version of the model.
As expected, the auralizations from the back of the room feel farther away than the ones at the front. Distance aside, however, the design changes moved Barnum 008 closer to the desired goal. Comparing V1 to V2 and V3, the new versions sound more full-bodied. V3 has a closer presence than V2, likely because the vaulted ceiling in V2 adds volume to the room. In this way, V3 maintains the loudness and perceived presence of V1 while adding a richer tonal and musical liveliness.
Degree of Success & Recommended Design
After analyzing the results of each of the models, the design changes in Version 3 created the highest degree of success, and as such this is the recommended design. This version of the model most closely aligns with the goals set at the beginning of the project. Version 2 of the model improved upon Version 1, but from listening to the auralizations, it is clear that it did not achieve the same level of success as Version 3.
Feasibility of Recommended Design Changes
There is a stark difference in feasibility between the two proposed solutions. Since Barnum 008 is in the basement of an academic building, it is virtually impossible to open the ceiling and create the vaulted design from this project. The removal of foam absorbers along the back walls, however, is completely reasonable. The foam absorbers are mounted along the back wall and can be removed rather easily without creating any structural damage and at little to no cost. The absorptive material, in theory, could then be attached to an easier-to-remove mounting system such that for classroom lectures the absorbers can be attached, but for oral performances -- or if the room use expands to musical performance -- the material can be removed to allow the audience a better listening experience.
Successes & Limitations
One of the main successes was the level of detail that was achieved in the simulations. Because of the scale of the run and a successful SketchUp model, the simulations could be run under the second algorithm in CATT with no errors and valid results. This means the results produced are nearly as "accurate" as they could have been. That being said, the models are limited by the accuracy of the absorption & scattering coefficients, head directions, source directivity, ray/cone numbers, etc. While these parameters were defined with as much precision as possible, a certain amount of inherent error was still included in the models as a result of inaccurate parameter definitions. An additional limitation comes from the auralization. The anechoic recordings were not normalized against each other, so there is a difference in relative volume for each source. Normalizing the anechoic recordings would improve the auralization and provide further insight into the success of the models.
Suggested Improvements in Future Iterations
For future work, improvements may be made to the original model, Version 1, so that subsequent models are more accurate. While there was an attempt to align the Version 1 model with the measured impulse response from the actual space, there was too much human error at play for Version 1 to be truly representative of the space. This can be improved by finding more accurate methods of measuring the room's geometry, creating a more thorough model with better absorption & scattering coefficients, and defining the material parameters more carefully and accurately.
For the measured impulse response in the actual room, the experiment can be improved in both its efficacy and its procedure. Keeping the same procedure, the efficacy and accuracy can be improved by mounting the microphone instead of holding it in hand. While being held in hand, the microphone would occasionally come into contact with the back of the seat in front or the screen of the laptop. Such contact can block a channel, and the collision itself produces sound, both of which introduce error into the experiment. Better results could also be obtained by using higher quality equipment and recording software. Higher quality equipment would offer noise reduction that could suppress sound emitted by the operation of electronic devices or by footsteps outside the classroom.
In terms of the procedure, better results might be obtained by using a directional sound source instead of an omnidirectional one. The primary use of the space does not involve an omnidirectional source, as a person speaking radiates sound mainly in one direction. Using a source aimed in a specific direction, including toward another object on the stage, would better reflect how the room is used the majority of the time. The data would then provide a better quantitative description of what an audience member actually experiences.