Julia Huckaby, Seth Kelley, Jeremy Newman
DESIGN SUMMARY
ROOM OVERVIEW
Goddard Chapel is a landmark space on the Tufts University campus. The space has a rectangular prism base with a semi-cylindrical ceiling. Key features of the room include the wooden panels and ceiling, elaborate stained-glass windows, carpeted stage, and pews, both upholstered and wooden. At the front of the room, there is an organ, a carpeted stage situated 1 ft higher than the rest of the floor, and a large stained-glass window. There are 30 total rows of upholstered pews, which span the majority of the length of the room. More stained-glass windows line the sides of the room. At the back of the room, there is a balcony, which contains an additional 4 pews of seating; however, the pews on the balcony are wooden and not upholstered. Another large stained-glass window at the back of the room mirrors the one at the front, with four smaller stained-glass windows below it.
The majority of the surfaces in the space are reflective. All of the wood surfaces (ceiling, front and back walls, balcony), brick (lower walls), and windows reflect sound. The only absorptive surfaces in the room are the carpet and the upholstered pews.
Goddard Chapel is a space used for religious and spiritual services as well as musical and choral performances. The chapel frequently hosts organ, a cappella, choral, and jazz performances. The seating capacity of the space is 300 people, and the stage comfortably fits up to 20 people. The Tufts Jazz Orchestra is one of the largest groups that performs in the space, consisting of 32 musicians. Because of the size of the group, the students are situated both on the stage and in the space between the stage and the first rows of pews. To accommodate sermons or other similar events, the chapel has a speaker system onstage to help project the speaker's voice. Most large instrumental performances do not amplify the instruments, but any speaking or singing is typically amplified.
To measure the impulse response of the room, two receiver locations were used, one placed in the front of the right seating section (R1) and the second placed in the back of the left seating section (R2). A balloon was popped at the source position (S1) to create a sound impulse that the omnidirectional microphone could then record. In addition to recording impulse response data, a few listening procedures were conducted. A violin was played on the stage while listeners sat at both receiver positions to take qualitative notes on the room. Additionally, to test the speaking performance of the room, one of the team members spoke first at a conversational level and then while projecting their voice, so that the listeners at the receiver positions could again take qualitative notes on the performance of the room.
Using data values from Architectural Acoustics Illustrated, by Ermann and Egan, we identified the absorption coefficients for the materials in the room. While we had to make some assumptions about the exact material types, we used our qualitative listening experiences to help guide the decisions. We ultimately decided to identify the wood panels as "wood, varnished over plywood," as the panels seemed very thin. For the stained glass, we chose absorption coefficients for "single pane, heavy and large" glass. The absorption coefficients for brick are fairly standardized, so there was little discussion about which coefficients to use. We treated the upholstered pews as a block of absorption, since the majority of each pew is lined with fabric. For the pews on the balcony, we used absorption coefficients for empty wooden pews. Finally, we identified the carpet as "1/4" thick, flush to floor." We went through several iterations of the carpet absorption, as our calculated RT values were significantly higher than what we expected based on our listening experiences in the room. Upon further research and guidance from instructors, we landed on this type of carpet, which seems to best support our predicted RT.
We measured the dimensions of the room using a tape measure. For the dimensions we could not reach, such as the height of the ceiling, we took a known dimension and estimated the unknown distance from it. We made some assumptions and simplifications to help with the modeling of the space. One assumption we made was that the juts above the windows have a negligible effect on the room, so we modeled the curved ceiling as if it continued down to the base of the windows. We also treated the area with the organ as a block of wood paneling up to the pipes, as the pipes are likely acoustically transparent.
Shown above are the T20 and EDT graphs across 6 different octave bands for both microphone locations. Additionally, the tables containing values for T20, EDT, C50, and C80 are shown across these same octave bands as well as the Bass Ratios.
*Note: T20 values for 125 Hz were actually calculated using T15 methods.
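The T20, T15, and Bass Ratio values reported above can be obtained from a measured impulse response via Schroeder backward integration followed by a line fit to the decay curve. The sketch below is a minimal illustration of that method, not our actual analysis script; `ir` is assumed to be a mono impulse response array sampled at `fs` Hz.

```python
import numpy as np

def schroeder_curve(ir):
    """Backward-integrate the squared impulse response (Schroeder
    integration) and return the decay curve in dB, normalized to
    0 dB at t = 0."""
    energy = np.cumsum((ir ** 2)[::-1])[::-1]
    energy = energy / energy[0]
    return 10.0 * np.log10(energy + 1e-30)

def reverberation_time(decay_db, fs, drop=20.0):
    """Fit a line to the -5 dB to -(5 + drop) dB portion of the
    Schroeder curve and extrapolate to a 60 dB decay.
    drop=20 gives T20; drop=15 gives the T15 fallback we used
    for the 125 Hz band."""
    t = np.arange(len(decay_db)) / fs
    mask = (decay_db <= -5.0) & (decay_db >= -(5.0 + drop))
    slope, _ = np.polyfit(t[mask], decay_db[mask], 1)  # dB per second
    return -60.0 / slope

def bass_ratio(rt):
    """Bass Ratio from a dict of reverberation times keyed by
    octave-band center frequency in Hz."""
    return (rt[125] + rt[250]) / (rt[500] + rt[1000])
```

In practice the impulse response is band-filtered into octave bands first, and each band's decay curve is fit separately to produce the per-band values in the tables.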
The listening experience in Goddard Chapel is much different than one may expect. Looking at the room, we expected for the space to be highly reverberant. The ratio of reflective material to absorptive material makes it seem as though the room would sound very lively and full of reverberance. However, we noticed a sort of "deadness" in the space both when speaking and when playing a musical instrument.
To conduct our investigation of the listening experience in the room, one teammate stood at the front of the room, in the S1 location, while another teammate sat in the R1 position and the final teammate sat in the R2 position. The teammate in the S1 position first spoke at a conversational level, then at a projected level. From the R1 position, both levels of speech were clearly audible and intelligible. At the R2 position, the conversational level of speech was difficult both to hear and to understand. The loudness of the speech seemingly decayed by a significant amount from the R1 position to the R2 position. The speaker at position S1 also noted that they felt they had to significantly raise their voice in order for the person at R2 to hear and understand them well. Overall, speech sounded very dry and quiet, and decayed quickly.
When playing a musical instrument, a similar phenomenon was experienced. The person sitting at position R1 had a much more positive experience listening to the violin than the person at position R2. The person at R1 noted a more direct, clearer sound. For both listeners, there was a noticeably longer decay time for the higher frequencies of the violin than for the lower frequencies, suggesting that more of the lower frequencies are being absorbed. The sound of the violin was described as "bright, cold, and dry" by one of the listeners, which also suggests the absence of some of the lower frequencies. Again, we expected the space to be very supportive of the violin; instead, the violinist had to work harder to achieve a clear, bright sound by playing louder and with more articulation. The space is often used for performances by bigger groups, which would likely work better in the space than unamplified voice or solo violin.
There were several key differences between the measurements collected and the in-person listening experience. While listening to a violin being played at the center of the stage, the room sounded bright when the listener was at position R1 near the front of the room. In the back of the room at position R2, however, the listener's experience shifted significantly as the sound became less direct and clear. At both listening locations, the listeners reported that the violin sounded bright, indicating that the room was more reverberant at higher frequencies. However, the data collected from both positions indicated that the room is actually more reverberant at the lower frequencies. This difference is likely due to the balloon pop failing to create a loud enough impulse; the limited level made it necessary to use a T15 value for the lower frequencies, which may have slightly skewed the results.
The lack of support provided by the room was also noticed when a speaker was talking at the front of the room. While their voice was clear and direct for a listener positioned at R1, it was difficult to hear the speaker at listening position R2 even when they projected their voice. This observation can be seen in the data, as the decay time for position R1 is longer than the decay time for position R2. This shows that a person in the back of the room will perceive a speaker at the front as much less supported than a listener near the front will.
Using the Sabine equation, we predicted a reverberation time based on the type and amount of material in the room. There are some differences between the RT that we found from the impulse response and the RT that the Sabine equation predicted. We attributed some of these differences to simplifications in geometry and material assumptions. In general, the Sabine equation predicted RT values about half a second larger than the impulse response values. This supports some of our earlier confusion about the difference between the look of the space and the sound of the space. We suspect that the wood materials are more absorptive than we thought, or perhaps the carpeted area is thicker than we predicted. Below is the table we used to compute the RT using the Sabine equation.
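As a sketch of the calculation itself: in imperial units, the Sabine estimate is T60 = 0.049 V / A, where V is the room volume in ft³ and A is the total absorption in sabins, summed over surface area times absorption coefficient. The areas, volume, and coefficients below are illustrative placeholders, not our actual material takeoff.

```python
def sabine_rt(volume_ft3, surfaces):
    """Sabine reverberation time in seconds (imperial units).
    surfaces: list of (area_ft2, absorption_coefficient) pairs."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)  # sabins
    return 0.049 * volume_ft3 / total_absorption

# Hypothetical 1000 Hz takeoff for a chapel-like room (illustrative only):
surfaces_1khz = [
    (4000.0, 0.09),  # varnished wood over plywood
    (1500.0, 0.02),  # heavy single-pane glass
    (2000.0, 0.04),  # brick
    (1200.0, 0.19),  # 1/4" carpet, flush to floor
    (1800.0, 0.60),  # upholstered pews, treated as a block of absorption
]
rt = sabine_rt(60000.0, surfaces_1khz)
```

Repeating this per octave band, with the per-band coefficients from the reference tables, reproduces the predicted RT column in our table.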
Because of the significant difference between the Sabine equation and the impulse response RT estimation, we think looking at the clarity values may be more useful. From the tables in the "quantitative results" section, we see that the values of clarity generally increase (both for C50 and C80) and reach a peak in the 2000Hz octave band. This is good, as human hearing is most sensitive in the 1000-4000Hz range. However, there is a big difference between the 2000Hz octave band and the lower octave bands. In order to make speech more intelligible, especially at the R2 position in the back of the room, we hope the changes we make in the next part of the project will increase all of the clarity values.
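Both clarity metrics come directly from the impulse response: they compare the energy arriving in an early window (50 ms for C50, aimed at speech; 80 ms for C80, aimed at music) against the energy arriving after it. A minimal sketch, assuming a mono impulse response `ir` at sample rate `fs`:

```python
import numpy as np

def clarity(ir, fs, early_ms=50.0):
    """Clarity index in dB: ratio of early to late energy in the
    impulse response. early_ms=50 gives C50; early_ms=80 gives C80."""
    split = int(round(fs * early_ms / 1000.0))
    early = np.sum(ir[:split] ** 2)
    late = np.sum(ir[split:] ** 2)
    return 10.0 * np.log10(early / late)
```

A value of 0 dB means the early and late energy are equal; positive values indicate more early (direct and early-reflected) energy, which is what we hope the orchestra shell will add at R2.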
As Goddard Chapel is used for both jazz orchestra and vocal ensemble performances, it is important to create a space that works well both for jazz performances, with their high volume, and for unamplified vocal performances, with their relatively low volume. Our goal in improving Goddard as a space for unamplified vocal performances, such as a choir or speeches, is to improve the support for voices in the space. The issues we identified with the current room setup are that it can be difficult to hear an unsupported voice, especially when sitting in pews farther from the stage, and that the speech clarity of the room is perceived to be low. To make Goddard Chapel a better space for unsupported vocal performances and speech, we propose that an orchestra shell be added to the ceiling of the chapel above the stage to provide better early reflections, improving the support for the speaker and improving speech clarity. If the addition of the orchestra shell is not enough to provide adequate support and clarity, we would recommend adding additional reverberation to the room by changing the wood used for the orchestra shell and ceiling to a more reverberant type. With one or both of these improvements, we believe it would be possible to provide better support for unsupported vocal performances in Goddard Chapel.
We hope to verify these changes with the Sabine equation. Currently, we have some discrepancies between the perceived sound of the room and the Sabine RT estimates, which seem too high, so we are going to experiment with different absorption values to try to minimize the difference between predicted and perceived values. As mentioned above, we think clarity might be a more useful parameter for us. In general, we hope to slightly raise the RT, as well as increase the clarity values across all of the octave bands. We think making these changes will positively affect the listening experience.
From our analysis, the quantitative results for reverberation definitely seemed lower than we expected, both from our qualitative results and from our absorption modeling of the room. Combined with the fact that we had to use a T15 for the 125 Hz band due to the limited signal level in the recordings, we have theorized that the microphone's input level was not high enough. Since the room was very quiet to begin with, the ambient sound level should have been low, and the noise floor should have been low compared to the balloon pop. If the microphone wasn't picking up as much of the sound, the measured reverberation time could have been shorter than in actuality because the decaying tail was not captured. In the future, we would like to retake sound recordings in the room to ensure those calculations are correct. When using the Sabine equation to calculate the reverberation time, we also had to make assumptions about many of the materials in the room. In the future, we would like to do further investigation to identify these materials more accurately.
Throughout this project, we gained experience in taking microphone recordings of balloon popping tests for reverberation time, measuring dimensions of a room, and estimating absorptive properties of materials in a room. We also practiced calculating quantitative results such as reverberation time, EDT, clarity, and bass ratio, as well as describing qualitative results from listening exercises.
Most of the limitations of this study come from assumptions that were made due to a lack of knowledge or simplification. This includes things such as the dimensions of the room, simplifying some geometries of the room, and removing some objects in the room from our analysis. This also includes making assumptions for material absorptive properties.
As previously discussed, we decided to alter the shape of the ceiling to increase the early reflections for a speaker or vocal performance on the stage. When physically in the space, we noticed a significant lack of support for an unamplified speaker. By adding a sloped ceiling (which mimics an orchestra shell), we hoped the clarity would increase at the receiver location in the back of the room. While probably unrealistic, we are assuming the sloped ceiling could be retracted, which would make the space as versatile as possible. Our first modification from the original model kept the same material on the ceiling but included the new geometry. Our second modification kept this new geometry but used a less absorptive wood material on the ceiling.
The CATT Acoustic Models did not change much between the original and modified models. The source and receiver locations stayed the same for all three simulations, as we wanted to keep the data as uniform as possible. We used the same models for both of the modified cases, just changing the absorption coefficients in the abs_defs geo file.
An auralization has been defined as the "process of rendering audible, by physical or mathematical modeling, the sound field of a source in a space, in such a way as to simulate the binaural listening experience at a given position in the modeled space" (Kleiner, 1993). There are three components to an auralization: the impulse response, the convolution of the IR with anechoic audio, and the spatial rendering of the result. An impulse response is a short burst of energy that spans a large range of frequencies within human hearing. The impulse response is convolved with anechoic audio, which simulates what that audio would sound like in the chosen room.
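The convolution step can be sketched in a few lines. This is a generic NumPy illustration of the principle, not the CATT pipeline, and the peak normalization is simply a precaution against clipping on playback:

```python
import numpy as np

def auralize(anechoic, ir):
    """Convolve dry (anechoic) audio with a room impulse response to
    simulate how that audio would sound in the room, then normalize
    the peak amplitude to 1.0 to avoid clipping on playback."""
    wet = np.convolve(anechoic, ir)
    return wet / np.max(np.abs(wet))
```

Convolving with a unit impulse returns the dry signal unchanged (up to normalization); a real room IR smears each sample of the dry signal across the room's reverberant tail, which is exactly the effect heard in the rendered audio files.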
We used CATT Acoustic for the acoustical simulation. While there are several methods to perform acoustical simulations, this study used TUCT, a geometrical method consisting of ray and cone tracing. For each of the following auralizations, TUCT algorithm 2 was used, which uses ray split-up for all reflection orders. We used 100,000 rays and a simulation length of 4000ms, which is approximately 3.5 times the longest reverberation time in the space. We felt these were appropriate choices to maximize simulation accuracy while also completing each simulation case in a timely manner.
Impact and Goals Achieved
Quantitatively, the C80 clarity in receiver position 1 dips down between the 250 and 5000Hz octave bands in both modification cases. This makes sense, because we are essentially making the space more reverberant. The receiver in position 1 gets much more direct sound than the receiver in position 2, so adding more reverberation might slightly muddle this clarity. While this is not the result we wanted for the male speech case, it is not difficult to distinguish between words when qualitatively listening to the audio. We think the dip in clarity is minor enough that it doesn't impact the overall speech intelligibility.
The clarity in position 2 stays fairly constant between the original and the two modifications. However, the clarity for both of the modifications is slightly increased in the 1000Hz range, which is the octave band we are most concerned about, since it most closely relates to human speech. While a small change, it is definitely more of what we were hoping to see.
Although clarity was the main parameter we focused on, we also looked at reverberation time parameters such as EDT, T15, and T20. Across the different room versions and receiver locations, the T15 and T20 times do not vary in any significant way. When looking at the EDT at receiver position 1, the 1000Hz band has noticeably higher EDTs for the modified rooms. This may contribute to the slight decrease in clarity at this receiver position. However, at position 2, the EDT is the same for all three room versions in the 1000Hz band. Without the EDT being increased, it makes sense for the clarity in this position to benefit.
Qualitatively, the male speech audio from the receiver 1 position has a slight but noticeable difference between the original model and the modifications. Modification 1 (sloped ceiling) sounds a bit more reverberant than the original, and modification 2 (sloped ceiling and less absorptive material) sounds slightly more reverberant than modification 1. The effect of the modifications is easier to hear from receiver position 1 than from receiver position 2. We still heard a slight difference between the original and the modifications at position 2, but it was less drastic than at position 1.
The choir from the receiver 2 position has a much more noticeable change with the modifications. The original model sounds like the choir is singing in a very small space. Their words end very quickly when they finish singing, i.e. it is not very reverberant. In the second modification (sloped ceiling and new material), it sounds like the choir is singing in a much larger, more ethereal place. The sounds linger for a longer amount of time after they stop singing. Modification 2 seems to make the space much more appropriate for this kind of performance. There is also not a significant difference qualitatively for the choir between receiver positions 1 and 2. The clarity seems to not be negatively impacted by this modification. The words are still easy to distinguish, they just sound a bit more cohesive and without a big separation between every word. In both cases (male speaker and choir), we prefer the sound of modification 2.
Feasibility of Recommendations
While one of our assumptions was that our modifications in room geometry could be retracted, realistically this is not a very feasible solution. In our SketchUp model, we completely altered the geometry of the ceiling, which would be a very costly change. It would also fundamentally alter the look of the room, which the architects and school administrators would likely not appreciate. If the sloped ceiling were somehow retractable, it could be a much more feasible solution; however, due to its size and slope, it would likely be very difficult to implement. Even less feasible is the second modification, in which both the geometry of the ceiling and all of the material on the ceiling are changed. This change would not be realistic in the slightest, especially given how little difference it made. However, both proposed solutions were interesting to study and would have positive, though small, effects on the space for unamplified vocal speaking and performances. A more feasible solution would be to use amplification, which is more than likely what performers and speakers use now.
Success and Limitations
An auralization is a great tool for acousticians to analyze a space. It allows them to hear what different sources sound like without having to physically be in the space. Listening to auralizations in a room calibrated for that purpose is a very immersive experience. However, it is difficult to arrange the loudspeakers in such rooms, and the playback room itself must be highly absorptive so that it does not impact the auralization. This setup is difficult to come by, and for the purposes of this class, we listened to our auralizations primarily over headphones. This is definitely a limitation, as there is a very real presence of in-head localization, a phenomenon where it sounds like the room is inside your head instead of you being in the room. Listening over loudspeakers instead seems like a good alternative, but that is also not ideal, as the room you are listening in impacts the sound.
Improvements
In future iterations of this project, we would be interested in further investigating why the room sounds so much less reverberant than it looks. This was a significant point of confusion throughout the project. While we currently believe some of the materials are more absorptive than they look, we are curious to see if this is true, or if there is some other factor at play. We think that our recommendations for improvement would be different if we figured out why the room was not very reverberant. Maybe these new recommendations would be more feasible to implement, which would be a plus.
An improvement in the experimental process would be to pop more balloons in the space, which would allow us to gather more impulse response measurements. When we originally popped the balloons, we noticed that the microphone on the podium was on, which picked up pieces of the balloon falling after it popped. While we think this was probably not too important, it would be interesting to take more recordings to confirm. We are also curious what the impulse response would sound like from the balcony. In most performances, the balcony is not open unless there is overflow, so most of the time, there is not an audience there. However, it is a much different receiver location than the two we investigated, which would hopefully be an interesting analysis.