Interfaces for rhythm creation and transformation.

Proposed by Daniel Gomez

14:00-15:30, Monday 13th October, 2014

14 people in attendance.

Description from the wiki:

General practice of rhythm manipulation with computers, whether live or in composition scenarios, has been standardized as the atomic control of onset positions, durations and dynamics. In "real time" scenarios, rhythm control is mostly done either by playing an instrument and recording audio (or MIDI), playing a MIDI keyboard (or pad), or tapping the mouse. "Off line" approaches, on the other hand, are mostly symbolic, based on writing scores or editing time grids in which onsets, durations and dynamics are written.

Given the current knowledge of rhythm cognition and perception, how can new methodologies to create and transform rhythm emerge? Specifically, strategies different from note-by-note playing or writing, using novel tangible interfaces that go beyond screens.

Notes of discussion (taken by Matthew Davies with some minor additions by Daniel Gómez):

DG (Daniel Gomez): Alternatives to marking onsets beat by beat.

BS (Bill Sethares): Much software already exists, e.g. Liquid Audio, Band in a Box - choose a pattern and it generates drum and bass accompaniment.

DG: What are mechanisms behind these systems?

JO (Jaime Oliver): Issues with latency and resolution when performing "all the notes".

Percussionists can do 2 ms - very hard for computational systems.

30-40 ms is the top latency that is permissible.

Intermodally, 2-10 ms is the limit in audio.

Need to program anticipation strategies.

GL (George Lewis): Keil's idea of participatory discrepancy. Jitter is an attempt to "model" this inexactness. There is probably an upper bound to the precision of even highly trained musicians like Akshay. Does the lack of very tight precision affect the ability to predict?

JO: Sampling with a sensor. Maybe digital instruments will never reach the resolution of analog percussion because of the jitter. If you are always latent by the same amount, you never lose the sense of control. The actual frame rate of optical sensors is 80-120 frames per second, so they can miss events. Video is the worst resolution for something like this. What kinds of resolutions are available with other sensing strategies? If latency is constant, then it can be accounted for. It is difficult if it is variable.

GL: Marimba Lumina. Piano Bar optical sensors. Sensing begins before the key actually strikes. Approach the vanishing point.

JO: The Marimba Lumina was originally analogue.

GL: Flat roll-out keyboard, four mallets, optical sensors.

Wright and Wessel

JO: Latency of the Mano Controller - he tried to measure it. 20-30 ms audio buffer in Pd, plus a camera at 100 frames per second, i.e. ±5 ms of jitter depending on whether or not the event falls within a frame.

In total around 40-50 ms.
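
A rough back-of-envelope sum of the figures quoted above (the breakdown is an editorial illustration, not a measurement of the Mano Controller itself):

```python
# Worst-case latency budget for a camera-based controller, using the figures
# quoted in the discussion (illustrative assumption, not a measurement).
contributions_ms = {
    "Pd audio output buffer": 30,          # quoted as 20-30 ms
    "camera frame period (100 fps)": 10,   # an event may arrive just after a frame
    "frame jitter": 5,                     # +/- 5 ms depending on frame alignment
}
total = sum(contributions_ms.values())
print(f"worst-case latency ~{total} ms")   # ~45 ms, consistent with the 40-50 ms above
```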

BS: Is this why you are not using percussive sounds? A strategy for "covering" the latency?

JO: There are some mappings where the perception of latency varies. JO has perhaps adapted to the latency. This is why it is problematic to perform rhythms.

GA (Gerard Assayag): The worst case is being off by 10 ms plus the audio latency.

JO: There is also the propagation delay of the sound from the speaker to the ear.

Consider the time from a key press on a piano to the sound being perceived.

GA: Elaine Chew has an experiment on adaptation to network latency. If the latency is steady, musicians can develop strategies to play together. As soon as the jitter is too big, it becomes very difficult.

JM (João Menezes): Gesture recognition - Leap Motion, 200 frames per second. The latency is unknown even by the developers, but estimated to be lower than any video-based recognition. Gesture instead of video.

JO: There is always audio latency, so he avoids having sounds triggered directly by onsets.

GL: Matt Wright - trading latency for jitter.

Two methods for removing jitter by adding latency (see the sketch after this exchange).

SB (Sebastian Bock): people prefer latency to jitter. We can cope with that better.
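
As a minimal sketch of the "trade latency for jitter" idea: stamp each event with its capture time at the sensor and schedule its output a fixed delay later, so variable transport and processing time is absorbed as long as it stays below that delay. The class name and the delay value are illustrative assumptions, not Wright's implementation.

```python
import heapq, itertools

FIXED_DELAY = 0.015  # seconds of added latency used to absorb jitter (assumed value)

class DejitterScheduler:
    """Trade latency for jitter: play each event at capture_time + FIXED_DELAY."""
    def __init__(self):
        self._queue = []                 # min-heap of (play_time, tie, event)
        self._tie = itertools.count()    # tie-breaker so events never get compared

    def on_event(self, event, capture_time):
        # capture_time must be stamped at the sensor, not on arrival, so that
        # variable transport/processing time is hidden inside the fixed delay.
        heapq.heappush(self._queue, (capture_time + FIXED_DELAY, next(self._tie), event))

    def due_events(self, now):
        """Return every event whose scheduled play time has been reached."""
        out = []
        while self._queue and self._queue[0][0] <= now:
            out.append(heapq.heappop(self._queue)[2])
        return out
```

Events that arrive later than the fixed delay still come out, only late; choosing the size of the delay is exactly the latency-versus-jitter trade-off discussed above.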

GL: Entrainment has become a holy grail. All the beat databases reported in the grantee talks seemed the same to him. There is a class of music which aims to prevent entrainment. What do beat trackers report for this kind of music? There is a lot of un-beat music out there.

“Non-entrainable rhythms”

AL (Andy Lambert): The GFNN (Gradient Frequency Neural Network) model. Andy is interested in how generative music can work with the GFNN model - to be more expressive and also to explore more rhythmicities, including arrhythmic material.

Curtis Roads' keynote at ICMC/SMC: rhythm is about generating time. One can react against or with a beat. Feedback loop - feeding the output back into the GFNN will change the resonant frequencies. It can become a chaotic system that still has rules.
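
A minimal sketch of the idea, assuming a bank of driven Hopf-style oscillators spaced over a gradient of frequencies (this is not Large's canonical GFNN formulation, and all parameter values here are placeholders):

```python
import numpy as np

def oscillator_bank(x, fs, freqs, alpha=-0.1, beta=-1.0, feedback=0.0):
    """Drive a bank of Hopf-style oscillators with signal x (e.g. an onset train).
    Oscillators whose natural frequency matches a periodicity in x resonate most;
    feedback > 0 feeds the summed output back into the drive, reshaping the
    resonances (the feedback loop described above).  Returns |z| per oscillator."""
    dt = 1.0 / fs
    omega = 2 * np.pi * np.asarray(freqs)
    z = np.full(len(freqs), 0.01 + 0.0j)          # small nonzero initial state
    amps = np.zeros((len(freqs), len(x)))
    for n, xn in enumerate(x):
        drive = xn + feedback * z.sum().real      # external input plus feedback
        dz = z * (alpha + 1j * omega + beta * np.abs(z) ** 2) + drive
        z = z + dt * dz                           # simple Euler step
        amps[:, n] = np.abs(z)
    return amps

# Example: a 2 Hz click train; oscillators near 2 Hz should grow the largest amplitude.
fs = 100
clicks = np.zeros(fs * 10); clicks[::fs // 2] = 1.0
amps = oscillator_bank(clicks, fs, freqs=np.linspace(0.5, 4.0, 30), feedback=0.1)
```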

GL: Floating section (the alap in Indian Music)

BS: 80% beat tracking accuracy means 20% of the time it’s failing.

SB: His system doesn't have to report a beat if the observed probability of beats is very low.

GL: Do beat trackers fail like people do?

Convenient fiction to divide music into music with or without beat.

Interested in technologies that can understand

RR: In order to play the alap, performers understand very well what it is.

GL: How to make a computational model of the temporal feeling underlying the alap? Major feature of Indian music. A lot of music where tempo is not strict. Yes, people can still sense something that’s important.

GA: Shazam only recognises the exact piece.

The beat is kind of a paradox: it is supposed to be regular, but nothing is really regular.

Lots of music composed not to have a strong beat structure, yet a feeling of the beat can still emerge at times.

There is a difference between an MIR system looking for accuracy against ground truth and creative applications, where the aim is for the beat to emerge in the music.

Even a weak recognition (of the beat) might be sufficient to allow co-adaptation in listeners, so accuracy might not be critical. The real problem is on the generative side.

BS: Asks GA to describe how rhythm was handled in OMAX at the concert.

GA: There is no beat recognition at all. It builds a sequential model based on pitch or spectral regularities, trying to segment into consistent spectral or timbral units, and tries to recreate new sequences respecting the statistics of the performance. It is a linear model that connects patterns; rhythm is not modelled explicitly - it tends to emerge from the rhythm of the performer.

They are now working on explicit models for beat extraction in new versions, using the work of Ed Large, to have a flexible / fuzzy beat recognition. The aim is to capture the pulse feeling even if it is not well established - not trying to measure the beat, but to have something like a beat which emerges, and to use this for matching. It is more a notion of speed and density than of beat. If done properly, the audience can still believe the beat is being followed.
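
OMAX navigates a factor oracle built over the segmented units; as a much simpler illustration of "recreate new sequences respecting the statistics of the performance", here is a first-order Markov recombination over symbolic units (an editorial sketch, not the OMAX algorithm itself):

```python
import random
from collections import defaultdict

def recombine(units, length, seed=None):
    """Generate a new sequence that follows the transition statistics of the
    input sequence of segmented units (first-order Markov chain).  A stand-in
    for factor-oracle navigation, for illustration only."""
    rng = random.Random(seed)
    transitions = defaultdict(list)
    for a, b in zip(units, units[1:]):
        transitions[a].append(b)
    current = rng.choice(units)
    out = [current]
    for _ in range(length - 1):
        choices = transitions.get(current)
        current = rng.choice(choices) if choices else rng.choice(units)
        out.append(current)
    return out

# Units might be timbral/spectral cluster labels (with durations attached in practice).
print(recombine(["A", "B", "A", "C", "A", "B", "C"], 12, seed=0))
```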

JO: OMAX is segmenting the input. The first elements from the concert had clear durations, which brought in elements not in the original. It can generate new rhythmic elements.

GA: If the performance is in a very precise beat, then this recombination (as used in the concert) could break down. For the new system, the aim is for the output to be correct "phase-wise": choose jumps which maximize phase correctness. It still needs some micro-adjustment.

GL: Could you use the system against the grain? It will still detect something. Can you still use this?

GA: Yes, this can be fine - dependent on application / performance.

SB: Native Instruments needs very precise tempo for DJ applications. "Inaccurate" systems can still be used for other purposes.

GA: Interaction with audience/listeners is what’s important.

GL: Of course we want the beat tracker to be as accurate as possible.

RR: We need real-time, causal beat trackers for creative applications. They should be able to react when the metre comes in and out.

GA: Don’t always need hard real-time.

Using Yin for pitch tracking in OMAX, a lot of statistical post-processing of pitches (up to 50ms delay). Playing with memory so don’t need to be accurate to the ms as it happens.

GL: Input not required for Voyager. It can start and will play music. If there is an input, then it will start to incorporate the input. He’d like to have an accurate online beat tracker. He’d want it to respond in some useful way when the beat disappears.

JO: What about picking an alternative metre, e.g. not the first choice?

GA: A beat can emerge from co-adaptation between human and machine. The beat could be perceived by the audience, but it is not inside the human or the machine - it emerges from the interaction of both. How to favour that kind of emergence?

BS: Rhythmicator - how does it work?

RR: It is strictly generative, with density and syncopation axes; the idea is to explore this space.

SB: Easy to have a smooth transition.

MD (Matthew Davies): The different drum sounds of the Rhythmicator are not connected; multiple Rhythmicators run in parallel.

GL: This is like Voyager - a bit like 64 Rhythmicators. But you could group them together.

It's a Max for Live device; there is also a Max patch version.

JM: Multiple instances could be combined in Max.

BS: What else could it work on besides density and syncopation?

MD: Rhythmicator models syncopation by anticipation, not by accenting, so this could be explored.

JN (Jerome Nika): What is a good system to navigate through the memory? For new OMAX?

GL: Free-bop. Rhythmicator does stochastic generation.

JN: Want a temporal specification for generating rhythm.

GA: More rhythm than pulse. Pulse can be evoked by metre.

RR: How to represent metre to be manipulated?

GA: The OpenMusic representation of rhythm: a tree structure for rhythm in LISP. Self-referential - there is no difference between the metre level and the leaves of the tree (the rhythm); it is a recursive structure. It is easy to manipulate rhythms in this way, but it is not designed for interaction.
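
A rough illustration of the recursive idea in Python (not OpenMusic's LISP syntax or its exact rhythm-tree format): a node carries a relative duration and optional children that subdivide its span, so metre levels and leaf-level rhythms share one structure.

```python
from dataclasses import dataclass, field

@dataclass
class RhythmNode:
    # Relative duration among siblings; children subdivide this node's span.
    duration: float
    children: list = field(default_factory=list)

def flatten(node, start=0.0, span=1.0):
    """Turn a rhythm tree into a flat list of (onset_time, duration) pairs."""
    if not node.children:
        return [(start, span)]
    total = sum(c.duration for c in node.children)
    out, t = [], start
    for c in node.children:
        child_span = span * c.duration / total
        out.extend(flatten(c, t, child_span))
        t += child_span
    return out

# One 4/4 bar: beat 1, beat 2 split into two eighths, beat 3, beat 4 as a triplet.
bar = RhythmNode(1, [
    RhythmNode(1),
    RhythmNode(1, [RhythmNode(1), RhythmNode(1)]),
    RhythmNode(1),
    RhythmNode(1, [RhythmNode(1), RhythmNode(1), RhythmNode(1)]),
])
print(flatten(bar, start=0.0, span=4.0))  # times and durations in beats
```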

AL: Ed Large's model is a representation of metre.

RR: How do oscillators represent metre?

AL: They keep track of when they align and when they resonate together. Pick the peak signals.

RR: Good way to track the metre. But need a way to structure the future.

AL/RR: Using a recurrent neural network to represent the future.