NTNU Music and Audio Processing Lab - Results

Results

Because the lab has only just started, it has no research results to show yet. To illustrate the lab's research directions concretely, I am temporarily presenting two projects that I was involved in before joining NTNU. The first concerns motif discovery, that is, automatically identifying the motifs in a piece of music, which is a typical computational musicology task. The second concerns music style transfer, covering both instrument conversion and performing style conversion.

Improving Motif Discovery of Symbolic Polyphonic Music with Motif Note Identification

(by Jun-You Wang, Yu-Chia Kuo, and Li Su, accepted at Transactions of ISMIR, 2025)

This work proposes a framework for automatically discovering motifs in Western classical music. It first uses a data-driven motif note identification (MNID) model to identify the notes that belong to motifs and remove the unrelated ones, and then automatically discovers motifs from the remaining notes.
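The two-stage idea described above can be illustrated with a minimal Python sketch. This is not the method from the paper: `score_fn` is a hypothetical stand-in for a trained MNID model, and the second stage uses a simple translation-vector grouping as a placeholder discovery step. The `Note` type, the tick-based onsets, the threshold, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple, Tuple


class Note(NamedTuple):
    onset: int  # onset time in ticks (hypothetical unit)
    pitch: int  # MIDI pitch number


def filter_motif_notes(
    notes: List[Note],
    score_fn: Callable[[Note], float],
    threshold: float = 0.5,
) -> List[Note]:
    """Stage 1: keep only notes whose motif score reaches the threshold.

    score_fn is a placeholder for a learned motif note identification model.
    """
    return [n for n in notes if score_fn(n) >= threshold]


def find_repeated_patterns(
    notes: List[Note], min_size: int = 3
) -> Dict[Tuple[int, int], List[Note]]:
    """Stage 2 (placeholder): group notes by translation vector.

    For every (time shift, pitch shift) vector, collect the notes that have
    another note exactly that far away. A vector shared by at least
    min_size notes marks a pattern that repeats under that shift.
    """
    ordered = sorted(set(notes))
    by_vector: Dict[Tuple[int, int], List[Note]] = defaultdict(list)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            by_vector[(b.onset - a.onset, b.pitch - a.pitch)].append(a)
    return {v: p for v, p in by_vector.items() if len(p) >= min_size}


# Toy data (made up): a three-note motif, its transposed repeat,
# and two unrelated notes that the first stage should remove.
motif = [Note(0, 60), Note(1, 64), Note(3, 62)]
repeat = [Note(n.onset + 8, n.pitch + 7) for n in motif]
accompaniment = [Note(4, 40), Note(5, 90)]

# Hypothetical scores standing in for model predictions.
toy_scores = {n: 0.9 for n in motif + repeat}
toy_scores.update({n: 0.1 for n in accompaniment})

kept = filter_motif_notes(motif + repeat + accompaniment, toy_scores.get)
patterns = find_repeated_patterns(kept)
```

Filtering first keeps the pairwise comparison in the second stage small and prevents unrelated notes from forming spurious repeated patterns, which is the motivation the framework gives for removing them.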

Here is a visualization of the motif discovery result for Beethoven's Piano Sonata No. 10 in G major, Op. 14, No. 2 (first movement, measures 8 to 18). The figure was plotted by Prof. Li Su, my post-doc mentor, who is now Deputy Director of the Institute of Information Science, Academia Sinica. The first row of the figure shows the musical score, the second row shows the ground-truth motif annotation in pianoroll format, the third row shows the result of a baseline motif discovery method from prior work, and the last row shows the result of the proposed method. Whereas the baseline picks up many incorrect motifs, the proposed method identifies exactly the two correct motifs (motifs d and k in the ground truth), though it still misses some of the notes in motif d (around measure 9).

Both Prof. Li Su and I would like to continue this work. Feel free to join us!

Music2Fail: Transfer Music to Failed Recorder Style

(by Chon In Leong, I-Ling Chung, Kin-Fong Chao, Jun-You Wang, Yi-Hsuan Yang, Jyh-Shing Roger Jang, in Proceedings of APSIPA 2024)

Paper link: https://ieeexplore.ieee.org/abstract/document/10849247/

This work proposes a special case of music style transfer: failed recorder style transfer. Put simply, we convert monophonic audio (only one note sounding at a time), performed on any instrument or sung, to the style of a "failed recorder", which sounds like this famous meme:

Although this task sounds weird, it is actually very challenging because it requires two conversions at once: timbre (from any instrument to the recorder) and performing style. The input audio may be a recording by a professional singer, but the output should sound like a completely "failed" recorder performance, whether the failure is deliberate or the result of genuinely poor playing. Proposing this task therefore challenges the common practice in evaluating music style transfer systems, which usually asks only for timbre transfer and ignores the actual performing style, particularly the skill level.

Well, at least that's what we claimed. Believe it or not...

Anyway, here are some demo audio clips from the proposed system, provided by Chon In Leong, the first author of the paper. The clip on the left is the input audio, and the clip on the right is the result of converting it to the failed recorder style with the proposed model.

idol_origin.wav

idol_gen.wav

haruhikage_origin.wav

haruhikage_gen.wav

Do the outputs sound like a "failed recorder" performance to you?

Actually, this work was originally the final project of Leong, Chung, and Chao (the first three authors of the paper) in Prof. Yi-Hsuan Yang's "深度學習於音樂分析及生成" (Deep Learning for Music Analysis and Generation) course at NTU, in the Fall 2023 semester (112-1). The course has been offered every fall semester since then, and students from other universities are welcome to enroll! I gave them some advice on how to organize and present this work, and also helped them write the paper.

I believe this is an interesting work that validates the ideas that I have always hoped to promote:

MIR research can be fun and interesting;
Even weird or less explored topics may actually have research value;
So find a topic that you find interesting; don't hesitate, just do it!
