UniSinger: Unified End-to-End Singing Voice Synthesis With Cross-Modality Information Matching
Introduction:
In this work, we propose UniSinger, a unified end-to-end singing voice synthesizer that integrates three singing voice generation abilities into a single framework: singing voice synthesis (SVS), singing voice conversion (SVC), and singing voice editing (SVE).
All experiments in the paper are conducted on the large-scale singing voice dataset OpenSinger. The audio samples below demonstrate performance on the SVS, SVC, and SVE tasks.
1 Singing Voice Synthesis
index | GT | GT (mel + HiFiGAN) | FastSpeech 2 + HiFiGAN | FastSpeech 2s | VISinger | UniSinger |
---|---|---|---|---|---|---|
#1 | ||||||
#2 | ||||||
#3 | ||||||
#4 | ||||||
#5 | ||||||
2 Singing Voice Conversion
2.1 Timbre Conversion
index | Source | Reference | Reference (mel + HiFiGAN) | SpeechFlow | UniSinger |
---|---|---|---|---|---|
#1 | |||||
#2 | |||||
#3 | |||||
#4 | |||||
#5 | |||||
#6 | |||||
2.2 Pitch Conversion
The fundamental frequencies of the samples presented below are multiplied by a constant factor of 0.8.
index | Reference | Reference (mel + HiFiGAN) | SpeechFlow | UniSinger |
---|---|---|---|---|
#1 | ||||
#2 | ||||
#3 | ||||
#4 | ||||
#5 | ||||
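The pitch shift described in Section 2.2 can be illustrated with a minimal sketch. It assumes the F0 contour is a list of per-frame values in Hz with 0 marking unvoiced frames; that convention, and the helper names `scale_f0` and `factor_to_semitones`, are hypothetical and are not taken from the UniSinger implementation. Multiplying F0 by 0.8 lowers the pitch by roughly 3.9 semitones.

```python
import math


def scale_f0(f0_hz, factor=0.8):
    """Multiply voiced F0 values by a constant factor.

    Assumption: unvoiced frames are encoded as 0.0 and are left unchanged.
    """
    return [f * factor if f > 0 else 0.0 for f in f0_hz]


def factor_to_semitones(factor):
    """Convert a frequency ratio to a pitch shift in semitones."""
    return 12.0 * math.log2(factor)


# Toy contour: unvoiced, A3, A4, unvoiced (illustrative values only).
contour = [0.0, 220.0, 440.0, 0.0]
shifted = scale_f0(contour, factor=0.8)
shift_in_semitones = factor_to_semitones(0.8)  # about -3.86
```

Scaling by a constant ratio keeps the melodic intervals between notes intact, since every note moves by the same number of semitones.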
3 Singing Voice Editing
Exp. 1:
original lyrics: 爱可以不问对错 ——
insertion: 爱怎么可以不问对错 ——
replacement: 爱怎么(可以)不问对错 —— k e | y i #) b u | w en # d ui | c uo
deletion: 爱(可以)不问对错 —— k e | y i #) b u | w en # d ui | c uo
GT | GT(Mel+PWG) | EditSinger(insertion) | EditSinger(replacement) | EditSinger(deletion) | UniSinger(insertion) | UniSinger(replacement) | UniSinger(deletion) |
---|---|---|---|---|---|---|---|
Exp. 2:
original lyrics: 你何苦非为他等在雨中 ——
insertion: 你何苦非为他傻傻等在雨中 ——
replacement: 你何苦非为他伫立风(等在雨)中 —— d eng # z ai # y u #) zh ong
deletion: 你(何苦非)为他等在雨中 —— h e | k u # f ei |) w ei # t a # d eng # z ai # y u # zh ong
GT | GT(Mel+PWG) | EditSinger(insertion) | EditSinger(replacement) | EditSinger(deletion) | UniSinger(insertion) | UniSinger(replacement) | UniSinger(deletion) |
---|---|---|---|---|---|---|---|
Exp. 3:
original lyrics: 几朵云在阴天忘了该往哪儿走 ——
insertion: 几朵孤独的云在阴天忘了该往哪儿走 ——
replacement: 几片叶(朵云)在阴天忘了该往哪儿走 —— d uo # y un #) z ai # y in | t ian # w ang # l e # g ai # w ang # n a | r # z ou
deletion: 几朵云(在阴天)忘了该往哪儿走 —— z ai # y in | t ian #) w ang # l e # g ai # w ang # n a | r # z ou
GT | GT(Mel+PWG) | EditSinger(insertion) | EditSinger(replacement) | EditSinger(deletion) | UniSinger(insertion) | UniSinger(replacement) | UniSinger(deletion) |
---|---|---|---|---|---|---|---|
Exp. 4:
original lyrics: 被吹进了左耳 ——
insertion: 被思念吹进了左耳 ——
replacement: 被传递到(吹进了)左耳 —— ch ui | j in # l e # ) z uo | er
deletion: 被吹进(了)左耳 —— l e #) z uo | er
GT | GT(Mel+PWG) | EditSinger(insertion) | EditSinger(replacement) | EditSinger(deletion) | UniSinger(insertion) | UniSinger(replacement) | UniSinger(deletion) |
---|---|---|---|---|---|---|---|
Exp. 5:
original lyrics: 在昏暗中的我 ——
insertion: 在那时昏暗中的我 ——
replacement: 在昏暗中与你(的我) —— d e # w o)
deletion: 在昏暗(中)的我 —— zh ong # ) d e # w o
GT | GT(Mel+PWG) | EditSinger(insertion) | EditSinger(replacement) | EditSinger(deletion) | UniSinger(insertion) | UniSinger(replacement) | UniSinger(deletion) |
---|---|---|---|---|---|---|---|
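The three edit types used in the experiments above can be sketched as plain operations on the lyric string. This is only an illustration of the notation, assuming that the parentheses in the replacement and deletion lines mark the original span being replaced or removed; the function names are hypothetical, and the sketch says nothing about how UniSinger or EditSinger regenerate the audio for the edited region.

```python
def insert_text(lyrics, pos, new_text):
    """Insert new_text before character index pos."""
    return lyrics[:pos] + new_text + lyrics[pos:]


def replace_span(lyrics, start, end, new_text):
    """Replace characters in [start, end) with new_text."""
    return lyrics[:start] + new_text + lyrics[end:]


def delete_span(lyrics, start, end):
    """Remove characters in [start, end)."""
    return lyrics[:start] + lyrics[end:]


# Exp. 1 lyrics; the edited span is assumed to be characters 1..2 ("可以").
original = "爱可以不问对错"
inserted = insert_text(original, 1, "怎么")
replaced = replace_span(original, 1, 3, "怎么")
deleted = delete_span(original, 1, 3)
```

Under this reading, the insertion result matches the insertion line of Exp. 1, while the replacement and deletion results correspond to the listed lines with the parenthesized span dropped.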