Yunyi-Liu (Rein)

I’m a recent PhD graduate in Electrical and Information Engineering from the University of Sydney. My PhD thesis is titled ‘Towards controllable neural audio synthesis: An exploration of sound effects creation with generative models’. My research focuses on improving the controllability of neural audio synthesis models for sound effects creation. Currently, I’m a research intern at Sony CSL working on Singing Voice Synthesis. Previously, I was a research intern at Dolby Laboratories in Beijing, working on text to object-based spatial audio generation with multi-modal large language models (MMLLMs) from 03.2025 to 06.2025.
My research interests are:
Machine Learning, Deep Learning, Generative AI, Controllable Generative Models, Audio Signal Processing, Time Series Prediction, Data Analysis, Sound Synthesis and Sound Design, Multi-Modal Large Audio-Language Models (MLLM), Human-Computer Interaction.
I come from a highly diverse background. Prior to my PhD, I received a B.A. in Music and Sound Design from the University of Technology Sydney and an M.A. in Interaction Design and Electronic Arts from the University of Sydney. You can access my portfolio from before 2021, when I was working as a media artist and sound designer. In 2023, I was a research intern at Dolby Laboratories Advanced Technology Group (ATG), working on DDSP for general audio synthesis and control.
Apart from work, I play and teach the theremin, one of the first electronic instruments. You can find my performances here.
Skill sets:
- Programming: Python, PyTorch, TensorFlow, Keras, MATLAB, Linux, SQL, JavaScript, HTML, Max/MSP
- Skills: Hugging Face, Docker, AWS (S3, EC2), Git, ONNX, Colab, Fine-tuning (LoRA, PEFT)
- Knowledge: Large Language Models (LLM), Generative Models (GAN, VAE, Diffusion), Text-to-Audio Generation, Multi-modal Networks, Controllable Content Generation (Latent Space Mapping, ControlNet), Bayes' Theorem, Model Distillation, Digital Signal Processing, Spatial Audio Processing, Real-time Audio Processing.