Music Dream Visualizer (MDV)

Contributors: Yifeng Liu, Hanzuo Liu, Rui Luo (IIIS, Tsinghua University)

Introduction

Welcome to Music Dream Visualizer! This is an LLM course project built by three undergraduate students that explores the potential of LLM-based song generation. The goal of the project is to compose a song from a single prompt given to the pipeline.

The whole pipeline looks like:

Get Started

To get started, set up the environment:

mkdir -p /root/project_output
sudo apt-get update
sudo apt-get install -y libsndfile1
conda create -n your_env_name python=3.10.13
conda activate your_env_name
pip install -r requirements.txt
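
As a quick sanity check after the setup commands, the following sketch reports whether the interpreter version and the libsndfile system library look usable. It is not part of the project; the function name check_environment and the minimum version tuple are assumptions made for illustration.

```python
import ctypes.util
import sys


def check_environment(min_python=(3, 10)):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper, not part of the MDV repository.
    """
    problems = []
    # The README creates the conda environment with Python 3.10.13.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is expected")
    # libsndfile1 is installed through apt-get in the setup steps.
    if ctypes.util.find_library("sndfile") is None:
        problems.append("libsndfile was not found; install libsndfile1")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it inside the conda environment; no output means both checks passed.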

Your directory structure should look like this:

|--root
    |--requirements.txt
    |--chat.py
    |--chat_gradio.py
    |--project_output
        |--gen_images
            |--0.jpg
            |--1.jpg
            |--2.jpg
        |--text_images
            |--0.jpg
            |--1.jpg
            |--2.jpg
        |--output.txt
        |--lyric_output.txt
        |--pitch_output.txt
        |--time_output.txt
        |--output.mp4
        |--final.mp4

Files in the project_output directory are generated during inference.
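
To see which of the files listed above are still missing after a run, a small helper like the one below can be used. This is a sketch, not code from the repository: the file and directory names are taken from the tree above, while the function name missing_outputs is an assumption, and nothing here reflects the order in which the pipeline actually writes its outputs.

```python
from pathlib import Path

# File names copied from the directory tree in this README.
EXPECTED_FILES = [
    "output.txt",
    "lyric_output.txt",
    "pitch_output.txt",
    "time_output.txt",
    "output.mp4",
    "final.mp4",
]
EXPECTED_IMAGE_DIRS = ["gen_images", "text_images"]


def missing_outputs(output_dir):
    """Return the expected outputs that are absent from output_dir.

    Hypothetical helper, not part of the MDV repository.
    """
    root = Path(output_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    for name in EXPECTED_IMAGE_DIRS:
        folder = root / name
        # An image directory counts as present once it holds at least one .jpg.
        if not folder.is_dir() or not any(folder.glob("*.jpg")):
            missing.append(name + "/")
    return missing


if __name__ == "__main__":
    print(missing_outputs("/root/project_output"))
```

An empty list means every file named in the tree exists; it does not verify that the contents are valid.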

To run the pipeline, you also need an OpenAI API key, the SongComposer model (Mar2Ding/songcomposer_sft on Hugging Face), and the DiffSinger model from https://github.com/MoonInTheRiver/DiffSinger/tree/master (SVS version, Opencpop dataset, link B). If you run this on Professor Xu's server, both models are already available in /ssdshare/.
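
The three prerequisites can be checked before launching the pipeline with a sketch like the one below. It rests on assumptions: this README does not say how chat.py reads the API key, so the variable name OPENAI_API_KEY (the default read by the openai Python client) is a guess, and the model directory paths are placeholders you must supply yourself. The function name preflight is hypothetical.

```python
import os
from pathlib import Path


def preflight(model_dirs, api_key_var="OPENAI_API_KEY", environ=None):
    """Return a list of missing prerequisites; empty means all were found.

    Hypothetical helper. model_dirs maps a label to a local directory,
    and api_key_var is an assumed environment variable name.
    """
    env = os.environ if environ is None else environ
    problems = []
    if not env.get(api_key_var):
        problems.append(f"environment variable {api_key_var} is not set")
    for label, path in model_dirs.items():
        if not Path(path).is_dir():
            problems.append(f"{label} model directory not found: {path}")
    return problems


if __name__ == "__main__":
    # Placeholder paths: replace them with your local model locations.
    dirs = {
        "SongComposer": "/path/to/songcomposer_sft",
        "DiffSinger": "/path/to/DiffSinger",
    }
    for problem in preflight(dirs):
        print("MISSING:", problem)
```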

Run The Pipeline

To run chat.py in the conda environment, use:

python chat.py

You may want to modify the prompt on line 532 of chat.py first.

To run chat_gradio.py in the conda environment, use:

python chat_gradio.py

You can then access the interface at http://127.0.0.1:7860
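
If the page does not load, it helps to confirm that something is listening on that address at all. The sketch below uses only the standard library; the function name is_listening is an assumption, and the default host and port are the ones quoted above.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of the MDV repository.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "up" if is_listening() else "not reachable"
    print(f"Interface on 127.0.0.1:7860 is {state}")
```

A True result only shows that the port accepts connections, not that the interface has finished loading the models.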

Some Details

Our current pipeline only supports generating Chinese songs.
To keep the SongComposer model running stably, tell GPT in the prompt to write no more than 10 sentences of lyrics.
Two demos are provided in the resources directory.
Setting up the environment may take about 30 minutes to 1 hour (depending on your network speed), and each inference takes 3 to 5 minutes. If a step takes much longer than that, check the output to find out which part went wrong.
DiffSinger is not fully stable and can still fail occasionally, for example by adding unexpected words, which may result in an output of the wrong length.
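
The 10-sentence limit can be enforced in two places: in the prompt text itself, and as a guard on the lyrics that come back. The sketch below illustrates both. It is not the code used in chat.py: the function names, the wording of the instruction, and the assumption that the lyrics contain one sentence per line are all hypothetical.

```python
MAX_SENTENCES = 10  # limit recommended in this README for SongComposer


def build_prompt(theme, max_sentences=MAX_SENTENCES):
    """Append the lyric-length constraint to a user theme (hypothetical helper)."""
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    return (
        f"{theme}\n"
        f"Write the lyrics in Chinese, using no more than "
        f"{max_sentences} sentences."
    )


def clamp_lyrics(lyrics, max_sentences=MAX_SENTENCES):
    """Keep at most max_sentences non-empty lines.

    Assumes one sentence per line, which may not match the real lyric format.
    """
    lines = [line.strip() for line in lyrics.splitlines() if line.strip()]
    return lines[:max_sentences]


if __name__ == "__main__":
    print(build_prompt("A song about a summer night by the sea"))
```

Clamping after generation is a fallback for cases where the model ignores the instruction in the prompt.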
