Improved Readme (#20)
* Added screenshot, updated plugin. Removed parameters.md

* Explained a bit the process of creating modelData files

* Added why not real-time section

* Addressed Tibor comments

* Added link to release

* Typo

* Removed points, typo
DamRsn authored Mar 24, 2023
1 parent bd280bf commit 80beecb
Showing 3 changed files with 54 additions and 116 deletions.
97 changes: 0 additions & 97 deletions Brainstorm UX UI/Parameters.md

This file was deleted.

Binary file added NeuralNote_UI.png
73 changes: 54 additions & 19 deletions README.md
@@ -9,6 +9,20 @@ your favorite Digital Audio Workstation.
- Lightweight and very fast transcription
- Can scale and time quantize transcribed MIDI directly in the plugin

## Install NeuralNote

Download the latest release for your platform [here](https://github.com/DamRsn/NeuralNote/releases) (Windows and
Mac (Universal) supported)!

Currently, only the raw `.vst3`, `.component` (Audio Unit), `.app` and `.exe` (Standalone) files are provided.
Installers will be created soon. In the meantime, you can manually copy the plugin/app file into the appropriate
directory. Also, the code is not yet signed (it will be soon), so you might have to authorize the plugin in your
security settings, as it currently comes from an unidentified developer.
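
For reference, each format has a conventional install folder per platform. The helper below is a hypothetical sketch (not part of NeuralNote) that maps the formats above to the standard per-user locations; double-check your DAW's documentation before copying files:

```python
import platform
from pathlib import Path
from typing import Optional

def default_plugin_dir(fmt: str, system: Optional[str] = None) -> Path:
    """Typical install location for a plugin format ('vst3' or 'component').

    Hypothetical helper: these are the standard OS conventions, not
    anything NeuralNote-specific.
    """
    system = system or platform.system()
    home = Path.home()
    if system == "Darwin":  # macOS
        dirs = {
            "vst3": home / "Library/Audio/Plug-Ins/VST3",
            "component": home / "Library/Audio/Plug-Ins/Components",
        }
    elif system == "Windows":
        dirs = {
            "vst3": Path("C:/Program Files/Common Files/VST3"),
        }
    else:
        raise ValueError(f"no standard plugin folder known for {system}")
    return dirs[fmt]
```

The standalone `.app`/`.exe` can live anywhere (e.g. `/Applications` or `Program Files`).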

## Usage

![UI](NeuralNote_UI.png)

NeuralNote comes as a simple AudioFX plugin (VST3/AU/Standalone app) to be applied on the track to transcribe.

The workflow is very simple:
@@ -19,26 +33,18 @@
- The midi transcription instantly appears in the piano roll section. Play with the different settings to adjust it.
- Export the MIDI transcription with a simple drag and drop from the plugin to a MIDI track.

**Watch our presentation video for the Neural Audio Plugin
competition [here](https://www.youtube.com/watch?v=6_MC0_aG_DQ)**.

NeuralNote internally uses the model from Spotify's [basic-pitch](https://github.com/spotify/basic-pitch). See
their [blogpost](https://engineering.atspotify.com/2022/06/meet-basic-pitch/)
and [paper](https://arxiv.org/abs/2203.09893) for more information. In NeuralNote, basic-pitch is run
using [RTNeural](https://github.com/jatinchowdhury18/RTNeural) for the CNN part
and [ONNXRuntime](https://github.com/microsoft/onnxruntime) for the feature part (Constant-Q transform calculation +
Harmonic Stacking).
As part of this project, [we contributed to RTNeural](https://github.com/jatinchowdhury18/RTNeural/pull/89) to add 2D
convolution support.

## Build from source

Use this when cloning:
@@ -65,23 +71,40 @@ with [ort-builder](https://github.com/olilarkin/ort-builder)) before calling CMake

#### IDEs

Once the build script has been executed at least once, you can load this project in your favorite IDE
(CLion/Visual Studio/VSCode/etc) and click 'build' for one of the targets.

## Reuse code from NeuralNote’s transcription engine

All the code to perform the transcription is in `Lib/Model` and all the model weights are in `Lib/ModelData/`. Feel free
to use only this part of the code in your own project! We'll try to isolate it more from the rest of the repo in the
future and make it a library.

The code to generate the files in `Lib/ModelData/` is not currently available as it required a lot of manual operations.
But here's a description of the process we followed to create those files:

- `features_model.onnx` was generated by converting a keras model containing only the CQT + Harmonic Stacking part of
the full basic-pitch graph using `tf2onnx` (with manually added weights for batch normalization).
- the `.json` files containing the weights of the basic-pitch CNN were generated from the TensorFlow.js model available
in the [basic-pitch-ts repository](https://github.com/spotify/basic-pitch-ts), then converted to ONNX with `tf2onnx`.
Finally, the weights were gathered manually into `.npy` files thanks to [Netron](https://netron.app/) and applied to a
split keras model created with [basic-pitch](https://github.com/spotify/basic-pitch) code.
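
For illustration, the `.npy` hand-off in the last step amounts to saving each gathered tensor to its own file and reloading it when rebuilding the split keras models. A minimal sketch of that round trip (the tensor name and shape here are made up, not the real basic-pitch weights):

```python
import os
import tempfile

import numpy as np

# Hypothetical example of the .npy hand-off: one file per gathered tensor.
conv_kernel = np.arange(72, dtype=np.float32).reshape(3, 3, 1, 8)

path = os.path.join(tempfile.mkdtemp(), "conv1_kernel.npy")
np.save(path, conv_kernel)   # dump the tensor read off in Netron
restored = np.load(path)     # reload it when rebuilding the split model

assert restored.dtype == np.float32
assert np.array_equal(conv_kernel, restored)
```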

The original basic-pitch CNN was split into 4 sequential models wired together, so they can be run with RTNeural.
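
The split itself is just function composition: cutting a chain of layers at three points yields four sub-models whose outputs feed the next one's input, so the chained result matches the unsplit network. A toy numpy sketch of the idea (the affine stages and split points are illustrative, not basic-pitch's actual layers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the four sub-models: each stage is a tanh-activated
# affine map here, where the real stages are CNN segments run by RTNeural.
weights = [rng.standard_normal((8, 8)) for _ in range(4)]

def run_full(x: np.ndarray) -> np.ndarray:
    # The original, unsplit network: all four stages in one pass.
    for w in weights:
        x = np.tanh(x @ w)
    return x

def run_split(x: np.ndarray) -> np.ndarray:
    # The same network as four chained sub-models wired together:
    # each sub-model's output becomes the next one's input.
    for stage in [lambda v, w=w: np.tanh(v @ w) for w in weights]:
        x = stage(x)
    return x

x = rng.standard_normal((1, 8))
assert np.allclose(run_full(x), run_split(x))  # same result either way
```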

## Roadmap

- Improve stability
- Save plugin internal state properly, so it can be loaded back when reentering a session
- Add tooltips
- Build a simple synth in the plugin so that one can listen to the transcription while playing with the settings, before
export
- Allow pitch bends on non-overlapping parts of overlapping notes
- Support transcription of mp3 files

## Bug reports and feature requests

If you have any request/suggestion concerning the plugin or encounter a bug, please file a GitHub issue. We'll do our
best to address it.

## Contributing

@@ -103,6 +126,18 @@ Here's a list of all the third party libraries used in NeuralNote and the licenses
- [basic-pitch](https://github.com/spotify/basic-pitch) (Apache-2.0 license)
- [basic-pitch-ts](https://github.com/spotify/basic-pitch-ts) (Apache-2.0 license)

## Could NeuralNote transcribe audio in real-time?

Unfortunately not, for a few reasons:

- Basic Pitch uses the Constant-Q transform (CQT) as its input feature. The CQT requires very long audio chunks (> 1 s)
to get amplitudes for the lowest frequency bins. This makes the latency too high for real-time transcription.
- The basic-pitch CNN has an additional latency of approximately 120 ms.
- Very few DAWs support plugins with audio input and MIDI output, as far as we know. This is partially why NeuralNote is
an Audio FX plugin (audio-to-audio) and why MIDI is exported via drag and drop.
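
The CQT constraint above can be made concrete with a quick estimate: resolving a bin centered at frequency f takes a window spanning roughly a fixed number of periods of f, so the lowest bins dominate the latency. A back-of-the-envelope sketch (both numbers below are illustrative, not basic-pitch's exact settings):

```python
# Rough CQT window-length estimate: a bin centered at f_min_hz needs a
# window of about n_cycles periods of that frequency to be resolved.
# n_cycles and the lowest-bin frequency are illustrative assumptions.
def cqt_window_seconds(f_min_hz: float, n_cycles: float = 35.0) -> float:
    return n_cycles / f_min_hz

window_s = cqt_window_seconds(32.7)  # ~C1 as the lowest bin: over 1 s
```

Halving the lowest frequency doubles the required window, which is why deep bass coverage and low latency are fundamentally at odds here.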

But if you have ideas, please share!

## Credits

NeuralNote was developed by [Damien Ronssin](https://github.com/DamRsn) and [Tibor Vass](https://github.com/tiborvass).
