Firstly, this issue straddles mir_eval and jams (+ @craffel, @bmcfee), but I think it fits better here.
Downbeat estimation --marking the beats that begin bars / measures-- can be seen as a form of beat tracking. In the wild, there are a few ways in which downbeats have been encoded in plaintext formats: (a) a single column of timestamps, in seconds; (b) two columns of timestamps and beat numbers (e.g. 1-4); or (c) two columns of timestamps and {measure}.{beat} values. Alternatively, JAMS has two different ways to represent this, through either the beat or position namespace; notably, in the former, a value of 1 indicates that the observed beat is a downbeat, and NaN (or any other number) is used to mark arbitrary beats.
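For concreteness, a few made-up rows in each of the three plaintext layouts (the timestamps are purely illustrative):

```
# (a) downbeat times only    # (b) time + beat number    # (c) time + measure.beat
0.50                         0.50  1                     0.50  1.1
2.50                         1.00  2                     1.00  1.2
                             1.50  3                     1.50  1.3
                             2.00  4                     2.00  1.4
                             2.50  1                     2.50  2.1
```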
Once two ordered lists of downbeats have been identified, the evaluation would cleanly follow the existing beat metrics. However, it's a bit unclear to me where the adapter layer should live that takes a beat annotation (which potentially represents anonymous beats, downbeats, or a mix of both) and slices out only the relevant observations.
Two options I see:
1. Add a downbeat.py module to mir_eval, which handles representation wrangling and wraps the evaluation machinery in beat.py.
2. Push this data wrangling off as the responsibility of the representation format, e.g. JAMS. Beat and downbeat annotations funnel through mir_eval.beat, and mir_eval doesn't really care which are passed in.
A third point worth mentioning falls somewhere between the two:
3. Revisit beat vs downbeat representations. Separate annotations seem like redundant information (all downbeats should be beats), but overloading a beat annotation isn't necessarily easy to work with.
Having sketched this out, the second option (2) is attractive from an mir_eval standpoint. However, it distributes the responsibility across the different formats, which makes it unlikely to converge on one good solution.
Thoughts? Has anyone already solved this problem and I've missed it in my travels?
I don't have a lot of bandwidth to discuss this now, but I can chime in with what I see as mir_eval's philosophical approach. For the metrics we implement, we assume there is a standardized format for the annotations (usually the MIREX format), and add a loader for it to io. Depending on which of the formats you list we decide is standard, adding an appropriate loader is probably simple thanks to io.load_delimited. Something like a loader that loads in beat values and extracts those beats marked as a downbeat (format (b)) seems reasonable to me. If something is not in that format (e.g. a JAMS file), we leave it up to the user to load things into the appropriate format to pass to the metrics (which is always low-level, e.g. Python and numpy primitives).
If all of the metrics for downbeat are exactly the same as for beat, I don't see any reason to add any additional modules/functionality beyond potentially a different loader. If there are additional metrics, they could go into a new downbeat module.
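For illustration, here's a rough sketch of what such a loader could look like, assuming format (b) above (whitespace-delimited "time  beat_number" rows) were adopted as the standard; the function name is made up:

```python
import numpy as np
import mir_eval

def load_downbeats(filename):
    """Load 'time beat_number' rows and keep only the beats numbered 1,
    i.e. the downbeats (hypothetical helper, format (b) assumed)."""
    times, beats = mir_eval.io.load_delimited(filename, [float, int])
    times, beats = np.array(times), np.array(beats)
    return times[beats == 1]

# The resulting arrays drop straight into the existing beat metrics:
# scores = mir_eval.beat.evaluate(load_downbeats('reference.txt'),
#                                 load_downbeats('estimated.txt'))
```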
Thanks, after looking at io, it seems like a method there (or maybe even in util) is the way to go. The trick is really taking loaded beat data and preserving only the downbeats.
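One possible util-style helper (the name is hypothetical), assuming beat values follow the JAMS beat convention described above, where 1 marks a downbeat and NaN (or any other number) marks an anonymous beat:

```python
import numpy as np

def downbeats_from_beats(beat_times, beat_values):
    """Slice downbeat times out of an already-loaded beat annotation
    (hypothetical helper; values follow the JAMS 'beat' convention)."""
    beat_times = np.asarray(beat_times, dtype=float)
    beat_values = np.asarray(beat_values, dtype=float)
    # NaN == 1 evaluates to False, so anonymous beats are dropped along
    # with beats 2-4; only the observations marked 1 survive.
    return beat_times[beat_values == 1]
```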
Curious to hear what others might have to say, but if I have any eureka moments I'll branch and PR. My primary goal in this is to develop a single, reusable solution that has (reasonably) broad buy-in for legitimacy, rather than countless DIY solutions.