Lastgenre: Fix track-level handling, multi-genre keep, force behaviour, logging #4982

Draft
wants to merge 37 commits into
base: master
from

Conversation

JOJ0
Member

@JOJ0 JOJ0 commented Oct 29, 2023

Description

Edit 2023-09: The original idea of this PR was:

Several fixes I had in the queue for months. Some of it required fixes in the library code which are through by now.

  • Fix the force option: Don't always overwrite comma-separated multi-genres, compile a list and keep what's in the whitelist.
  • Fix lastgenre -A in combination with the config option source: track. Tracks receive the album's genre even when this option is set.
    • When an album-level genre is already set, single tracks should not fall back to receiving the album's genre.
  • Adjust log-level and message when lastgenre handles tracks to look similar to when handling albums.

Edit 2023-09: Additional option keep_allowed

During review and discussions it turned out that besides the existing force option a second option would be required to really achieve a typical (expected) behaviour of a force option. This is what we came up with (copied over and slightly edited from #4982 (comment)):

Two config options, force and keep_allowed, i.e. 4 possible settings:

Case 1

Overwrite all. Only fresh last.fm genres remain.

force: yes
keep_allowed: no

Case 2

Add new last.fm genres when empty. Present tags stay untouched.

force: no
keep_allowed: no

Case 3

Add new last.fm genres. Keep whitelisted genres in present tags.

force: yes
keep_allowed: yes

Case 4 (default)

Add new last.fm genres when empty. Keep whitelisted genres in present tags.

force: no
keep_allowed: yes
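
The four combinations above can be read as a small decision table. The following is only a minimal sketch of that table, not the plugin's code: the function name `decide_genres` and its parameters are hypothetical, the fetched genres are assumed to be whitelist-filtered already, and fallbacks for the situation where last.fm returns nothing are not modelled.

```python
def decide_genres(existing, fetched, whitelist, force, keep_allowed):
    """Return the genre list to write for one album or track.

    existing  -- genres already in the tag (list of str, may be empty)
    fetched   -- fresh last.fm genres, assumed already whitelist-filtered
    whitelist -- set of lowercase allowed genre names
    """
    if not existing:
        # All four cases fill an empty tag with last.fm genres.
        return list(fetched)

    if force and not keep_allowed:
        # Case 1: overwrite all, only fresh last.fm genres remain.
        return list(fetched)
    if not force and not keep_allowed:
        # Case 2: present tags stay untouched.
        return list(existing)

    # keep_allowed is set from here on: drop non-whitelisted present genres.
    kept = [g for g in existing if g.lower() in whitelist]
    if force:
        # Case 3: kept genres first, then new ones, without duplicates.
        return list(dict.fromkeys(kept + list(fetched)))
    # Case 4: cleanup only. Assumption: if nothing survives the whitelist,
    # the tag counts as empty and receives the last.fm genres.
    return kept or list(fetched)
```

With a whitelist of `{"rock", "jazz"}`, an existing tag of `["Rock", "junk"]` and a fetched list of `["Jazz"]`, Case 3 would give `["Rock", "Jazz"]` and Case 4 would give `["Rock"]`.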

To Do

  • Documentation of new option and new default force behaviour.
  • Documentation: Clarify that -a is implicit
  • Changelog.
  • Fix existing tests.
  • Implement Case 1
  • Implement Case 2
  • Implement Case 3
  • Implement Case 4

@JOJ0 JOJ0 requested a review from sampsyo November 2, 2023 16:27
@JOJ0 JOJ0 marked this pull request as ready for review November 2, 2023 16:27
@JOJ0
Member Author

JOJ0 commented Nov 2, 2023

I'd request a review from you @sampsyo since I think you initially created it. Also @rain0r would be good since 5 years ago they added the -A option. Hi @rain0r , you wanna take a look? :-)

In short: I think I fixed the plugin to now really reflect what's documented. Any nitpicking in my code or functionality-wise is appreciated.

One question already. Here we do not state that a -a/--album option exists: https://beets.readthedocs.io/en/latest/plugins/lastgenre.html#running-manually

When I started out using this plugin I was confused for a very long time about this option. As far as I understand it now: it doesn't do anything since it is the default. So why keep it? Or is having a -a option that is the default anyway a common thing in beets? I know we have a lot of -a commands, which streamlines usability, and that is a very good thing! Usually they change behaviour to act on albums instead of items. I'm just not sure about this one... do we have such a pattern anywhere else? So, just leave it? Should I add some words to the docs?

I think both of you decided these options should look like that around here: #3220 (comment)

JOJ0 added a commit to JOJ0/beets that referenced this pull request Nov 2, 2023
@sampsyo
Member

sampsyo commented Nov 3, 2023

Thanks for the extra context, @JOJ0!

About the existence of -a (the default mode) specifically: it's not too uncommon… for example, the beet import command has several flags that are opposites of each other, one of which is the default. Of course, it's important in that case because the default mode can be set in the config, so the user needs a way to override the default in either direction. That's not the case here, so maybe it at least makes sense to add "(default)" to the -a option's help string, or to remove it altogether?

Member

@sampsyo sampsyo left a comment

Thanks for the ping!! Here are a couple of straightforward comments.

beetsplug/lastgenre/__init__.py (outdated, resolved)
beetsplug/lastgenre/__init__.py (outdated, resolved)
beetsplug/lastgenre/__init__.py (outdated, resolved)
@JOJ0 JOJ0 marked this pull request as draft November 8, 2023 08:08
@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from 1e81209 to 89ae925, on November 16, 2023 12:33
JOJ0 added a commit to JOJ0/beets that referenced this pull request Nov 16, 2023
@JOJ0
Member Author

JOJ0 commented Nov 17, 2023

I'd like to pull out this conversation #4982 (comment) into a new thread, to make it more obvious for others as well. I think it could be a broader discussion of where this plugin should go. Basically we were talking about the current force: no behaviour being weird as well as the new behaviour I am initially proposing with this PR. I gave all this some thought and came up with this idea. Let's discuss it:

So from my point of view, the main problem with the current behaviour when force is disabled, is that it's not really what a user would typically expect. So what could we do to make force: no more predictable?

The following idea would require a new config setting as well as a whole new branch of behaviour (Case 3):

Case 1

force: yes
overwrite all, only fresh last.fm genres remain

Case 2

force: no

keep any string in present genre tag, only write last.fm genres when empty

Case 3

force: yes
keep_allowed: yes

keep present genres when whitelisted and add new last.fm genres (this is a new branch of behaviour and needs to be coded; I think there are open feature requests for it. Update: Something was feature-requested, but it might not be exactly what I'm proposing here: #4750)

Case 4

force: no
keep_allowed: yes

cleanup only - keep present genres when whitelisted but don't add new last.fm genres; Only when genre is empty, add last.fm genres.

That last combination is weird though....but it's what I proposed for force:no before!

Which of these would now make sense to be the new default? The new force: no (Case 2) would be the least invasive IMO...

@sampsyo brainstorming request 🧐

@JOJ0 JOJ0 changed the title from "Lastgenre: fix track-level handling, fix multi-genre keep, streamline singleton log" to "Lastgenre: Fix track-level handling, multi-genre keep, force behaviour, logging" Nov 17, 2023
@JOJ0
Member Author

JOJ0 commented Nov 17, 2023

Some more context / cross-linking:

The initial reason why I got my hands dirty with this plugin was that I realised comma-separated multi-genres were not recognized: #4751 (comment)

Here @arsaboo requests a feature that goes in direction of Case 3 above: #4750

@arsaboo
Contributor

arsaboo commented Nov 17, 2023

So, we have two config options - force and keep_allowed, i.e., 4 options in all. Given that, keep_allowed is no in cases 1 and 2. Thus, here's a slightly modified behavior in the 4 cases above:

Case 1: overwrite all, only fresh last.fm genres remain

force: yes
keep_allowed: no

Case 2: Since keep_allowed is no, we only write last.fm genres when empty. There may be incorrect genres in pre-existing tags even after this, as this option is not touching pre-existing tags

force: no
keep_allowed: no

Case 3: keep present genres when whitelisted and add new last.fm genres

force: yes
keep_allowed: yes

Case 4: keep any string in the present genre tag; only write last.fm genres when empty. This will not touch pre-existing genre tags.

force: no
keep_allowed: yes

Thus, Case 4 seems like the best default choice. It does not affect existing genre tags and updates the empty ones. Case 3, on the other hand, is the most useful one (at least for me).

@sampsyo
Member

sampsyo commented Nov 17, 2023

This brainstorming honestly sounds great, y'all. It is indeed really weird that the force: no mode can still update old genres; keeping all nonempty genres seems like it should at least be an option. I feel less specific about what the default should be, but I like your idea about decoupling the two aspects of the behavior (when to override existing, nonempty data and what to do to old data) into two different options.

@JOJ0
Member Author

JOJ0 commented Nov 18, 2023

Ahem, I might be slow or too tired already. Which of those 4 cases is now different from my proposal, @arsaboo? Sorry, I must have missed it! Help! :-)

@arsaboo
Contributor

arsaboo commented Nov 18, 2023

Not different....just a little more explicit about the force and keep_allowed config options. I think we have an agreement about the options.

@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from fb9f58d to c12b26b, on September 17, 2024 16:34
@JOJ0 JOJ0 marked this pull request as ready for review September 17, 2024 16:38

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@JOJ0 JOJ0 marked this pull request as draft September 17, 2024 16:39
@JOJ0
Member Author

JOJ0 commented Sep 17, 2024

Hi @arsaboo! I finally found time to almost finish this PR. The general behaviour and the docs for the new config option combinations are done. If you want to, an "early" review would be super helpful. Since it has probably been a long time for you as well, I'd be interested in what you think when you read through the docs. Is it 100% clear what the force/keep_allowed options do? And, only if you have the time, some playing around to check that it really works that way would be great. Thanks a ton!

@arsaboo
Contributor

arsaboo commented Sep 17, 2024

@JOJ0 this is AWESOME 🎉🎉

The docs look reasonably clear. I will play with this. The debug logs are great to see what is going on.

@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from 796a3bf to a56098f, on October 31, 2024 14:47
JOJ0 added 2 commits December 15, 2024 15:02
When `lastgenre.source: track` is configured,

- `lastgenre -a` _should not_ fall back to the album-level genre (by
  making use of the with_album=False kwarg of the Library's get method).
- `lastgenre -a`, when finally storing the genres of _an album_, should
  _not_ also write the tracks' genres (by making use of the inherit=False
  kwarg of the Album's store method).
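
To illustrate the two keyword arguments named in this commit message, here is a small sketch using stand-in classes. `StubAlbum` and `StubItem` are hypothetical and are not beets code; they only mimic the behaviour the commit message describes (a track-level `get` that may fall back to the album, and an album-level `store` that may push values down to its tracks).

```python
class StubItem:
    """Hypothetical stand-in for a track; not the beets Item class."""

    def __init__(self, genre=None):
        self.genre = genre
        self.album = None

    def get(self, key, default=None, with_album=True):
        # Look at the track first; fall back to the album only if allowed.
        value = getattr(self, key, None)
        if value is None and with_album and self.album is not None:
            value = getattr(self.album, key, None)
        return default if value is None else value


class StubAlbum:
    """Hypothetical stand-in for an album; not the beets Album class."""

    def __init__(self, genre, items):
        self.genre = genre
        self.items = items
        for item in items:
            item.album = self

    def store(self, inherit=True):
        # Album values are copied to the tracks unless inherit=False.
        if inherit:
            for item in self.items:
                item.genre = self.genre


track = StubItem(genre=None)
album = StubAlbum(genre="Rock", items=[track])

with_fallback = track.get("genre")                       # album value leaks in
without_fallback = track.get("genre", with_album=False)  # track value only

album.genre = "Jazz"
album.store(inherit=False)  # the track's own genre is left alone
```

In this sketch the track keeps its empty genre after `album.store(inherit=False)`, which is the separation between album-level and track-level genres that the commit message asks for.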
@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch from 9108a13 to 1abdb5a on January 2, 2025 20:28
@JOJ0
Member Author

JOJ0 commented Jan 2, 2025

Hi @arsaboo, thanks for testing! Awesome! I'm sorry, I just force-pushed. Please reset your branch. I'm working hard on refactoring the tests for _get_genre() and fixing a couple of bugs I noticed. Please test with the current state and keep me posted if your issue remains. I will look into it! Thanks!

@arsaboo
Contributor

arsaboo commented Jan 2, 2025

Keep me posted. I just updated and tested the latest build and saw the exact same thing. I can test it when you have fixed the bugs. Let us get this done.

@JOJ0
Member Author

JOJ0 commented Jan 2, 2025

Try setting your separator to a simple comma @arsaboo. genre is not a multifield yet. It's a regular string field. I mean basically it should work with any unicode character but I never tried it that way.

@JOJ0
Member Author

JOJ0 commented Jan 2, 2025

I'm not sure if I'm writing your null character correctly. Is it Unicode U+0000? I just added a test case for it. It seems to work. You can run the test like this:

pytest 'test/plugins/test_lastgenre.py::test_get_genre'

This is the testcase:

# 13 - test with a null character as separator
(
    {
        "force": True,
        "keep_allowed": True,
        "source": "album",
        "whitelist": True,
        "separator": "\u0000",
    },
    "allowed genre",
    {
        "album": "another allowed genre",
    },
    ("allowed genre\u0000another allowed genre", "keep + album"),
),

  • first dict in the tuple is config options
  • second is the existing (whitelisted) tag: "allowed genre"
  • third is what lastgenre gives us: "another allowed genre"
  • fourth is the expected outcome: both tags get combined with the separator

The test runs fine, but I'm not sure if I wrote the null char correctly, as mentioned above ;-)
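
On the question of how the null character is written: in a Python string literal, `"\u0000"` is the single NUL character, the same as `"\x00"` or `chr(0)`. The short check below only demonstrates that and the join from the test case above; it does not touch the plugin.

```python
separator = "\u0000"

# All three spellings denote the same one-character string.
same_character = separator == "\x00" == chr(0)
length = len(separator)

# Joining and splitting with the separator round-trips the two genres.
combined = separator.join(["allowed genre", "another allowed genre"])
parts = combined.split(separator)
```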

@JOJ0
Member Author

JOJ0 commented Jan 2, 2025

> @JOJ0 I just tested this branch and wondered if I am doing things right. Here's my config:
>
> lastgenre:
>     auto: no
>     source: album
>     count: 5
>     separator: '\␀'
>     force: yes
>     keep_allowed: yes

What is whitelist set to?

@arsaboo
Contributor

arsaboo commented Jan 3, 2025

Updated to the latest branch and ran with the following config:

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ', '
    force: yes
    keep_allowed: yes
    whitelist: ~/.config/beets/genres/genres.txt

Log:

$ beet -v lastgenre
user configuration: /home/arsaboo/.config/beets/config.yaml
data directory: /home/arsaboo/.config/beets
plugin paths: /home/arsaboo/Downloads/whatlastgenre-master/plugin/beets/beetsplug
inline: adding item field initial_char
inline: adding item field folder
inline: adding item field clean_comments
inline: adding album field plex_matched
rewrite: adding template field artist Rahul Dev Burman
fetchart: google: Disabling art source due to missing key
fetchart: lastfm: Disabling art source due to missing key
Failed to set up Google AI: google.api_key not found
lyrics: Disabling google source: no API key configured.
Sending event: pluginload
library database: /home/arsaboo/.config/beets/musiclibrary.blb
library directory: /data/music
Sending event: library_opened
Parsed query: AndQuery([TrueQuery()])
Parsed sort: NullSort()
lastgenre: last.fm error: The artist you supplied could not be found
lastgenre: genre for album "Pagglait (Original Motion Picture Soundtrack)" (original): Filmi
Sending event: database_change
Sending event: database_change
...
Sending event: write
Sending event: after_write
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (original): Filmi
Sending event: database_change
...
Sending event: after_write
Sending event: cli_exit

It would be nice to see more logging so we can diagnose what is going on.

@arsaboo
Contributor

arsaboo commented Jan 3, 2025

genre after the run:
[image: screenshot of the genre field after the run]

@arsaboo
Contributor

arsaboo commented Jan 3, 2025

@JOJ0 see this PR for the multi-valued tags.

@JOJ0
Member Author

JOJ0 commented Jan 3, 2025

> @JOJ0 see this PR for the multi-valued tags.

I'm aware of this PR. The genre tag is not yet implemented as a multi-valued tag. What are you suggesting? I don't think that this lastgenre PR I'm working on is the right place to start implementing multi-valued tags. This should be done in another PR, otherwise this will never be finished LOL ;-)

@JOJ0
Member Author

JOJ0 commented Jan 3, 2025

I added some debug messages. Look for something like this:

lastgenre: _last_lookup receives: album, get_album, ('Dom & Roland', 'Industry')
lastgenre: _resolve_genres received: ['drum and bass', 'techstep', 'drum n bass', 'moving shadow', 'cyberpunk', 'electronic', 'british', 'electroclash', 'albums i own', 'dark', '90s', 'atmospheric', 'jump up', '1998', 'futuristic', 'records i own', 'lesser known yet streamable albums', 'hardstep', 'bereps sub-brazil albums', 'dom and roland', 'personal cd rip', 'cold trip']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['Drum And Bass', 'Techstep', 'Electronic', 'Electroclash']
lastgenre: fetch_genre returns (whitelist checked): Drum And Bass, Techstep, Electronic
lastgenre: _last_lookup returns: Drum And Bass, Techstep, Electronic
lastgenre: genre for album "Industry" (keep + album): Drum And Bass, Techstep, Electronic

Hope that helps, looking forward to seeing your logs :-)
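
The pipeline visible in these debug lines (raw last.fm tags, then a whitelist filter, then a shorter joined string) can be sketched roughly as below. This is an illustration only: `resolve_and_format` is a hypothetical helper, canonicalization is left out entirely, and `count=3` is an assumption chosen to match the three genres in the log output.

```python
def resolve_and_format(tags, whitelist, count, separator=", "):
    """Filter tags by whitelist, dedupe, truncate to count, title-case, join."""
    kept = []
    for tag in tags:
        name = tag.lower()
        if name in whitelist and name not in kept:
            kept.append(name)
    return separator.join(name.title() for name in kept[:count])


# A subset of the tags from the log above.
tags = [
    "drum and bass", "techstep", "cyberpunk", "electronic",
    "british", "electroclash", "albums i own",
]
whitelist = {"drum and bass", "techstep", "electronic", "electroclash"}

result = resolve_and_format(tags, whitelist, count=3)
# result is "Drum And Bass, Techstep, Electronic"
```

Tags such as "albums i own" never reach the final string because they are not in the whitelist, and "electroclash" is dropped by the count limit.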

@JOJ0
Member Author

JOJ0 commented Jan 3, 2025

Another hint. This log message here indicates that a "fallback" to the (original) value is the outcome of the main process _get_genre():

lastgenre: genre for album "Pagglait (Original Motion Picture Soundtrack)" (original): Filmi

So this means that neither a last.fm search on track-level, album-level, nor artist-level returned any "good" (valid and whitelisted) genres. Here is a code snippet with the 3 stages that seem to fail:

[image: code snippet showing the three lookup stages]

Also double-check that your genres.txt file has all the genres you want, especially after looking into the temporary log messages I added. The line lastgenre: _resolve_genres received ... should tell you if and what you actually get from last.fm (which is then filtered according to your whitelist and other settings).

@arsaboo
Contributor

arsaboo commented Jan 3, 2025

Now I am seeing some relevant information in the logs:

lastgenre: _last_lookup receives: album, get_album, ('Arijit Singh, Neelesh Mishra, Raftaar', 'Pagglait (Original Motion Picture Soundtrack)')
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Arijit Singh, Neelesh Mishra, Raftaar',)
lastgenre: last.fm error: The artist you supplied could not be found
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _resolve_genres received: ['Filmi']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['Filmi']
lastgenre: genre for album "Pagglait (Original Motion Picture Soundtrack)" (original): Filmi
lastgenre: _last_lookup receives: album, get_album, ('Various Artists', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Arijit Singh, Jonita Gandhi, Ranveer Singh, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Arijit Singh, Shreya Ghoshal, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Arijit Singh, Shreya Ghoshal, Amitabh Bhattacharya',)
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Arijit Singh, Shreya Ghoshal, Shadab Faridi, Altamash Faridi, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Darshan Raval, Bhoomi Trivedi, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Dev Negi, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Sachet Tandon, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Shadab Faridi, Altamash Faridi, Asees Kaur, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Shahid Mallya, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Shashwat Singh, Jonita Gandhi, Anand Bakshi, Santosh Anand',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Shreya Ghoshal, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, Sonu Nigam, Shilpa Rao, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam, TUSHAR JOSHI, Shreya Ghoshal, Amitabh Bhattacharya',)
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _resolve_genres received: ['Filmi']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['Filmi']
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (original): Filmi

I will test with a few more albums. I wonder why the LastFM matching is not so great (something we can work on in a separate PR). Similarly, we can work on the multi-valued tag later.

@JOJ0
Member Author

JOJ0 commented Jan 3, 2025

Just as a test, try disabling the whitelist 🤔

@arsaboo
Contributor

arsaboo commented Jan 3, 2025

I think it is because of the multiple artists being queried:

album, get_album, ('Arijit Singh, Neelesh Mishra, Raftaar', 'Pagglait (Original Motion Picture Soundtrack)')

The artist for the above album is Arijit Singh on LastFM. The two additional artists are causing the issue.

When I modified the artist for one of the albums to match the artist on LastFM, I saw:

lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['World Music', 'Desi']
lastgenre: fetch_genre returns (whitelist checked): World Music, Desi
lastgenre: _last_lookup returns: World Music, Desi
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (keep + artist): Filmi, World Music, Desi

We will have to update our search procedure to allow for such variations, e.g., search with one artist or only the album name, etc.
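
One way the "search with one artist" idea could look is sketched below. `artist_candidates` is a hypothetical helper, not part of the plugin, and it assumes that multiple artists are joined with ", " as in the log lines above. Artist names that themselves contain a comma would be split incorrectly, so this is a starting point only.

```python
def artist_candidates(artist, delimiter=", "):
    """Return the full artist string first, then each individual artist."""
    candidates = [artist]
    for part in artist.split(delimiter):
        part = part.strip()
        if part and part not in candidates:
            candidates.append(part)
    return candidates


multi = artist_candidates("Arijit Singh, Neelesh Mishra, Raftaar")
single = artist_candidates("Pritam")
```

A lookup could then try each candidate in order and stop at the first one for which last.fm returns usable genres.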

JOJ0 added 12 commits January 3, 2025 23:04
- Refactor and simplify logic of _get_genre()
- Add a config validation function.
- New default force: yes, keep_existing: yes (closest to original
  behaviour)
and decide to use the original default whitelist.  Some of the existing
tests do it that way as well.
we want to learn if maybe it's called wrongly, passed wrong data types,
etc.
When original genres were kept (keep_existing option), the final genre
count was "off". The reason was that reducing genres to that count is
handled in _resolve_genre which wasn't run.

- This fixes it by ensuring a run of _resolve_genre in
  _combine_and_label_genres.
- There is a small caveat though: New genres have been run through
  _resolve_genres already. When they are combined with the old ones,
  they run through it again. Let's take this into account for now and
  hope performance doesn't suffer too much.
- FIXME: Also fix the regression that even when _resolve_genre is run for a
  combined list of old/new genres, it was assumed the order is according
  to the canonicalization tree already. Now it's made sure that the
  _sort_by_depth method is always run (not only when prefer_specific is
  configured).
- FIXME: Add config validation for prefer_specific and canonical option
  combination. This was a (tiny) issue in the original plugin already
  and is now caught.
It seems self.whitelist being falsy is not reliable.
- Keep fetched genres as a list
- Split out reducing to count and joining to delimited string to
  separate method
- Leave a couple of temporary debug messages
- No idea where a missing separator (which is default) could
  happen...just set it explicitly.
- Since we now refactored fetch_genre to return a list, we can
  mock multiple fetched genres more easily.