Leave SoundWire IRQs enabled during device removal #5264

Closed

charleskeepax

Ok, I think I have mostly understood why the IRQs are disabled early, and this is probably about as good a solution as I can find to the problem. There is also one other small, unrelated fix that I found whilst working through things. I will also be sending up a small patch to the soundwire framework and some cs42l43 patches, all as part of this unbinding work, and I will CC you guys on those too.

@@ -1109,7 +1109,7 @@ static int sof_card_dai_links_create(struct snd_soc_card *card)
         }
 
         /* One per DAI link, worst case is a DAI link for every endpoint */
-        sof_dais = kcalloc(num_ends, sizeof(*sof_dais), GFP_KERNEL);
+        sof_dais = kcalloc(num_ends + 1, sizeof(*sof_dais), GFP_KERNEL);
Member

Should we not fix this at source, i.e. where it's being dereferenced?

Author

The initial intention was to reduce the number of parameters getting passed around. I guess the alternative fix would be something like:

diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
index 7ed49416c1a4..98b17d1e82f2 100644
--- a/sound/soc/intel/boards/sof_sdw.c
+++ b/sound/soc/intel/boards/sof_sdw.c
@@ -893,7 +893,7 @@ static int create_sdw_dailink(struct snd_soc_card *card,
 
 static int create_sdw_dailinks(struct snd_soc_card *card,
                               struct snd_soc_dai_link **dai_links, int *be_id,
-                               struct asoc_sdw_dailink *sof_dais,
+                               struct asoc_sdw_dailink *sof_dais, int num_sof_dais,
                               struct snd_soc_codec_conf **codec_conf)
 {
        struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card);
@@ -904,7 +904,8 @@ static int create_sdw_dailinks(struct snd_soc_card *card,
                intel_ctx->sdw_pin_index[i] = SOC_SDW_INTEL_BIDIR_PDI_BASE;
 
        /* generate DAI links by each sdw link */
-        while (sof_dais->initialised) {
+        i = 0;
+        while (i++ < num_sof_dais && sof_dais->initialised) {
                int current_be_id;
 
                ret = create_sdw_dailink(card, sof_dais, dai_links,
@@ -1170,7 +1171,7 @@ static int sof_card_dai_links_create(struct snd_soc_card *card)
        /* SDW */
        if (sdw_be_num) {
                ret = create_sdw_dailinks(card, &dai_links, &be_id,
-                                          sof_dais, &codec_conf);
+                                          sof_dais, num_ends, &codec_conf);
                if (ret)
                        goto err_end;
        }

Happy to go with that if we prefer, although personally I slightly prefer the approach of just having a spare sof_dais entry.
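
For readers following the two options, here is a minimal userspace sketch of the "spare entry" approach. It is not the driver code: `struct dailink_entry` and the `alloc_entries`, `fill_entries` and `count_initialised` helpers are hypothetical names made up for illustration, and `calloc()` stands in for `kcalloc()`. It only shows why one extra zeroed element lets a loop that stops on a clear `initialised` flag terminate inside the allocation even when every real entry is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct asoc_sdw_dailink; only the flag matters. */
struct dailink_entry {
	bool initialised;
	int id;
};

/*
 * Allocate room for num_ends entries plus one zeroed terminator, so a
 * loop that stops on !initialised never reads past the allocation.
 */
static struct dailink_entry *alloc_entries(size_t num_ends)
{
	return calloc(num_ends + 1, sizeof(struct dailink_entry));
}

/* Mark the first num_used entries as in use. */
static void fill_entries(struct dailink_entry *e, size_t num_ends,
			 size_t num_used)
{
	assert(num_used <= num_ends);

	for (size_t i = 0; i < num_used; i++) {
		e[i].initialised = true;
		e[i].id = (int)i;
	}
}

/* Walk the array the way the original loop does: until the flag is clear. */
static size_t count_initialised(const struct dailink_entry *e)
{
	size_t n = 0;

	while (e->initialised) {
		n++;
		e++;
	}

	return n;
}
```

With the array completely full, the walk stops on the extra element rather than on whatever follows the allocation. The alternative in the diff above bounds the loop with an explicit count instead, which costs an extra parameter but does not rely on the terminator being present.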

Collaborator

I prefer this version, with a comment to explain why we need the additional sof_dai.

Member

Yes, best to fix at source with a comment, otherwise new clients could make the same mistake.

Comment on lines +261 to 265
mutex_lock(&ctx->link_lock);
list_add_tail(&link->list, &ctx->link_list);
mutex_unlock(&ctx->link_lock);
devm_add_action_or_reset(&ldev->auxdev.dev, intel_link_list_del, link);
bus = &link->cdns->bus;
Member

There is a lot of locking here to manage the in-flight IRQs as we iterate the list. Have you tried to safely iterate/remove the list elements?

Author

I don't believe one can safely traverse a list that will be accessed from multiple threads without locking. There are _safe versions of the list macros, but they are only safe against accesses from within your own thread, say if the loop over the list might also modify the list.

That said the actual contention on this lock should be close to zero, since the list is only modified during probe and remove. The rest of the time the only thing accessing the list is the IRQ handler.
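
To illustrate the point about the _safe iterators, here is a small userspace sketch. It imitates a circular doubly linked list in plain C rather than using the kernel's list.h, and every name in it (`struct demo_node`, `demo_list_init`, `demo_add_tail`, `demo_del`, `demo_remove_even`, `demo_len`) is hypothetical. The loop saves the next pointer before the body runs, so the body may unlink and free the current node. That is the only protection this pattern gives; nothing in it guards against a second thread changing the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace imitation of a circular doubly linked list node. */
struct demo_node {
	struct demo_node *prev, *next;
	int id;
};

static void demo_list_init(struct demo_node *head)
{
	head->prev = head->next = head;
}

/* Allocate a node and append it at the tail; returns NULL on failure. */
static struct demo_node *demo_add_tail(struct demo_node *head, int id)
{
	struct demo_node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;

	n->id = id;
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;

	return n;
}

static void demo_del(struct demo_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/*
 * Remove and free every node with an even id. tmp holds the successor
 * before cur can be freed, which is what a "_safe" style loop provides.
 */
static int demo_remove_even(struct demo_node *head)
{
	struct demo_node *cur, *tmp;
	int removed = 0;

	assert(head != NULL);

	for (cur = head->next, tmp = cur->next; cur != head;
	     cur = tmp, tmp = cur->next) {
		if (cur->id % 2 == 0) {
			demo_del(cur);
			free(cur);
			removed++;
		}
	}

	return removed;
}

static int demo_len(const struct demo_node *head)
{
	int n = 0;

	for (const struct demo_node *cur = head->next; cur != head;
	     cur = cur->next)
		n++;

	return n;
}
```

If another thread could unlink the saved successor between iterations, the loop would follow a stale pointer, which is why a lock is still needed when an IRQ handler and probe/remove share the list.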

Member

Sorry, I wondered if you were able to safely remove the device from the list before device->remove() and device->irq(), but it seems not in this case.

Author

Trouble is the IRQ is part of communicating with the device, and one generally wants to communicate with the device whilst doing a remove to park the device in a sensible state. The simplest way of avoiding a clash on the list is to mask the IRQ before the remove, which is in fact what the code used to do. But this causes us problems as we can't communicate with the device anymore.

@lgirdwood
Member

@ujfalusi any comments before @charleskeepax posts upstream?

The code uses the initialised member of the asoc_sdw_dailink struct to
determine if a member of the array is in use. However, if the array is
completely full this will lead to an access one past the end of the
array. Expand the array by one entry to include space for a
terminator.

Fixes: 27fd36a ("ASoC: Intel: sof-sdw: Add new code for parsing the snd_soc_acpi structs")
Signed-off-by: Charles Keepax <[email protected]>

Currently the auxiliary device for the link disables IRQs before it
calls sdw_bus_master_delete(). This has the side effect that
none of the devices on the link can access their own registers whilst
their remove functions run, because the IRQs are required for bus
transactions to function. Obviously, devices should be able to access
their own registers during disable to park the device suitably.

It would appear the reason for disabling the IRQs is that the IRQ
handler iterates through a linked list of all the links. Once a link is
removed, the memory pointed at by this linked list is freed, but the
entry is not removed from the linked list itself. Add a list_del() for
the linked list item. Note that whilst the list itself is contained in
the intel_init portion of the code, the list removal needs to be
attached to the auxiliary device for the link, since that owns the
memory that the list points at. Locking is also required to ensure the
IRQ handler runs before or after any additions/removals from the list.

Signed-off-by: Charles Keepax <[email protected]>
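
As a rough userspace analogy of the ordering this commit message describes, the sketch below unlinks an entry under a lock before freeing it, and has a pretend handler walk the list under the same lock. It is only an illustration under stated assumptions: `struct demo_link`, `struct demo_ctx` and the `demo_*` functions are hypothetical, a pthread mutex stands in for `ctx->link_lock`, a singly linked list replaces the kernel list, and none of it is the SoundWire driver's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-link structure; "pending" is a pretend IRQ status. */
struct demo_link {
	struct demo_link *next;
	int pending;
};

struct demo_ctx {
	pthread_mutex_t lock;		/* plays the role of ctx->link_lock */
	struct demo_link *links;	/* singly linked for brevity */
};

static void demo_ctx_init(struct demo_ctx *ctx)
{
	pthread_mutex_init(&ctx->lock, NULL);
	ctx->links = NULL;
}

/* Probe side: allocate a link and publish it on the list under the lock. */
static struct demo_link *demo_link_add(struct demo_ctx *ctx, int pending)
{
	struct demo_link *link = calloc(1, sizeof(*link));

	if (!link)
		return NULL;

	link->pending = pending;

	pthread_mutex_lock(&ctx->lock);
	link->next = ctx->links;
	ctx->links = link;
	pthread_mutex_unlock(&ctx->lock);

	return link;
}

/* Remove side: unlink under the lock first, only then free the memory. */
static void demo_link_remove(struct demo_ctx *ctx, struct demo_link *link)
{
	struct demo_link **pp;

	assert(link != NULL);

	pthread_mutex_lock(&ctx->lock);
	for (pp = &ctx->links; *pp; pp = &(*pp)->next) {
		if (*pp == link) {
			*pp = link->next;
			break;
		}
	}
	pthread_mutex_unlock(&ctx->lock);

	free(link);
}

/*
 * Pretend handler: walks the list under the same lock, so it runs either
 * entirely before or entirely after any addition or removal.
 */
static int demo_handle_irq(struct demo_ctx *ctx)
{
	int handled = 0;

	pthread_mutex_lock(&ctx->lock);
	for (struct demo_link *l = ctx->links; l; l = l->next) {
		if (l->pending) {
			l->pending = 0;
			handled++;
		}
	}
	pthread_mutex_unlock(&ctx->lock);

	return handled;
}
```

Because the entry leaves the list before its memory is released, the handler can stay enabled during removal without ever following a pointer to freed memory.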
@charleskeepax
Author

Do you guys want me to push these patches upstream myself, or would you rather take them through your tree? And if you do want me to push them, I assume I am ok to make the approvals a Reviewed-by tag? Sorry never quite sure what is easiest for you guys :-)

@bardliao
Collaborator

Do you guys want me to push these patches upstream myself, or would you rather take them through your tree? And if you do want me to push them, I assume I am ok to make the approvals a Reviewed-by tag? Sorry never quite sure what is easiest for you guys :-)

@charleskeepax I think it would be faster if you push these patches upstream yourself. And yes, please feel free to add our Reviewed-by tags.

@bardliao bardliao closed this Dec 11, 2024
@charleskeepax
Author

Sent upstream now, will do any further review/changes there.
