diff --git a/automations/all_actions.rst b/automations/all_actions.rst
index dadff78a62..4ab5a52b5e 100644
--- a/automations/all_actions.rst
+++ b/automations/all_actions.rst
@@ -36,6 +36,7 @@
 - **micro_wake_word:** ``start``, ``stop``
 - **microphone:** ``capture``, ``stop_capture``
 - **midea_ac:** ``beeper_off``, ``beeper_on``, ``display_toggle``, ``follow_me``, ``power_off``, ``power_on``, ``power_toggle``, ``swing_step``
+- **mixer_speaker:** ``apply_ducking``
 - **mqtt:** ``publish``, ``publish_json``
 - **number:** ``decrement``, ``increment``, ``operation``, ``set``, ``to_max``, ``to_min``
 - **output:** ``set_level``, ``turn_off``, ``turn_on``
diff --git a/components/speaker/mixer.rst b/components/speaker/mixer.rst
new file mode 100644
index 0000000000..124df1912a
--- /dev/null
+++ b/components/speaker/mixer.rst
@@ -0,0 +1,75 @@
+Mixer Speaker
+=============
+
+.. seo::
+    :description: Instructions for setting up mixer speakers in ESPHome.
+    :image: mixer.svg
+
+The ``mixer`` speaker platform mixes audio sent to several source speakers into a single output stream, which is sent to another :doc:`speaker component </components/speaker/index>`. Individual source speakers may be ducked (made quieter) with the :ref:`apply ducking action <mixer_speaker-apply_ducking>`.
+
+All audio streams being mixed must have the same sample rate. If the sample rates differ, either enable queue mode so that only one source speaker's audio is played at a time, or use a :doc:`resampler speaker </components/speaker/resampler>` to convert the audio before it is sent to the source speakers.
+
+This platform only works on ESP32-based chips.
+
+.. warning::
+
+    Audio and voice components consume a significant amount of resources (RAM and CPU) on the device.
+
+    **Crashes are likely to occur** if you include too many additional components in your device's
+    configuration. In particular, Bluetooth/BLE components are known to cause issues when used in
+    combination with Voice Assistant and/or other audio components.
+
+.. code-block:: yaml
+
+    # Example configuration entry
+    speaker:
+      - platform: mixer
+        output_speaker: speaker_id
+        source_speakers:
+          - id: announcement_mixer_input_speaker_id
+          - id: media_mixer_input_speaker_id
+
+Configuration variables:
+------------------------
+
+- **output_speaker** (**Required**, :ref:`config-id`): The :doc:`speaker </components/speaker/index>` to output the mixed audio.
+- **source_speakers** (**Required**, list): A list of source speaker inputs. Must have at least 2 and at most 8 speakers.
+
+    - **buffer_duration** (*Optional*, :ref:`config-time`): The duration of the internal ring buffer. Larger values can reduce stuttering but use more memory. Defaults to ``100ms``.
+    - **timeout** (*Optional*, :ref:`config-time`): How long to wait after finishing playback before releasing the bus. Set to ``never`` to never stop the speaker due to a timeout. Defaults to ``500ms``.
+    - All other options from :ref:`Speaker Component <config-speaker>`.
+
+- **num_channels** (*Optional*, positive integer): The number of audio channels to send to the output speaker. Either ``1`` or ``2``. Defaults to the output speaker's number of channels.
+- **queue_mode** (*Optional*, boolean): Enables queue mode. If enabled, audio is not mixed; instead, each source speaker's audio is played in turn, starting with the first listed source speaker.
+- **task_stack_in_psram** (*Optional*, boolean): Only with ``esp-idf``. Run the audio tasks in external memory. Defaults to ``false``.
+
+
+Automations
+-----------
+
+.. _mixer_speaker-apply_ducking:
+
+``mixer_speaker.apply_ducking`` Action
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This action ducks (reduces the volume of) the media stream played through the selected source speaker.
+
+.. code-block:: yaml
+
+    on_...:
+      - mixer_speaker.apply_ducking:
+          id: media_mixer_source_speaker_id
+          decibel_reduction: 20
+          duration: 2.0s
+
+Configuration variables:
+
+- **decibel_reduction** (**Required**, int, templatable): The reduction of the media stream in decibels. Must be between 0 and 50.
+- **duration** (*Optional*, :ref:`config-time`, templatable): The length of time to transition between the current reduction level and the new reduction level. Defaults to ``0s``.
+
+
+See also
+--------
+
+- :doc:`index`
+- :ghedit:`Edit`
diff --git a/components/speaker/resampler.rst b/components/speaker/resampler.rst
new file mode 100644
index 0000000000..5fa23a7962
--- /dev/null
+++ b/components/speaker/resampler.rst
@@ -0,0 +1,52 @@
+Resampler Speaker
+==================
+
+.. seo::
+    :description: Instructions for setting up resampler speakers in ESPHome.
+    :image: waveform.svg
+
+The ``resampler`` speaker platform allows you to convert the sample rate of an audio stream and output it to another :doc:`speaker component </components/speaker/index>`.
+
+If the audio stream doesn't require resampling, it is automatically sent directly to the output speaker.
+
+This platform only works on ESP32-based chips.
+
+.. warning::
+
+    Audio and voice components consume a significant amount of resources (RAM and CPU) on the device.
+
+    **Crashes are likely to occur** if you include too many additional components in your device's
+    configuration. In particular, Bluetooth/BLE components are known to cause issues when used in
+    combination with Voice Assistant and/or other audio components.
+
+.. code-block:: yaml
+
+    # Example configuration entry
+    speaker:
+      - platform: resampler
+        output_speaker: output_speaker_id
+        sample_rate: 48000
+
+Configuration variables:
+------------------------
+
+- **output_speaker** (**Required**, :ref:`config-id`): The :doc:`speaker </components/speaker/index>` to output the resampled audio.
+- **buffer_duration** (*Optional*, :ref:`config-time`): The duration of the internal ring buffer. Larger values may reduce stuttering but use more memory. Defaults to ``500ms``.
+- **bits_per_sample** (*Optional*, positive integer): The audio sample bit depth after resampling. Defaults to the output speaker's bits per sample.
+- **sample_rate** (*Optional*, positive integer): Sample rate to convert to. Must be between ``8000`` and ``48000``. Defaults to the output speaker's sample rate.
+- **filters** (*Optional*, positive integer): The number of windowed sinc interpolation filters to use. Must be between ``2`` and ``1024``. Defaults to ``16``.
+- **taps** (*Optional*, positive integer): The number of taps per windowed sinc interpolation filter. Must be between ``16`` and ``128`` and divisible by 4. Defaults to ``16``.
+- **task_stack_in_psram** (*Optional*, boolean): Only with ``esp-idf``. Run the audio tasks in external memory. Defaults to ``false``.
+- All other options from :ref:`Speaker Component <config-speaker>`.
+
+Improving quality
+-----------------
+
+Resampling is processor intensive and should be avoided where possible. The audio quality is affected by the number of filters and the number of taps. Increasing the number of filters increases memory use, while increasing the number of taps increases CPU load.
+
+See also
+--------
+
+- `ART Audio Resampler (GitHub) <https://github.com/dbry/audio-resampler>`__
+- :doc:`index`
+- :ghedit:`Edit`
diff --git a/images/mixer.svg b/images/mixer.svg
new file mode 100644
index 0000000000..8102917b17
--- /dev/null
+++ b/images/mixer.svg
@@ -0,0 +1 @@
+<svg viewBox="0 0 88 25" id="svg5" xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg"><defs id="defs9"/><path d="M5 0H83a5 5 0 015 5v15a5 5 0 01-5 5H5a5 5 0 01-5-5V5a5 5 0 015-5z" style="fill:#000" id="path2"/><g aria-label="Mixer" id="component-text" style="font-weight:900;font-size:25px;font-family:Montserrat;letter-spacing:1.1px;fill:#fffffc"><path d="M6.425 21V3.5h4.85l7 11.425h-2.55L22.525 3.5h4.85L27.425 21H22.05L22 11.6h.85L18.2 19.425H15.6L10.75 11.6H11.8V21z" id="path11"/><path d="M31.200012 21V7.325h5.65V21zm2.825-14.775q-1.55.0-2.475-.825-.925-.825-.925-2.05.0-1.225.925-2.05.925-.825 2.475-.825 1.55.0 2.475.775.925.775.925 2 0 1.3-.925 2.15-.925.825-2.475.825z" id="path13"/><path d="m38.975016 21 6.45-8.45-.15 3.25-6.15-8.475h6.375l3.025 4.525-2.35.175 3.325-4.7h5.925l-6.175 8.2V12.4l6.3 8.6h-6.475l-3.125-4.825 2.375.325-3.225 4.5z" id="path15"/><path d="m64.850045 21.25q-2.5.0-4.375-.925-1.85-.925-2.875-2.525-1.025-1.625-1.025-3.65.0-2.075 1-3.675 1.025-1.6 2.775-2.5 1.775-.9 3.975-.9 2.025.0 3.725.8 1.725.8 2.75 2.375 1.05 1.575 1.05 3.9.0.3-.025.675-.025.35-.05.65h-10.525V12.75h7.525l-2.125.725q0-.8-.3-1.35-.275-.575-.775-.875-.5-.325-1.2-.325t-1.225.325q-.5.3-.775.875-.275.55-.275 1.35v.85q0 .875.35 1.5t1 .95q.65.3 1.575.3.95.0 1.55-.25.625-.25 1.3-.75l2.95 2.975q-1 1.075-2.475 1.65-1.45.55-3.5.55z" id="path17"/><path d="M74.725044 21V7.325h5.375v4.125l-.875-1.175q.625-1.6 2-2.4 1.375-.8 3.3-.8v5q-.375-.05-.675-.075-.275-.025-.575-.025-1.275.0-2.1.675-.8.65-.8 2.275V21z" id="path19"/></g></svg>
\ No newline at end of file
diff --git a/images/waveform.svg b/images/waveform.svg
new file mode 100644
index 0000000000..a7e632f823
--- /dev/null
+++ b/images/waveform.svg
@@ -0,0 +1 @@
+<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" id="mdi-waveform" width="24" height="24" viewBox="0 0 24 24"><path d="M22 12L20 13L19 14L18 13L17 16L16 13L15 21L14 13L13 15L12 13L11 17L10 13L9 22L8 13L7 19L6 13L5 14L4 13L2 12L4 11L5 10L6 11L7 5L8 11L9 2L10 11L11 7L12 11L13 9L14 11L15 3L16 11L17 8L18 11L19 10L20 11L22 12Z" /></svg>
\ No newline at end of file
diff --git a/index.rst b/index.rst
index 5aa903f660..5c5610ee10 100644
--- a/index.rst
+++ b/index.rst
@@ -1051,6 +1051,8 @@ Speaker Components
 
     Speaker Core, components/speaker/index, speaker.svg, dark-invert
     I2S Speaker, components/speaker/i2s_audio, i2s_audio.svg
+    Mixer Speaker, components/speaker/mixer, mixer.svg
+    Resampler Speaker, components/speaker/resampler, waveform.svg
 
 Switch Components
 -----------------