wav headers for PCM streaming doesn't work #894
Comments
It may be that the RIFF header is not always read correctly. Do you have an example for me? |
Thank you for your response. I’ve done more testing and found the following.
Sending the complete audio as one WAV buffer works:
const fullAudioBuffer = Buffer.concat(chunks);
const wavHeader = generateWavHeader(fullAudioBuffer.length, 24000, 1, 16);
const fullWavBuffer = Buffer.concat([wavHeader, fullAudioBuffer]);
Sending the header first and then streaming the PCM chunks does not:
// First send WAV header
const wavHeader = generateWavHeader(100000, 24000, 1, 16); // Estimated size
res.write(wavHeader);
// Then stream PCM chunks
audioStream.write(audioData); // Raw PCM chunks
This suggests the issue isn't just about the sample rate, but about how the library handles the RIFF header during streaming. Could you advise on the correct way to maintain the WAV/RIFF structure during streaming? Or should we handle streaming PCM differently? |
The sample rate is not changed within a file. If you have a data stream, have a look at the first RIFF header, where you will find the file length. You will probably find another RIFF header at the end of the first file. |
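To check what a receiver would find in the first RIFF header of a stream, the header fields can be read back on the server before sending. The following is only a sketch: `parseWavHeader` is a hypothetical helper (it is not part of the ESP32 library), and it assumes the canonical 44-byte PCM layout with the `fmt ` chunk at offset 12 and the `data` chunk at offset 36.

```javascript
// Hypothetical helper, not part of the ESP32 library or of the code in this thread.
// Assumes a canonical 44-byte PCM WAV header ("fmt " at offset 12, "data" at offset 36).
function parseWavHeader(buf) {
  if (buf.length < 44) {
    throw new Error("need at least 44 bytes for a canonical WAV header");
  }
  const riffTag = buf.toString("ascii", 0, 4);
  const waveTag = buf.toString("ascii", 8, 12);
  if (riffTag !== "RIFF" || waveTag !== "WAVE") {
    throw new Error("not a RIFF/WAVE header");
  }
  return {
    riffSize: buf.readUInt32LE(4),      // declared file length minus 8 bytes
    audioFormat: buf.readUInt16LE(20),  // 1 = uncompressed PCM
    channels: buf.readUInt16LE(22),
    sampleRate: buf.readUInt32LE(24),
    byteRate: buf.readUInt32LE(28),
    blockAlign: buf.readUInt16LE(32),
    bitsPerSample: buf.readUInt16LE(34),
    dataSize: buf.readUInt32LE(40),     // declared length of the PCM payload
  };
}
```

Calling `parseWavHeader(firstChunk.subarray(0, 44))` on the first bytes written to the response shows whether the declared sample rate and data length match what is actually streamed afterwards.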
I tried starting the stream with one RIFF header at 24kHz, but the ESP32 still defaults to 44.1kHz during playback, causing the audio to play too fast. Based on your last comment, could you clarify:
|
It may be that there was a problem with wav mono. I had previously doubled the individual samples for the channels. This is now done by the I2S itself. Hope this works better and you can play your stream at the right speed. |
Thanks for your recent update to the library for handling mono playback. I've tried a few approaches to get the ESP32 to recognize the 24kHz sample rate for streaming, but so far, it’s still defaulting to 44.1kHz even with this new update. I have tried the following 3 things -
Please help! I have exhausted all options here. Here are the 3 different ways (in code) I tried to send the RIFF header.
1. Header sent once, followed by raw PCM:
const wavHeader = generateWavHeader(4800, 24000, 1, 16);
res.write(wavHeader);
audioStream.write(audioData);
2. Header prepended to the first chunk only:
if (isFirstChunk) {
  const wavHeader = generateWavHeader(audioData.length, 24000, 1, 16);
  const fullFirstChunk = Buffer.concat([wavHeader, audioData]);
  audioStream.write(fullFirstChunk); // Send combined header + first chunk
  isFirstChunk = false;
} else {
  audioStream.write(audioData); // Send subsequent chunks as raw PCM
}
3. Header prepended to every chunk:
const wavHeader = generateWavHeader(audioData.length, 24000, 1, 16);
const chunkWithHeader = Buffer.concat([wavHeader, audioData]);
audioStream.write(chunkWithHeader); // Send header with every chunk
Also note that if I combine all chunks and then stream the result, the ESP32 reads the RIFF header and changes the sample rate correctly. Here's the code for that:
const fullAudioBuffer = Buffer.concat(chunks);
const wavHeader = generateWavHeader(fullAudioBuffer.length, 24000, 1, 16);
const fullWavBuffer = Buffer.concat([wavHeader, fullAudioBuffer]);
res.setHeader('Content-Type', 'audio/wav');
res.setHeader('Content-Length', fullWavBuffer.length);
res.write(fullWavBuffer);
The WAV header function:
function generateWavHeader(dataSize, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
const byteRate = sampleRate * channels * bitsPerSample / 8;
const blockAlign = channels * bitsPerSample / 8;
const header = Buffer.alloc(44);
header.write("RIFF", 0);
header.writeUInt32LE(36 + dataSize, 4);
header.write("WAVE", 8);
header.write("fmt ", 12);
header.writeUInt32LE(16, 16);
header.writeUInt16LE(1, 20);
header.writeUInt16LE(channels, 22);
header.writeUInt32LE(sampleRate, 24);
header.writeUInt32LE(byteRate, 28);
header.writeUInt16LE(blockAlign, 32);
header.writeUInt16LE(bitsPerSample, 34);
header.write("data", 36);
header.writeUInt32LE(dataSize, 40);
return header;
} |
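When the total length is not known at the moment the header is written, one convention used by some encoders for non-seekable output is to fill both size fields with the placeholder `0xFFFFFFFF` and send the header exactly once. The sketch below illustrates that idea only. `generateStreamingWavHeader` and `writeWavStream` are hypothetical names, and whether the ESP32 library accepts placeholder sizes is an assumption that this thread does not confirm.

```javascript
// Hypothetical sketch: header for a stream of unknown length.
// Assumption: the receiver tolerates 0xFFFFFFFF as an "unknown size" placeholder.
const UNKNOWN_SIZE = 0xffffffff;

function generateStreamingWavHeader(sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = (channels * bitsPerSample) / 8;
  const byteRate = sampleRate * blockAlign;
  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(UNKNOWN_SIZE, 4);   // RIFF chunk size: unknown
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);            // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);             // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(UNKNOWN_SIZE, 40);  // data chunk size: unknown
  return header;
}

// Writes the header once and then every PCM chunk to the same destination.
// `writable` is anything with a write(buffer) method, e.g. an HTTP response.
function writeWavStream(writable, pcmChunks, sampleRate = 24000) {
  writable.write(generateStreamingWavHeader(sampleRate));
  for (const chunk of pcmChunks) {
    writable.write(chunk);
  }
}
```

Writing the header and the PCM chunks to the same destination keeps them in a single byte stream, so the receiver sees exactly one RIFF header followed by contiguous sample data.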
This issue is stale because it has been open for 30 days with no activity. |
I am having trouble even playing audio from
It is 24000 Hz PCM data generated by ElevenLabs, and before saving the file I added the WAV header:
import wave
from contextlib import closing
from io import BytesIO

def create_wav_buffer(data: bytes, sample_width: int, sample_rate: int) -> BytesIO:
    wav_buffer = BytesIO()
    with closing(wave.open(wav_buffer, "wb")) as wav_file:
        wav_file.setnchannels(1)  # Mono
        wav_file.setsampwidth(sample_width // 8)  # bits -> bytes per sample
        wav_file.setframerate(sample_rate)  # sample rate in Hz
        wav_file.writeframes(data)
    wav_buffer.seek(0)
    return wav_buffer

wav_buffer = create_wav_buffer(data=audio_data, sample_width=16, sample_rate=24000) |
This issue is stale because it has been open for 30 days with no activity. |
When streaming WAV format:
OpenAI sends 24kHz PCM chunks
We add WAV headers specifying 24kHz
Library reads 24kHz from WAV header
But plays too fast
Same library works correctly for:
Complete WAV files (plays at correct 24kHz)
MP3 streaming (auto-detects rate)
Any non-streaming WAV playback
Root cause appears to be:
I2S initialized at 44.1kHz in constructor
During WAV streaming, sample rate change doesn't stick
Resulting in 24kHz audio playing at 44.1kHz speed
This suggests the issue is specific to how Audio.h handles sample rates during WAV/PCM streaming, not with regular WAV file playback.
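The size of the speed-up follows directly from the two rates named above. As a rough illustration (the helper names are hypothetical and the figures assume 16-bit mono PCM), the playback speed factor and the resulting duration can be computed like this:

```javascript
// Hypothetical helpers illustrating the arithmetic; not part of any library.

// How much faster audio plays when the output clock runs at outputRate
// but the samples were recorded at sourceRate.
function playbackSpeedFactor(sourceRate, outputRate) {
  return outputRate / sourceRate;
}

// Duration in seconds of a raw PCM buffer when played at the given rate.
function playbackSeconds(byteLength, rate, channels = 1, bitsPerSample = 16) {
  const bytesPerSecond = rate * channels * (bitsPerSample / 8);
  return byteLength / bytesPerSecond;
}

const factor = playbackSpeedFactor(24000, 44100);
console.log(factor); // 1.8375

// One second of 24 kHz, 16-bit mono audio is 48000 bytes.
console.log(playbackSeconds(48000, 24000)); // 1
console.log(playbackSeconds(48000, 44100)); // shorter than 1 second
```

A factor of about 1.84 matches the "plays too fast" symptom, which is consistent with the output still being clocked at 44.1 kHz while the samples are 24 kHz.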