responseBuffer.toString() always returning binary/random characters #1070

Open
ItsAnoch opened this issue Feb 3, 2025 · 2 comments

ItsAnoch commented Feb 3, 2025

Discussed in #1069

Originally posted by ItsAnoch February 2, 2025
I've been trying to intercept the HTML response and use JSDOM to modify some elements before returning the edited HTML. I followed the response interceptor recipe, which worked for websites like example.com. However, it didn’t work for hianime.to or movies2watch.tv.

The response often appeared as binary or a corrupted file, so I had to add extra logic, including setting headers and handling CORS. After implementing these changes, returning the responseBuffer displayed the page correctly, but calling responseBuffer.toString("utf8") still resulted in binary/gibberish. I need the text content to process it through JSDOM.

import express from 'express';
import { createProxyMiddleware, responseInterceptor } from 'http-proxy-middleware';

const app = express();

const proxyMiddleware = createProxyMiddleware({
  target: 'https://localhost:3000',
  changeOrigin: true,
  followRedirects: true,
  selfHandleResponse: true,
  pathRewrite: { '^/api?url=': '' },
  secure: false,
  router(req) {
    const targetUrl = req.query["url"] as string;
    return targetUrl || "https://example.com";
  },
  on: {
    proxyRes: responseInterceptor(async (responseBuffer, proxyRes, req, res) => {
      // Copy over all response headers
      Object.keys(proxyRes.headers).forEach(key => {
        res.setHeader(key, proxyRes.headers[key] as string);
      });
  
      const responseText = responseBuffer.toString("utf8");
      console.log(responseText.slice(0, 200)); // Log: (�/�X<F'w&��x`ma���D�t�ۆ�:���o,�

      return responseBuffer; // This works fine now
    }),
  },

});

// Add CORS headers
app.use((req, res, next) => {
  res.header('Access-Control-Allow-Origin', '*');
  res.header('Access-Control-Allow-Methods', 'GET, POST, OPTIONS');
  res.header('Access-Control-Allow-Headers', 'Origin, X-Requested-With, Content-Type, Accept');
  next();
});

// Mount the proxy under /api
app.use('/api', proxyMiddleware);

app.listen(3000, () => {
  console.log('Enhanced proxy server running on port 3000');
});

Am I doing something wrong, or is this an issue with the library?

PS: Is there also a way to allow the CSS of the website to work as well? Returning the responseBuffer only returns the HTML, but the styling seems to have disappeared.

@charIeszhao

Same here. Some of my API responses contain garbled characters around the actual JSON data, like this:

(�/��Xh��{
    "foo": "bar"
}���ڱ��

@charIeszhao

I think I just found the root cause. The proxy middleware doesn't handle zstd compression.
https://github.com/chimurai/http-proxy-middleware/blob/master/src/handlers/response-interceptor.ts#L75-L87

The code shows it only handles gzip, deflate and br.
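
One way to confirm this diagnosis on your own responses is to look at the first bytes of the buffer. A zstd frame starts with the bytes 28 B5 2F FD. 0x28 is "(" and 0x2F is "/", while 0xB5 and 0xFD are not valid UTF-8 on their own and print as the replacement character, which matches the "(�/�" prefix in both logs above. The helper below is only a sketch; `looksLikeZstd` is a hypothetical name and not part of http-proxy-middleware.

```typescript
// Zstandard frames start with the magic number 0xFD2FB528, stored little-endian,
// so the first four bytes on the wire are 28 B5 2F FD.
const ZSTD_MAGIC = [0x28, 0xb5, 0x2f, 0xfd];

// Hypothetical helper (not part of http-proxy-middleware): returns true when
// the body begins with a zstd frame header. A Node Buffer is a Uint8Array,
// so responseBuffer can be passed in directly.
function looksLikeZstd(body: Uint8Array): boolean {
  if (body.length < ZSTD_MAGIC.length) return false;
  return ZSTD_MAGIC.every((byte, i) => body[i] === byte);
}

// Possible use inside the responseInterceptor callback:
//   if (looksLikeZstd(responseBuffer)) {
//     console.warn('upstream sent zstd:', proxyRes.headers['content-encoding']);
//   }
```

If this returns true, the buffer is still compressed and toString("utf8") cannot produce readable text.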

So the workaround is to set the accept-encoding request header to tell the server NOT to use zstd.

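createProxyMiddleware({
  target: 'your-target-url',
  on: {
    proxyReq: (proxyRequest) => {
      // Advertise only the encodings the response interceptor can decompress,
      // so the upstream server never replies with zstd.
      proxyRequest.setHeader('accept-encoding', 'gzip, deflate, br');
    },
  },
});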
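
If you would rather keep whatever the client originally offered and only drop zstd, something like the following might work. This is a rough sketch, not tested against the library: `withoutZstd` is a hypothetical helper, it does not handle a `*` wildcard, and it falls back to `identity` (no compression) when nothing else is left.

```typescript
// Hypothetical helper: removes the zstd token from an Accept-Encoding value
// and keeps every other encoding (including q-values) the client sent.
function withoutZstd(acceptEncoding: string | string[] | undefined): string {
  const raw = Array.isArray(acceptEncoding)
    ? acceptEncoding.join(',')
    : acceptEncoding ?? '';
  const kept = raw
    .split(',')
    .map((token) => token.trim())
    // Drop empty tokens and "zstd", with or without parameters such as ";q=0.9".
    .filter((token) => token !== '' && !/^zstd\s*(;|$)/i.test(token));
  // With nothing left, ask for an uncompressed response.
  return kept.length > 0 ? kept.join(', ') : 'identity';
}

// Possible use in the proxyReq handler (req is the incoming client request):
//   proxyReq: (proxyReq, req) => {
//     proxyReq.setHeader('accept-encoding', withoutZstd(req.headers['accept-encoding']));
//   }
```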
