Can an extension read an HTTP response incrementally?

I’m playing around with a Bike extension for LLM call-and-response, inspired by BBEdit’s AI Worksheets. It’s really cool that this is implementable as an extension! I was thinking it’d be really nice to see responses stream in as they arrive, particularly for slower local models – is there a way to receive an HTTP reply chunk-by-chunk as it streams in, rather than only as a complete buffered result?

In terms of browser APIs, I think the Response.body ReadableStream API is the one that’d let us do it.

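const res = await fetch(url, { method: "POST", body });
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const reader = res.body.getReader(); // reads the body as raw byte chunks
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters that straddle two chunks intact
  const text = decoder.decode(value, { stream: true });
  // parse SSE `data: …` lines here
}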
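
For the parsing step, here is a minimal sketch of what I have in mind, since a chunk can end in the middle of a line. `createSseParser` and `feed` are hypothetical names of my own, not part of any Bike or browser API. It is also simplified: it only handles `\n` and `\r\n` line endings, and it returns each `data:` line as its own payload instead of joining multi-line events as the full SSE format does.

```javascript
// Hypothetical helper (not a Bike or browser API): buffers decoded text
// chunks and returns the payloads of any complete `data:` lines seen so far.
function createSseParser() {
  let buffer = ""; // holds the trailing partial line between calls

  return function feed(text) {
    buffer += text;
    const lines = buffer.split(/\r?\n/);
    // The last element is either an incomplete line or "", so keep it
    // in the buffer until the rest of the line arrives.
    buffer = lines.pop();

    const payloads = [];
    for (const line of lines) {
      if (line.startsWith("data:")) {
        // Drop the field name and the single optional space after the colon.
        payloads.push(line.slice(5).replace(/^ /, ""));
      }
    }
    return payloads;
  };
}
```

Inside the read loop, each decoded `text` would go through `feed(text)`, and the returned payloads could be appended to the outline as they arrive.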

This isn’t possible right now, but I’ll look into adding it for the next release.


Please try the latest release. It’s not well tested, but it now works for me to connect to a local Ollama model and stream back results.


Works great. Thanks Jesse!
