I’m playing around with a Bike extension for LLM call-and-response, inspired by BBEdit’s AI Worksheets. It’s really cool that this is implementable as an extension! I was thinking it’d be really nice to see responses stream in as they arrive, particularly for slower local models – is there a way to receive an HTTP reply chunk-by-chunk as it streams in, rather than only as a complete buffered result?
In terms of browser APIs, I think the Response.body ReadableStream API is the one that’d let us do it.
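// `url` and `body` stand in for the model endpoint and the request payload.
const res = await fetch(url, { method: "POST", body });
if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // `stream: true` keeps multi-byte characters split across chunks intact
  const text = decoder.decode(value, { stream: true });
  // parse SSE `data: …` lines here
}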
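For the parsing step inside that loop, here is a rough sketch of what I have in mind, assuming the server sends standard SSE framing (events separated by a blank line, payload on `data:` lines). `createSSEParser` and `onData` are names I made up for illustration, not part of Bike or any library. The main point is that a chunk can end mid-line, so the parser has to hold on to the unfinished tail until the next chunk arrives.

```javascript
// Hypothetical helper (not part of Bike or any library): buffers decoded
// text and calls onData once per complete SSE event.
function createSSEParser(onData) {
  let buffer = "";    // text received but not yet terminated by a newline
  let dataLines = []; // `data:` payloads of the event being assembled

  return function feed(text) {
    buffer += text;
    const lines = buffer.split(/\r?\n/);
    buffer = lines.pop(); // last piece may be an incomplete line; keep it
    for (const line of lines) {
      if (line === "") {
        // A blank line ends the current event.
        if (dataLines.length > 0) onData(dataLines.join("\n"));
        dataLines = [];
      } else if (line.startsWith("data:")) {
        // Drop the field name and one optional leading space.
        let payload = line.slice(5);
        if (payload.startsWith(" ")) payload = payload.slice(1);
        dataLines.push(payload);
      }
      // Other fields (event:, id:, retry:) and comments are ignored here.
    }
  };
}
```

In the read loop this would be used as `const feed = createSSEParser(handleEvent);` before the loop and `feed(text);` after each `decoder.decode(...)` call, where `handleEvent` is whatever appends the text to the outline. This simplified version does not handle bare `\r` line endings, and many LLM endpoints put JSON in each `data:` payload, so the callback would likely need a `JSON.parse` as well.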