Make streaming APIs easy with enumerable methods
When you first discover Ruby on Rails, some things might strike you right away: namely the large number of enumerable methods and the blocks to run code in the middle of another method. Those features helped me translate my thoughts directly into code — and quickly made Ruby my favorite programming language.
Enumerables are great. You can use them to map, filter, reduce, and easily transform your data. Any method that returns an enumerable collection gives you that power without needing much work.
A method that yields values to a block can do more powerful things, but makes transformation harder. Any processing you want to do goes into the block, which doesn't feel as natural as chaining enumerable methods together. You have to think inside out (or even thread a block through multiple methods) before you get to the code that uses it.
An LLM example
Many large language model APIs, like the OpenAI chat API, can stream text back to you as it's generated. This is nice! You don't have to wait multiple seconds (or longer!) for a full response — you get feedback immediately. In most Ruby clients to these APIs, bits of text are yielded back to you one by one using a block.
To share a simple example, maybe you want to wrap each token in some metadata about the request that the front end can use:
llm_response do |text|
  response = { request_id: params[:llm_request_id], text: text }
  send_response_to_frontend(response)
end
But why does wrapping the response have to be done in the same place as sending the response? If llm_response just returned a list, you could have some code that did the wrapping with map and some other code that sent the response bit by bit. Why does this have to be different?
Ruby is flexible. If you want to treat these values as a list, you should be able to do that! And Ruby has a few extremely useful methods that convert a block-type method into a list-type method.
The first is enum_for. If I have a method that takes a block and yields values as they arrive:
def llm_response(prompt)
  api.call_model(prompt) do |message|
    yield message # Send the message back to the caller
  end
end
The caller can use Object#enum_for to translate it into something resembling a list instead:
def llm_response(prompt)
  api.enum_for(:call_model, prompt)
end
From the documentation:
"Creates a new Enumerator which will enumerate by calling method on obj, passing args if any. What was yielded by method becomes values of enumerator."
In this example, it creates an enumerator. That enumerator will get each value by calling api.call_model(prompt). Every value yielded by call_model becomes one of the enumerator's values. If the model yields "He", "l", and "lo" in turn, those will be the enumerator's values, the values each sees, and so on.
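To see this without calling a real model, here is a minimal sketch. FakeApi and its canned chunks are hypothetical stand-ins for a streaming client; only enum_for and the enumerable methods are real Ruby.

```ruby
# FakeApi is a hypothetical stand-in for a streaming client.
# It yields three canned chunks instead of calling a real model.
class FakeApi
  def call_model(_prompt)
    yield "He"
    yield "l"
    yield "lo"
  end
end

# Same shape as the article's method; the api: keyword is an
# assumption added so the sketch is self-contained.
def llm_response(prompt, api: FakeApi.new)
  api.enum_for(:call_model, prompt)
end

response = llm_response("Say hello")

response.to_a           # => ["He", "l", "lo"]
response.map(&:upcase)  # => ["HE", "L", "LO"]
```

Each of those calls runs call_model again from the start, so with a real API you would usually consume the enumerator only once.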
You can now pass it to anything that expects an enumerator, and it will work just like a list of values:
response = llm_response(prompt)
# later...
process_response(response)
# or
response.map { ... }
# etc...
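As a rough illustration of "works just like a list", the sketch below passes both an Array and an Enumerator to the same consumer. ChunkSource and this version of process_response are hypothetical; the article does not say what process_response actually does.

```ruby
# Hypothetical consumer: it relies only on Enumerable methods,
# so it cannot tell an Array from an Enumerator.
def process_response(chunks)
  chunks.map(&:strip).reject(&:empty?).join(" ")
end

# Hypothetical block-style source, standing in for a streaming API.
class ChunkSource
  def each_chunk
    yield "Hello "
    yield "  "
    yield "world"
  end
end

from_array      = process_response(["Hello ", "  ", "world"])
from_enumerator = process_response(ChunkSource.new.enum_for(:each_chunk))
# Both calls return the same string.
```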
Why not make it automatic?
There's a trick you can use to make things even easier for your callers. You might see it if you poke around the Ruby standard library.
If you have a method that takes a block but could act like a list, you could add this as the first line:
return to_enum(__method__, <first arg>, <second arg>, <etc...>) unless block_given?
This will still allow callers to pass a block if they want. If they don't pass a block, it will return an enumerator (like in the second example above). This makes the difference between a block-type method and a list-type method almost nonexistent.
How does this work? to_enum is just another name for enum_for. And __method__ returns a symbol that contains the current method's name:
def llm_response(prompt)
  return to_enum(__method__, prompt) unless block_given?
  api.call_model(prompt) do |message|
    yield message # Send the message back to the caller
  end
end

response = llm_response(...) { ... }
For this example, if you don't pass a block, it's as if you wrote this:
response = to_enum(:llm_response, prompt)
And any time you ask response for a new value, it will return the next value yielded by llm_response. response now acts like a list whose values are each item llm_response would have yielded to a block.
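Here is a small sketch of both calling styles against the same method. StreamingClient is a hypothetical class, and splitting the prompt into words stands in for chunks arriving from a real API.

```ruby
# StreamingClient is a hypothetical class; splitting the prompt
# into words stands in for chunks arriving from a real API.
class StreamingClient
  def llm_response(prompt)
    return to_enum(__method__, prompt) unless block_given?

    prompt.split.each { |word| yield word }
  end
end

client = StreamingClient.new

# Block style: values are handed to the block as they are produced.
seen = []
client.llm_response("streams are lists") { |word| seen << word }

# List style: no block, so the same method returns an Enumerator.
response = client.llm_response("streams are lists")
first  = response.next  # pulls one value out of llm_response
second = response.next  # resumes where the previous call stopped
```

Calling next is the most literal form of "asking for a new value". Methods like map and each pull values the same way, just without you writing the loop.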
Using lazy to avoid the wait
Once you start to work with those list elements, you might find a problem. If you call something like map on one of these:
response.map { |m| process_message(m) }
You still have to wait until the entire response is done before you can do anything else, because map needs every element from the response before it can return. If you have to wait for the full response, there's no benefit in having a streaming API! So we've hit a dead end and have to go back to using a block, right? Not quite.
Ruby has a method, Enumerable#lazy, that you can call on your enumerator. With lazy, values are processed one by one as they are needed instead of all at once. You can continue to chain transformations onto the response as if it were an array, and values are pulled off it only when you actually use them.
response.lazy.map { |m| process_message(m) }.each { |m| send_message(m) }
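The difference in ordering is easier to see with a log. This sketch builds a hypothetical three-chunk source with Enumerator.new (a stand-in for a streaming API, not something from the original example) and records when each chunk is received and when it is sent.

```ruby
# Returns the order of events for an eager or a lazy pipeline.
# The three-chunk source is a hypothetical stand-in for a streaming API.
def event_order(lazy:)
  log = []
  source = Enumerator.new do |yielder|
    %w[He l lo].each do |chunk|
      log << "received #{chunk}"
      yielder << chunk
    end
  end

  pipeline = lazy ? source.lazy : source
  pipeline.map { |chunk| chunk.upcase }.each { |chunk| log << "sent #{chunk}" }
  log
end

event_order(lazy: false)
# => ["received He", "received l", "received lo", "sent HE", "sent L", "sent LO"]

event_order(lazy: true)
# => ["received He", "sent HE", "received l", "sent L", "received lo", "sent LO"]
```

In the eager version nothing is sent until the last chunk has arrived. In the lazy version each chunk goes out as soon as it comes in, which is the behavior you want from a streaming API.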
Finally, you can rewrite the original example so it doesn't matter whether llm_response returns a list or yields to a block. The same code works:
response = llm_response(prompt)
# later...
response = response.lazy.map { |text| { request_id: params[:llm_request_id], text: text } }
# even later...
response.each { |chunk| send_response_to_frontend(chunk) }
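To make that claim concrete, here is a sketch that runs the same wrapping and sending code against a streamed source and a plain array. Every name in it is a hypothetical stand-in: FakeLlm replaces the API client, a fixed request ID replaces params[:llm_request_id], and an array named sent replaces the front end.

```ruby
# All names here are hypothetical stand-ins for the example above.
class FakeLlm
  def llm_response(prompt)
    return to_enum(__method__, prompt) unless block_given?

    prompt.split.each { |word| yield word }
  end
end

# Wrapping knows nothing about how chunks are produced or delivered.
def wrap_chunks(chunks, request_id)
  chunks.lazy.map { |text| { request_id: request_id, text: text } }
end

# Sending knows nothing about wrapping.
def send_chunks(chunks, sent)
  chunks.each { |chunk| sent << chunk }
  sent
end

streamed = send_chunks(wrap_chunks(FakeLlm.new.llm_response("Hi there"), 42), [])
buffered = send_chunks(wrap_chunks(["Hi", "there"], 42), [])
# streamed and buffered hold the same two wrapped chunks.
```

Because lazy is defined on Enumerable, the array version needs no special handling.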
Give enumerable methods a try
Enumerable methods provide such a powerful way to process and transform your data in Ruby — but you can't always get an enumerable when you want one.
If you find yourself passing blocks through several layers of methods, having to think inside out, or wanting consistent code (whether your data is streamed over time or returned all at once), try enum_for or to_enum. It could make your APIs cleaner and your programming life easier.