Adding term to indicate a stream of data #1044
It might be interesting for the Discovery TF, since a TD / JSON-LD can be streamed: see here.
From today's TD call:
We should try to use subscription subprotocols whenever possible, because we can semantically describe the payload. Geopose may benefit from a subscription model even if it doesn't use the SSE subprotocol. In other cases, like pagination, multipart messages, or event logging, the stream may consist of text lines or arbitrary boundaries, and in these cases a hint is needed to set up the connection with chunked encoding and perhaps keepalive. With respect to this, I wonder whether the hint belongs in the protocol binding.
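As an illustration of the subscription approach mentioned above, a TD event form using the SSE subprotocol might look like the following sketch (the affordance name and href are made up for illustration):

```json
{
  "events": {
    "temperatureChange": {
      "data": { "type": "number" },
      "forms": [{
        "op": "subscribeevent",
        "href": "https://device.example/events/temperature",
        "subprotocol": "sse",
        "contentType": "text/event-stream"
      }]
    }
  }
}
```

Here the streaming nature of the connection is already conveyed by the subprotocol and contentType, so no extra hint would be needed.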
I think we have four scenarios:
Cases 1 and 4 do not need any hint. In case 1 the client does all the work, and applications can choose to either buffer everything or receive the chunks as they arrive. For point 3, I think it might be sufficient to look at the data schema of the response to understand that the data is chunked. On the other hand, we definitely need a hint for point 2, since the consumer would expect fully contained data in one single protocol message, but the server would send it chunked over a single connection. About point 3, I feel that the streaming tag would be nice to have, so that the client can promptly read the data in streaming mode. Notice that sometimes the streaming nature of the data is implicit in the protocol or subprotocol (consider, for example, an HTTP form that specifies HTTP Live Streaming as subprotocol).
Use case: JSON Streaming. See https://github.com/w3c/wot-testing/tree/main/events/2024.11.Munich/Documentation/Intel and the Ollama API, which streams multiple JSON objects in a response. Note, however, that in this particular API the use of streams is optional and indicated by a "stream" flag in the request payload, and termination is also given by a "done" flag in the response objects. However... if we have a "stream" flag in the TD, that would probably still work, since a stream with one element is still a stream, and termination can be indicated by the connection closing. So the stream flag in the request object, if used, just tells the server to stream; the proposed stream flag in the TD just tells the client (TD consumer) that the server MAY send multiple JSON objects before terminating the connection.
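To make the Ollama-style case concrete: the streamed application/json response discussed above would arrive as a sequence of newline-delimited JSON objects over one connection, roughly like this (field names follow the description above; the exact payloads are illustrative):

```json
{"response": "Hel", "done": false}
{"response": "lo!", "done": false}
{"response": "", "done": true}
```

A consumer that only sees the contentType application/json would expect a single document here, which is exactly why a hint is needed.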
I also assume we are only concerned with streaming responses FROM a Thing (acting as a server). The other direction might be interesting (i.e. streaming requests from clients...), but I propose we declare it out of scope for this issue. In this case the "stream" flag would only apply to responses, which is a bit asymmetric, but... We could however use the term "outputStream" for the flag (instead of just "stream") and reserve "inputStream" for possible future use. Also: just on actions? I could also see streaming reads from properties, which would send a new object periodically or when the property changes. Another thought: maybe we could use an "op" for this, like "readstream". But limiting the scope to actions would be ok and probably the right thing to do for simplicity. Although... rather than a "stream" flag on actions, we could also have an op like "invokestream".
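Under the naming floated above, a hypothetical action affordance could carry the flag like this (note that outputStream and invokestream are proposals from this issue, not standardized TD vocabulary; the affordance name and href are made up):

```json
{
  "actions": {
    "generateReport": {
      "input": { "type": "object" },
      "output": { "type": "object" },
      "outputStream": true,
      "forms": [{
        "op": "invokeaction",
        "href": "https://device.example/actions/generateReport"
      }]
    }
  }
}
```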
Other than a video stream, I'm not sure I fully understand the distinction between something that is a stream and something that isn't for the purposes of this new term. E.g. would Server-Sent Events connections (unidirectional) and WebSocket connections (bidirectional) count as streams, or are those cases where the hint is not necessary? Another use case I came across this week is Smart Core OS exposing an API using gRPC over HTTP/2, which notably supports bidirectional streaming. Regarding restricting the scope of streams to action affordances, I would actually have thought they were more common for observed properties and events?
A video stream is just a special kind of […]. The same could be argued for the Ollama API. Once we express relationships across affordances, properties can be used to control its explicit state (playing, paused).
From the F2F discussion I remember, as well as a use case in the WoT CG from @RobWin, it is more relevant for actions. You invoke an action, but the response is not data you get in one response; it is a stream of responses that doesn't have to be like a video stream but can be sent sporadically by the Thing. The current way to model it would be invoking an action and observing it.
I don't have a strong preference, but I think a property with […]
@egekorkan But an output stream with multiple messages is also needed. Like a chatbot which does not only return a single response, but multiple intermediate responses, which explain what it is doing, and then the final response. A single […] Right now we don't have a use case for streaming properties. But let's assume a property is something like a conversation history; then it might make sense to be able to stream it back to a consumer. But this could also be implemented with an action.
So, like in OpenAPI v3, we would need to be able to use multipart messages, but supported by multiple protocol bindings.
@lu-zero wrote:
@relu91 wrote:
For the record, in WebThings Gateway we model a video stream as a property with a […].
@lu-zero wrote:
See the discussion of a potential […].
@RobWin wrote:
This makes me very cautious about scope creep because my understanding is that Thing Descriptions are designed to describe physical devices with physical affordances, not to describe any piece of abstract software. I think we should be very cautious about extending the WoT information model in an attempt to describe affordances of software applications, unless there are also use cases in physical devices. Otherwise the problem space is completely unbounded and we will end up with an unusable mess. I would suggest that describing an AI software agent in general is out of scope for a Thing Description. That said, streaming audio to and from an IoT device is a perfectly valid use case.
If it were just a continuous audio stream in each direction, I think that could be modelled as two properties, one readOnly and one writeOnly. If the request and response were discrete audio files, then it could be modelled as an action input and output. If you want to stream a back-and-forth conversation over a single action affordance, that's definitely trickier. I don't know what protocol and content type you're using, but could there be an initial input stream which ends, then an output stream which stays open (broadcasting silence during pauses) until the full response is complete? Or does the input stream need to stay open as well? If so, could a new input be a new action request on the same action affordance? In general I think action affordances are built on the assumption that there's a discrete input and then a discrete output. If you want a bidirectional stream which is kept open for a back-and-forth conversation, I think that probably needs to be modelled a different way, and if this isn't already supported by an existing protocol, your best bet might actually be to define a new protocol or subprotocol. In some ways it's similar to what we're doing with the Web Thing Protocol WebSocket sub-protocol.
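The two-property modelling suggested above might be sketched as follows (property names, hrefs, and content type are illustrative assumptions, not taken from any spec):

```json
{
  "properties": {
    "audioIn": {
      "writeOnly": true,
      "forms": [{
        "href": "https://device.example/audio/in",
        "contentType": "audio/ogg"
      }]
    },
    "audioOut": {
      "readOnly": true,
      "forms": [{
        "href": "https://device.example/audio/out",
        "contentType": "audio/ogg"
      }]
    }
  }
}
```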
Then it's a device which you can talk to and it answers you. The potential of WoT (Web of Things) extends far beyond simply abstracting devices. At DT, we’ve successfully demonstrated this by abstracting both Weather Stations and Weather Services using WoT. While the end customer might not perceive much difference in the functionality, the impact on the system architecture was profound. By leveraging WoT, the system treats everything—whether a physical device or a cloud-based Web Service — as a WoT Thing. This abstraction eliminated the distinction between hardware and software services. You might not like it, but I opened another discussion: https://github.com/orgs/modelcontextprotocol/discussions/56 |
@RobWin wrote:
I'm afraid we're going to have to agree to disagree on this one. The scope of the Web of Things is already impossibly large, without extending it to not-things. General web services can easily be used in conjunction with the Web of Things without needing to model every service as a Thing. "When you have a hammer, every problem looks like a nail." |
Yes, unfortunately we have to disagree on this. Web Services are fragmented: different protocols, different communication patterns. WoT was perfectly suitable for us to harmonize access to Web Services as well, based on the properties/actions/events abstraction and forms. Much better than anything OpenAPI or AsyncAPI did. As said, we have experience integrating hundreds of device vendors, device types, and different web services, and I can tell you that WoT helped us a lot. It is sad to tell people they are doing it wrong when it works perfectly. And I'm really happy that @mmccool is looking into a similar direction with his Thing Description examples of AI services. The Web of Agents or Internet of Agents will come, and I would prefer it to be based on WoT rather than the API hell from OpenAI right now. But I don't want to distract from this valuable discussion about streaming. I gave my input because @egekorkan asked.
@RobWin wrote:
I'm not saying it doesn't work. There are plenty of examples of add-ons for WebThings Gateway which model not-things as Things, and I get why that's attractive when the tools are already available. I'm saying that it would be bad for the Web of Things to start adding features to specifications solely for use cases that fall outside of the scope of the Internet of Things. "The Web of Things (WoT) seeks to counter the fragmentation of the IoT", not the whole Internet. Scope creep is dangerous for any standard. But again, I'm also not saying that bi-directional streaming of audio should be out of scope either, just that we should frame it in terms of concrete IoT use cases. So circling back to the topic at hand... It seems like the […]
Yup, exactly, with the only difference that I would not bother adding […]
That's my understanding; see also #1044 (comment)
@relu91 wrote:
I agree; that may be leftover from some time in the distant past when the […]
@benfrancis wrote:
Does that property provide an unbounded resource?
There are many ways to model a playback system. You may use actions for the control, since the actual change of state could be observed via a connected monitor (the playback process as a whole is a hidden state). Or you may actually use the same control while the system is broadcasting, and consumers may subscribe to the specific stream.
+1, and I would add on top that actions are good for modelling something that has a hidden state that you cannot touch directly and that may evolve over time.
@RobWin wrote:
What you are doing is sending a discontinuous amount of information, so you can feed it in using either a very odd write-only property or an action of kind […]. That it is a "stream" is just a protocol detail, IMHO. The data schema would always be an Array of something.
Problem
The new Scripting API allows processing data that is sent as a stream, instead of as a bulk payload that can be parsed instantly. The stream is sometimes obvious from the protocol or contentType (e.g. a video stream), but it may not be obvious, since HTTP also allows streaming and it is possible to send an application/json payload as a stream.

Proposal
Thus, the consumer needs to know whether values will arrive as a stream or not. Adding a term like stream that can have a boolean value would solve the problem and remove the need for out-of-band information. It would be better to find a widely used term to describe such cases.

Analogy
JSON can be sent in different encoding formats, and this is currently indicated by the contentCoding term in the TD spec. Not having this term would result in either saying that […]

Note: This use case is not limited to the Scripting API.
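For reference, the contentCoding analogy above refers to an existing TD form field; a form using it might look like this (the href is illustrative):

```json
{
  "forms": [{
    "href": "https://device.example/properties/log",
    "contentType": "application/json",
    "contentCoding": "gzip"
  }]
}
```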
Note 2: Should this be opened in the use cases repo first, @mlagally?
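A minimal sketch of the proposal, assuming a boolean stream term at the form level (this term is hypothetical, as are the affordance name and href):

```json
{
  "actions": {
    "chat": {
      "input": { "type": "string" },
      "output": { "type": "object" },
      "forms": [{
        "op": "invokeaction",
        "href": "https://device.example/actions/chat",
        "contentType": "application/json",
        "stream": true
      }]
    }
  }
}
```

A consumer seeing stream: true would know to read the response incrementally rather than waiting for a single complete payload.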