[Feature Request]: Python-based Unbounded WebSocket IO Connector #33229
Comments
.take-issue
Label p2 cannot be managed because it does not exist in the repo. Please check your spelling.
Label cannot be managed because it does not exist in the repo. Please check your spelling.
.set-labels P2,python,io,'new feature'
cc: @damondouglas
Downstream transforms, windows, and sinks don't appear to run until I terminate the socket connection and the pipeline is drained. Is there a reason for this?
What would you like to happen?
Apache Beam lacks a native Python-based IO connector that can ingest data directly from a socket. This feature would enable users to easily integrate streaming data sources, such as those emitting messages over TCP/IP sockets, into their Apache Beam pipelines.
Many real-time data sources, such as custom data generators, IoT devices, and legacy systems, often send data over sockets. Building a socket-based IO connector in Python would allow Beam pipelines to process this data seamlessly without requiring users to implement custom socket reading logic outside the Beam ecosystem.
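For illustration, here is a minimal sketch of the kind of custom socket reading logic referred to above, using only the Python standard library. It assumes newline-delimited UTF-8 messages over an already connected TCP socket; the function name `read_lines_from_socket` and its parameters are hypothetical and are not part of Apache Beam.

```python
import socket
from typing import Iterator


def read_lines_from_socket(conn: socket.socket, bufsize: int = 4096,
                           encoding: str = "utf-8") -> Iterator[str]:
    """Yield newline-delimited messages from a connected socket until EOF.

    Hypothetical helper: the newline framing and UTF-8 encoding are assumptions.
    """
    buffer = b""
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:  # an empty read means the peer closed the connection
            break
        buffer += chunk
        # Emit every complete line; keep the trailing partial line buffered.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            yield line.decode(encoding)
    if buffer:  # flush a final message that had no trailing newline
        yield buffer.decode(encoding)


if __name__ == "__main__":
    # Local demonstration with a connected socket pair instead of a real server.
    left, right = socket.socketpair()
    left.sendall(b"a\nb\npartial")
    left.close()
    print(list(read_lines_from_socket(right)))
    right.close()
```

The generator only returns once the peer closes the connection, so on a live stream it runs indefinitely. How best to wrap logic like this in an unbounded Beam source is the open question of this request.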
Primary Question(?):
Any advice on implementing an unbounded source would be appreciated. I have only recently begun to dig into Apache Beam.
Additional Context
Existing IO connectors in Beam are often geared towards standard services like Kafka, Pub/Sub, etc. Adding support for sockets will cater to users dealing with more specialized or ad-hoc data sources.
Current approach to read from socket
Pipeline Example
The current pipeline stalls when combined with a window and aggregation.
Issue Priority
Priority: 3 (nice-to-have improvement)
Issue Components