Support blob streams #99

Open
adamstoffel opened this issue Dec 15, 2021 · 10 comments


@adamstoffel

The Blob Trigger binding for NodeJS functions always binds data as a Buffer even if 'dataType' is set to 'stream' or 'string'.

Repro steps

  1. Create a node-based function using the blob trigger
  2. Set "dataType" to "stream" in function.json:
{
  "bindings": [
    {
      "name": "fileBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "container/{fileName}.csv",
      "connection": "AzureWebJobsStorage",
      "dataType": "stream"
    }
  ],
  "scriptFile": "../dist/myFunction/index.js"
}

Expected behavior

fileBlob input parameter should be some kind of Stream type

Actual behavior

fileBlob input parameter is a Buffer
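
The mismatch can be observed with a minimal handler. This is only a sketch, assuming the v3-style programming model in which the runtime passes the context first and the trigger input second; `describeInput` and `blobTrigger` are hypothetical names introduced for illustration and are not part of any Azure library.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: reports what kind of value the binding delivered.
function describeInput(value) {
  if (Buffer.isBuffer(value)) return 'Buffer';
  if (value instanceof Readable) return 'Readable';
  return typeof value;
}

// Handler sketch: logs the runtime type of the blob trigger input.
async function blobTrigger(context, fileBlob) {
  context.log(`fileBlob is a ${describeInput(fileBlob)}`);
}

module.exports = blobTrigger;
```

With the behavior reported in this issue, the handler would be expected to log `fileBlob is a Buffer` regardless of the `dataType` setting.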

Known workarounds

None

Related information

Functions runtime ~3
Node ~14

@v-bbalaiagar

Hi @adamstoffel, Thank you for your feedback! We will check for the possibilities internally and update you with the findings.

@v-bbalaiagar

Transferring this issue to NodeJS-worker repo for further insights.

@v-bbalaiagar v-bbalaiagar transferred this issue from Azure/azure-webjobs-sdk Jan 21, 2022
@ejizba
Contributor

ejizba commented Jan 21, 2022

The Node.js worker doesn't support streaming data. Here is an issue on the host repo with some of the background: Azure/azure-functions-host#1361

We might be able to make progress on this, though. @gohar94 implemented a 'shared memory' feature that is meant to help with memory usage in general, although I don't know if it necessarily passes a stream to the user's code. It's only implemented for the Python worker so far. @gohar94 does the Python implementation pass along a stream to the user? Could or should we do that when implementing this feature for Node.js? Edit: Never mind, this isn't related.

@gohar94

gohar94 commented Jan 26, 2022

Hi @ejizba - the shared memory feature on Python passes the whole set of bytes to the user function as of today. However, I do think that some stream-like abstraction can be given to the user (like io.BytesIO over the mmap). It will likely still all be loaded in memory but the user could operate on those bytes as a stream, if that's what you're asking.
Happy to take a look if you decide to do this for Node.js.
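
In Node.js terms, the stream-like abstraction described above could look roughly like the sketch below: a `Readable` that yields views over a `Buffer` that is already fully in memory. `bufferToStream` and its `chunkSize` parameter are hypothetical names used only for illustration. As the comment notes, this changes the interface but not the memory footprint.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: exposes an in-memory Buffer through a stream interface.
// The whole payload is still resident in memory; only the consumption style changes.
function bufferToStream(buffer, chunkSize = 64 * 1024) {
  function* chunks() {
    for (let offset = 0; offset < buffer.length; offset += chunkSize) {
      // subarray returns a view over the same memory, so nothing is copied here.
      yield buffer.subarray(offset, offset + chunkSize);
    }
  }
  // Opt out of object mode so the result behaves like an ordinary byte stream.
  return Readable.from(chunks(), { objectMode: false });
}
```

A consumer could then iterate with `for await (const chunk of bufferToStream(fileBlob)) { ... }` or pipe the result into a parser that expects a stream.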

@ejizba
Contributor

ejizba commented Jan 26, 2022

Yeah I'm less interested in doing a stream if it's just an abstraction, not an actual stream with all the benefits.

How likely is the shared memory feature to help with 'out of memory' issues? Or is it realistically just for performance?

@gohar94

gohar94 commented Jan 26, 2022

> Yeah I'm less interested in doing a stream if it's just an abstraction, not an actual stream with all the benefits.
>
> How likely is the shared memory feature to help with 'out of memory' issues? Or is it realistically just for performance?

I believe it does reduce one extra copy (which gRPC creates when transferring between the host and the worker).
It does seem possible to do full streaming from the host to the worker using shared memory, but it would involve some more work (like communicating which pieces of data have become available and which ones have been consumed). Would be an interesting project, though.
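
The bookkeeping mentioned above (which bytes have become available, which have been consumed) can be illustrated with a small ring buffer over shared memory. This is a hypothetical sketch and not the host's or any worker's actual design: `SharedRing` and its layout are assumptions, both sides run in one thread here, and counter wrap-around past 2^32 bytes is ignored.

```javascript
// Hypothetical sketch: a ring buffer with two shared counters.
class SharedRing {
  constructor(capacity) {
    this.capacity = capacity;
    this.data = new Uint8Array(new SharedArrayBuffer(capacity));
    // counters[0] = total bytes written so far, counters[1] = total bytes consumed so far.
    this.counters = new Uint32Array(new SharedArrayBuffer(8));
  }

  // Producer side: copies as many bytes as currently fit and returns that count.
  write(bytes) {
    const written = Atomics.load(this.counters, 0);
    const consumed = Atomics.load(this.counters, 1);
    const free = this.capacity - (written - consumed);
    const n = Math.min(free, bytes.length);
    for (let i = 0; i < n; i++) {
      this.data[(written + i) % this.capacity] = bytes[i];
    }
    Atomics.store(this.counters, 0, written + n); // publish the newly available bytes
    return n;
  }

  // Consumer side: reads up to maxBytes of unread data, then releases that space.
  read(maxBytes) {
    const written = Atomics.load(this.counters, 0);
    const consumed = Atomics.load(this.counters, 1);
    const n = Math.min(written - consumed, maxBytes);
    const out = Buffer.alloc(n);
    for (let i = 0; i < n; i++) {
      out[i] = this.data[(consumed + i) % this.capacity];
    }
    Atomics.store(this.counters, 1, consumed + n); // mark the bytes as consumed
    return out;
  }
}
```

When `write` returns fewer bytes than requested, the producer has to wait until the consumer advances its counter, which is the backpressure a real streaming protocol between host and worker would also need.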

@ejizba ejizba changed the title In NodeJS, Blob trigger always binds as binary buffer regardless of 'dataType' setting Support stream dataType Jan 28, 2022
@ejizba ejizba added this to the Tracking milestone Jan 28, 2022
@MarioArriaga92

Is there an ETA on this?

@ejizba
Contributor

ejizba commented Jun 30, 2022

@MarioArriaga92 No, there is no ETA for this. We want to do it, but it will not be simple, and so far other work has been higher priority. You can see our roadmap, with the work we currently have prioritized, here:
https://github.com/Azure/azure-functions-nodejs-worker/wiki/Roadmap

@sig9

sig9 commented Aug 4, 2022

Having to consider other hosting options due to this limitation.
My use case is to implement a server API in Node that calls an OpenAI API function, which returns a Server-Sent Events stream. I would like to relay those responses to the browser. I have a minimal git repo that demonstrates the issue. No deployment is needed to demonstrate this; the azure-functions-core-tools package behaves the same way as a deployed app.
https://github.com/sig9/nuxt-app-sse/
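
For context, the kind of relay described in this comment can be sketched for a plain Node.js HTTP server, outside Azure Functions. This is a minimal illustration under stated assumptions: `relaySse` is a hypothetical helper, `upstream` is assumed to be a readable stream of already-formatted Server-Sent Events bytes, and `res` is assumed to behave like an `http.ServerResponse`.

```javascript
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: forwards an upstream SSE byte stream to the client
// chunk by chunk instead of buffering the whole response first.
async function relaySse(upstream, res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
  });
  // pipeline handles backpressure and ends the response when upstream ends.
  await pipeline(upstream, res);
}
```

The limitation reported here is that the function's response is not exposed as a writable stream, so a relay of this shape cannot be expressed directly.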

@ejizba ejizba changed the title Support stream dataType Support streams Sep 2, 2022
@ejizba ejizba changed the title Support streams Support streams (azure resources) May 31, 2023
@ejizba ejizba transferred this issue from Azure/azure-functions-nodejs-worker May 31, 2023
@ejizba ejizba modified the milestones: Tracking, Backlog Candidates May 31, 2023
@ejizba
Contributor

ejizba commented May 31, 2023

I'm cleaning up the stream issues and will use this to track support for streaming Azure resources. This was finally unblocked for us in the host: Azure/azure-functions-dotnet-worker#1081. The .NET Isolated worker was the first one to use it, but we should be able to benefit in Node.js as well.

Unfortunately I don't have any ETAs. For now we're focused on GA-ing the v4 programming model, but we will keep the roadmap updated with any stream plans: https://github.com/Azure/azure-functions-nodejs-library/wiki/Roadmap

Streaming http will be done separately, tracked by #97

@castrodd castrodd removed their assignment Aug 11, 2024
@ejizba ejizba changed the title Support streams (azure resources) Support blob streams Sep 30, 2024

8 participants