Add Stream API deployment #573
base: main
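@Vinlic Thanks for the stream API solution. In my testing, the server reports this error (see the server-side error below), but it does not stop the program from running. On the client side, using requests, I get this error: requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read)). Have you run into this on your end? |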
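@ALL From GPT-4:

This error indicates an attempt to use a CUDA device that is not available. First, check how many GPU devices are available on your system by running the following command on the command line:

nvidia-smi

This shows the GPU devices on your system along with other related information. Make sure the device ID you selected (in the script, [...]) [...]. If you have only one GPU device, set [...]:

DEVICE_ID = "0"

For the client, [...] for example, you can use [...]:

pip install httpx

Then you can use the following code to receive the events sent by the server:

import asyncio

import httpx

url = "http://127.0.0.1:8010"
data = {
    "input": "你好ChatGLM",
    "max_length": 2048,
    "top_p": 0.7,
    "temperature": 0.95,
    "history": [],
    "html_entities": True,
}


async def main():
    # "async with" is only valid inside a coroutine, so the request lives in main().
    async with httpx.AsyncClient() as client:
        # Stream the POST response and print each line as soon as it arrives.
        async with client.stream("POST", url, json=data) as response:
            async for line in response.aiter_lines():
                print(line)


asyncio.run(main())

This should resolve the problem you encountered on the client side. |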
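I just opened a PR, #808. This one is better than my implementation, but your code does not return history. Could you tidy it up and round it out? From what I can see, SSE works well with your code. |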
Hello, is there an example request? Could you share the request page you used? Thanks! |
@Vinlic's work is great.

API deployment

python stream_api.py

By default the service is deployed on local port 8010 and is called with the POST method:

curl -X POST "http://127.0.0.1:8010" \
-H 'Content-Type: application/json' \
-d '{"input": "你好"}'

The return value is a stream context. |
How would I make this request using requests.post()? |
This script streams the model's response as it is generated, so users do not have to wait for the complete response.
When the interface is accessed, it returns an 'event-stream' stream.
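
As a rough illustration of consuming such an 'event-stream' response with requests.post(), the sketch below groups the streamed lines into events. It is not taken from this PR: parse_sse and stream_chat are hypothetical helper names, and the event names and payload shape emitted by stream_api.py are assumed rather than confirmed. Only the URL and the "input" and "history" fields come from the examples in this conversation.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group decoded text lines into server-sent events.

    Hypothetical helper: yields dicts with 'event' and 'data' keys.
    An event that is not terminated by a blank line is discarded.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive ping
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # strip the single optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_chat(prompt: str, url: str = "http://127.0.0.1:8010") -> Iterator[Dict[str, str]]:
    """POST a prompt and yield events as they arrive (assumed request shape)."""
    import requests  # third-party (pip install requests); imported lazily

    payload = {"input": prompt, "history": []}
    # stream=True keeps requests from buffering the whole body before returning.
    with requests.post(url, json=payload, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        resp.encoding = "utf-8"  # make iter_lines decode to str
        yield from parse_sse(resp.iter_lines(decode_unicode=True))


# Usage, assuming the server from this PR is running locally:
# for ev in stream_chat("你好"):
#     print(ev["event"], ev["data"])
```

Passing stream=True is what lets the client read the response incrementally; without it, requests waits for the whole body before returning.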