rpc: free buffer after client disconnect #7378
@@ -934,7 +938,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
         }
         switch (cmd) {
             case ALLOC_BUFFER: {
-                rpc_alloc_buffer(backend, input, output);
+                allocated_buffers.push_back(rpc_alloc_buffer(backend, input, output));
Add the allocated buffer to the list.
@@ -950,7 +954,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
                 break;
             }
             case FREE_BUFFER: {
-                rpc_free_buffer(input);
+                allocated_buffers.remove(rpc_free_buffer(input));
Remove the freed buffer from the list.
+    for (auto buff: allocated_buffers) {
+        ggml_backend_buffer_free(buff);
+    }
Free the remaining buffers.
I think a better approach would be to track allocated buffers with |
emm, just forgot the also, for the but agree that |
Closing this PR to wait for @rgerganov's fix. Related discussion: #7407
In PR #6829, @rgerganov added support for the RPC backend. After using it for several days, I have noticed an issue:
Upon investigating the source code, I discovered that instead of releasing the memory, we simply exit the inner loop and immediately wait for a new connection (ggml-rpc.cpp#L1027).
So I created this PR, which monitors the ALLOC_BUFFER and FREE_BUFFER commands, maintains a list of allocated buffers, and frees the remaining buffers after the client disconnects.
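
The bookkeeping described above can be sketched in isolation as follows. This is a minimal illustration rather than the actual ggml-rpc.cpp code: fake_buffer, fake_alloc, fake_free, client_session and simulate_client are hypothetical stand-ins introduced only for this example, and the choice of std::list is an assumption based on the push_back/remove calls in the diff.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for a backend buffer handle; NOT the real ggml type.
struct fake_buffer {
    size_t size;
};

// Number of buffers that have been allocated but not yet freed.
static int g_live_buffers = 0;

// Hypothetical allocator, standing in for the real backend allocation call.
static fake_buffer * fake_alloc(size_t size) {
    ++g_live_buffers;
    return new fake_buffer{size};
}

// Hypothetical deallocator, standing in for the real backend free call.
static void fake_free(fake_buffer * buf) {
    --g_live_buffers;
    delete buf;
}

// Per-connection bookkeeping: remembers every buffer the client still owns.
struct client_session {
    std::list<fake_buffer *> allocated_buffers;

    // Handles an ALLOC_BUFFER-style command: allocate and remember the buffer.
    fake_buffer * on_alloc(size_t size) {
        fake_buffer * buf = fake_alloc(size);
        allocated_buffers.push_back(buf);
        return buf;
    }

    // Handles a FREE_BUFFER-style command: forget the buffer, then free it.
    void on_free(fake_buffer * buf) {
        allocated_buffers.remove(buf);
        fake_free(buf);
    }

    // Called when the client disconnects: free whatever it left behind.
    void on_disconnect() {
        for (fake_buffer * buf : allocated_buffers) {
            fake_free(buf);
        }
        allocated_buffers.clear();
    }
};

// Simulates one client that frees only one of its two buffers before
// disconnecting, and returns the number of buffers still alive afterwards.
int simulate_client() {
    client_session session;
    fake_buffer * a = session.on_alloc(16);
    session.on_alloc(32);      // never freed explicitly by the client
    session.on_free(a);
    session.on_disconnect();   // cleans up the leftover buffer
    return g_live_buffers;
}
```

A list keeps the sketch close to the diff, but removal is linear in the number of tracked buffers, so a set-like container may be a better fit when a client holds many buffers.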