Feat S3 Transfer Manager v2 GetObject/DownloadObject #2996
base: main
Conversation
* recommit transfer manager v2 files
* change pool to store slice pointer
* add integ test for putobject
* update go mod
* minor changes for v0.1.0
* update tags
* update tags
* update integ test dependency version
* change err var name
* update go mod
* change input/output type comment
* minor change

Co-authored-by: Tianyi Wang <[email protected]>

rebase branch from main
move transfer manager v2 integration tests to within module
move putobject integ test to transfer manager module
add getobject integ test
force-pushed from 4021ed1 to dd3bcc7
var output *GetObjectOutput
if g.options.MultipartDownloadType == types.MultipartDownloadTypePart {
	if g.in.Range != "" {
getObjectParts / getObjectRanges? You've got two pretty big blocks here.
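For illustration, a minimal sketch of how the dispatch might read once the two blocks are pulled into the suggested helpers — the receiver, field names, and helper signatures are assumptions based on the visible snippet, not the PR's actual code:

```go
// Sketch only: the two large blocks factored into the helpers suggested
// above. Field names and signatures are assumed from the visible snippet.
func (g *getter) get(ctx context.Context) (*GetObjectOutput, error) {
	var output *GetObjectOutput
	var err error
	if g.options.MultipartDownloadType == types.MultipartDownloadTypePart && g.in.Range == "" {
		// Download by part number.
		output, err = g.getObjectParts(ctx)
	} else {
		// Download by byte range (also used when the caller supplied a Range).
		output, err = g.getObjectRanges(ctx)
	}
	if err != nil {
		return nil, err
	}
	return output, nil
}
```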
@@ -16,7 +19,7 @@ type Options struct {
 	MultipartUploadThreshold int64

 	// Option to disable checksum validation for download
-	DisableChecksum bool
+	DisableChecksumValidation bool
This is really more complicated than just a bool now with the recent updates to flex checksums. We should probably promote this into the new RequestChecksumCalculation enum we have.
This client field was listed in SEP as boolean, but we can change that
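For illustration, one way that promotion could look. This is a hypothetical sketch that mirrors the WhenSupported/WhenRequired shape of the SDK's flex-checksum enums (such as the RequestChecksumCalculation enum mentioned above); every name below is made up for the example, not the actual field:

```go
package transfermanager

// Hypothetical alternative to a plain bool, modeled on the
// WhenSupported/WhenRequired pattern of the SDK's flex-checksum
// configuration. Names here are illustrative only.
type ChecksumValidation int

const (
	// Zero value: defer to the SDK/client default behavior.
	ChecksumValidationDefault ChecksumValidation = iota
	// Validate response checksums whenever the service provides one.
	ChecksumValidationWhenSupported
	// Only validate when the operation requires it.
	ChecksumValidationWhenRequired
)

type Options struct {
	MultipartUploadThreshold int64

	// Replaces the DisableChecksumValidation bool discussed above.
	ChecksumValidation ChecksumValidation
}
```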
	PartBodyMaxRetries int

	// Logger to send logging message to
	Logger logging.Logger
Can we hold off on adding this for the moment? We might want to evolve our logging API along the otel path in the future.
	chunk.cur = 0

	g.options.Logger.Logf(logging.Debug,
I don't think these logs are necessary.
	}

	// Queue the next range of bytes to read.
	ch <- getChunk{w: g.w, start: g.pos - g.offset, withRange: g.byteRange()}
So as far as I can tell, you're now doing concurrent downloads but you're buffering the entire object in memory before returning it to the caller - which is all well and good until someone goes to try and pull a multi-gigabyte object.
We're basically going to have to return a reader that "composes" the response bodies from inner getPart requests in order. That will allow you to stream the response back to the caller without that extra memory pressure. You might have a "scratch" buffer for that internal composition but naturally it won't be the full object size.
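One possible shape for that composing reader, purely illustrative: it assumes each concurrent getPart request delivers its body on a channel in ascending part order, which is not necessarily how the PR implements it.

```go
package transfermanager

import "io"

// Illustrative sketch (not the PR's implementation): a reader that composes
// part bodies in order so the caller can stream the object without the whole
// body being buffered in memory. The channel-of-parts design is an assumption.
type partStream struct {
	parts <-chan io.ReadCloser // bodies of completed parts, sent in order
	cur   io.ReadCloser        // part currently being drained
}

func (s *partStream) Read(p []byte) (int, error) {
	for {
		if s.cur == nil {
			next, ok := <-s.parts
			if !ok {
				return 0, io.EOF // all parts consumed
			}
			s.cur = next
		}
		n, err := s.cur.Read(p)
		if err == io.EOF {
			// Finished this part; release it and continue with the next.
			s.cur.Close()
			s.cur = nil
			if n > 0 {
				return n, nil
			}
			continue
		}
		return n, err
	}
}
```

Memory pressure is then bounded by how far ahead the workers may run (for example a buffered channel or a small pool of scratch buffers), not by the full object size.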
Implement the v2 S3 transfer manager's GetObject/DownloadObject APIs, bound to a single, unified client that mimics a normal service client's initialization and API calls. Users can now download an object sequentially with GetObject or asynchronously with DownloadObject.
Test: passed unit tests for GetObject and DownloadObject; passed an integration test that uploads an object via the S3 client and validates its content after downloading it through the v2 transfer manager.
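A hedged sketch of what the described usage might look like. The import path, constructor, input field names, and Body field are all assumptions based on the description above and on how existing SDK service clients are initialized; the finalized API may differ:

```go
package main

import (
	"context"
	"io"
	"log"
	"os"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"

	// Hypothetical import path for the v2 transfer manager described in this PR.
	"github.com/aws/aws-sdk-go-v2/feature/s3/transfermanager"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.Background())
	if err != nil {
		log.Fatal(err)
	}

	// Assumed constructor mirroring normal service client initialization.
	tm := transfermanager.New(s3.NewFromConfig(cfg))

	// Sequential download with GetObject (input field names are assumptions).
	out, err := tm.GetObject(context.Background(), &transfermanager.GetObjectInput{
		Bucket: "amzn-example-bucket",
		Key:    "example-key",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer out.Body.Close()

	// Stream the object body to a local file.
	f, err := os.Create("example-key")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := io.Copy(f, out.Body); err != nil {
		log.Fatal(err)
	}
}
```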