Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat S3 Transfer Manager v2 GetObject/DownloadObject #2996

Open
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

wty-Bryant
Copy link
Contributor

Implement v2 s3 transfer manager's GetObject/DownloadObject api bound to single union client which mimics normal service client's initialization and api call. User could now download object sequentially with GetObject or asynchronously with DownloadObject.

Test: passed unit test for GetObject and DownloadObject, passed integration test of uploading object via s3 client and validating its content after downloading through v2 transfer manager.

@wty-Bryant wty-Bryant requested a review from a team as a code owner January 31, 2025 21:58
wty-Bryant and others added 25 commits February 13, 2025 17:10
* recommit transfer manager v2 files

* change pool to store slice pointer

* add integ test for putobject

* update go mod

* minor changes for v0.1.0

* update tags

* update tags

* update integ test dependency version

* change err var name

* update go mod

* change input/output type comment

* minor change

---------

Co-authored-by: Tianyi Wang <[email protected]>

rebase from main

rebase branch from main
move transfer manager v2 integration tests to within module

move putobject integ test to transfer manager module
add getobject integ test

var output *GetObjectOutput
if g.options.MultipartDownloadType == types.MultipartDownloadTypePart {
if g.in.Range != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getObjectParts / getObjectRanges? You've got two pretty big blocks here.

@@ -16,7 +19,7 @@ type Options struct {
MultipartUploadThreshold int64

// Option to disable checksum validation for download
DisableChecksum bool
DisableChecksumValidation bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really more complicated than just a bool now with the recent updates to flex checksums. We should probably promote this into the new RequestChecksumCalculation enum we have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This client field was listed in SEP as boolean, but we can change that

PartBodyMaxRetries int

// Logger to send logging message to
Logger logging.Logger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we hold off on adding this for the moment? We might want to evolve our logging API along the otel path in the future.


chunk.cur = 0

g.options.Logger.Logf(logging.Debug,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these logs are necessary.

}

// Queue the next range of bytes to read.
ch <- getChunk{w: g.w, start: g.pos - g.offset, withRange: g.byteRange()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as far as I can tell, you're now doing concurrent downloads but you're buffering the entire object in memory before returning it to the caller - which is all well and good until someone goes to try and pull a multi-gigabyte object.

We're basically going to have to return a reader that "composes" the response bodies from inner getPart requests in order. That will allow you to stream the response back to the caller without that extra memory pressure. You might have a "scratch" buffer for that internal composition but naturally it won't be the full object size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants