This repository has been archived by the owner on Oct 30, 2024. It is now read-only.

add: commands to export datasets and to import/query them #38

Merged on Jul 1, 2024 (5 commits)

Conversation

Collaborator

@iwilltry42 iwilltry42 commented Jul 1, 2024

  • knowledge export <dataset> --output foo.zip
  • knowledge retrieve -d <dataset> --archive foo.zip "some question"
  • knowledge list-datasets --archive foo.zip
  • knowledge get-dataset <dataset> --archive foo.zip
  • knowledge import foo.zip

Depends On (currently using fork): philippgille/chromem-go#88

NOTE: there are quite a few rough edges and missing pieces; for example, the server-mode version of this is not yet implemented. We'll follow up on this in the future!

Contributor

@StrongMonkey StrongMonkey left a comment


💯 🚀

So is it expected that, when using an archive, the user shouldn't ingest files, since those ingested files won't make it into the archive anyway?

if _, err := io.Copy(f, rc); err != nil {
	return nil, err
}
_ = f.Close()
Contributor


Looks like f and rc should be closed with defer, so there's no longer a need to close them here?

Collaborator Author


I'm closing them right there to prevent possible resource leaks from defer'ing within the for-loop (pretty unlikely with only two expected files, but still 😬).

@iwilltry42
Collaborator Author

> So is it expected that, when using an archive, the user shouldn't ingest files, since those ingested files won't make it into the archive anyway?

Yeah, if you want to "update" an archive, you would do import -> ingest -> export. That's essentially because the archive doesn't contain information about the embeddings function.
I guess we could somehow serialize it with a very limited parameter set that makes sense for our setup, but I'm not sure it's worth the effort.
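
To make the "very limited parameter set" idea concrete, here is a rough sketch of what such a serialized description could look like. Nothing here exists in this repository or in chromem-go: the `EmbeddingConfig` type, its fields, the JSON keys, and the model name are hypothetical placeholders, and credentials are deliberately left out of the serialized form.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EmbeddingConfig is a hypothetical, minimal description of the embedding
// function that produced the vectors in an archive. API keys are omitted
// on purpose; only what is needed to detect a mismatch is stored.
type EmbeddingConfig struct {
	Provider string `json:"provider"`
	Model    string `json:"model"`
	BaseURL  string `json:"baseUrl,omitempty"`
}

// Compatible reports whether two configs would produce comparable vectors
// (assumption: provider and model are enough to decide this).
func (c EmbeddingConfig) Compatible(other EmbeddingConfig) bool {
	return c.Provider == other.Provider && c.Model == other.Model
}

// encodeConfig serializes the config, e.g. for storing next to the dataset.
func encodeConfig(c EmbeddingConfig) ([]byte, error) {
	return json.Marshal(c)
}

// decodeConfig reads a config back from its serialized form.
func decodeConfig(data []byte) (EmbeddingConfig, error) {
	var c EmbeddingConfig
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// Placeholder values for illustration only.
	stored := EmbeddingConfig{Provider: "openai", Model: "text-embedding-3-small"}
	data, err := encodeConfig(stored)
	if err != nil {
		panic(err)
	}

	loaded, err := decodeConfig(data)
	if err != nil {
		panic(err)
	}
	current := EmbeddingConfig{Provider: "openai", Model: "some-other-model"}
	fmt.Println("can ingest into archive directly:", loaded.Compatible(current))
}
```

With something like this stored in the archive, an ingest against the archive could at least refuse to mix vectors from different models, though the import -> ingest -> export route described above avoids the question entirely.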

@iwilltry42 iwilltry42 merged commit 0a0f13c into main Jul 1, 2024
1 check passed
@iwilltry42 iwilltry42 deleted the feat/export branch July 1, 2024 18:02

2 participants