Working with AI data stored in S3-compatible object storage
We recommend preparing your own S3-compatible object storage bucket with some test data and trying the steps in this section against it. However, you can also use the example S3 bucket exactly as shown here, even with your own access key and secret key credentials, because the bucket has been configured for public access.
In addition, this example uses image data and a corresponding image encoder model instead of text data. You could also use plain text data on object storage, similar to the examples in the previous section.
First, let's create a retriever that uses images stored on S3-compatible object storage as the source. We specify torsten as the bucket name, along with the endpoint URL where the bucket was created. We specify an empty string as the prefix because we want all the objects in that bucket. We use the clip-vit-base-patch32 open encoder model for image data from HuggingFace. We also provide a name for the retriever so that we can identify and reference it in subsequent operations.
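To illustrate why an empty prefix selects every object, here is a minimal Python sketch of S3-style prefix matching. It is not part of the retriever API: the `select_keys` helper and the object keys are hypothetical and exist only for illustration.

```python
def select_keys(keys, prefix=""):
    """Return the object keys that start with the given prefix."""
    # Every string starts with the empty string, so prefix="" keeps all keys.
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys, for illustration only.
keys = ["cats/1.jpg", "cats/2.jpg", "dogs/1.jpg", "readme.txt"]

all_objects = select_keys(keys, "")     # empty prefix: the whole bucket
cats_only = select_keys(keys, "cats/")  # non-empty prefix: a subset
```

With a non-empty prefix such as `cats/`, only the objects under that key path would be used as the source.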
Next, run the refresh_retriever function.
Finally, run the retrieve_via_s3 function with the required parameters to retrieve the top K most relevant (most similar) AI data items. Be aware that the object type is currently limited to image and text files.
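Conceptually, a top K retrieval ranks the stored items by how similar their embeddings are to the embedding of the query and returns the K best matches. The following Python sketch shows that idea under stated assumptions: it uses cosine similarity and tiny hand-written vectors, and the `cosine_similarity` and `top_k` helpers are hypothetical. The similarity metric and implementation that retrieve_via_s3 actually uses are not shown on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, items, k):
    """Return the k (key, score) pairs most similar to the query, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings; real encoder output has far more dimensions.
embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, embeddings, k=2)
for key, score in results:
    print(key, round(score, 3))
```

Here the two images whose vectors point in nearly the same direction as the query rank ahead of the third, which is the behavior the "most similar" wording describes.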