BYOB (Secure storage connector)
Bring your own bucket (BYOB) allows you to store W&B artifacts and other related sensitive data in your own cloud or on-prem infrastructure. For Dedicated Cloud or SaaS Cloud, data that you store in your bucket is not copied to the W&B managed infrastructure.
- Communication between W&B SDK / CLI / UI and your buckets occurs using pre-signed URLs.
- W&B uses a garbage collection process to delete W&B Artifacts. For more information, see Deleting Artifacts.
Configuration options
You can configure a storage bucket at one of two scopes: the Instance level or the Team level.
- Instance level: Any user that has relevant permissions within your organization can access files stored in your instance level storage bucket.
- Team level: Members of a W&B Team can access files stored in the bucket configured at the Team level. Team level storage buckets allow greater data access control and data isolation for teams with highly sensitive data or strict compliance requirements.
You can configure your bucket at both the instance level and separately for one or more teams within your organization.
For example, suppose you have a team called Kappa in your organization. Your organization (and Team Kappa) use the Instance level storage bucket by default. Next, you create a team called Omega. When you create Team Omega, you configure a Team level storage bucket for that team. Files generated by Team Omega are not accessible by Team Kappa. However, files created by Team Kappa are accessible by Team Omega. If you want to isolate data for Team Kappa, you must configure a Team level storage bucket for them as well.
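The Kappa and Omega example can be sketched as a small access model. This is only an illustration of the rules described above, not W&B code: the bucket names and the functions bucket_for and can_read are hypothetical, and the model ignores per-user permissions within the organization.

```python
from typing import Dict

# Hypothetical name for the Instance level bucket shared by the organization.
INSTANCE_BUCKET = "instance-level-bucket"


def bucket_for(team: str, team_buckets: Dict[str, str]) -> str:
    """Return the bucket a team's files land in.

    A team uses its own Team level bucket if one is configured,
    otherwise it falls back to the Instance level bucket.
    """
    return team_buckets.get(team, INSTANCE_BUCKET)


def can_read(reader_team: str, owner_team: str, team_buckets: Dict[str, str]) -> bool:
    """Simplified rule: files in a Team level bucket are visible only to that
    team, while files in the Instance level bucket are visible org-wide."""
    if owner_team in team_buckets:
        return reader_team == owner_team
    return True


# Only Team Omega has a Team level bucket; Team Kappa uses the default.
team_buckets = {"Omega": "omega-team-bucket"}

print(can_read("Kappa", "Omega", team_buckets))  # False: Omega's files are isolated
print(can_read("Omega", "Kappa", team_buckets))  # True: Kappa's files are in the shared bucket
```

Adding an entry for Kappa to team_buckets in this model corresponds to configuring a Team level storage bucket for Team Kappa, which isolates its files as well.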
A Team level storage bucket provides the same benefits for Self-Managed instances, especially when different business units and departments share an instance to use the infrastructure and administrative resources efficiently. This also applies to firms that have separate project teams managing AI workflows for separate customer engagements.
Availability matrix
The following table shows the availability of BYOB across different W&B Server deployment types. An X means the feature is available on the specific deployment type.
W&B Server deployment type | Instance level | Team level | Additional information |
---|---|---|---|
Dedicated Cloud | X | X | Both instance and team level BYOB are available for Amazon Web Services, Google Cloud Platform and Microsoft Azure. For team-level BYOB, you can connect to a cloud-native storage bucket in the same or another cloud, or to S3-compatible secure storage like MinIO hosted in your cloud or on-prem infrastructure. |
SaaS Cloud | Not Applicable | X | The team level BYOB is available only for Amazon Web Services and Google Cloud Platform. W&B fully manages the default and only storage bucket for Microsoft Azure. |
Self-managed | X | X | Instance level BYOB is the default since the instance is fully managed by you. If your self-managed instance is in the cloud, you can connect to a cloud-native storage bucket in the same or another cloud for team-level BYOB. You can also use S3-compatible secure storage like MinIO for either instance-level or team-level BYOB. |
Cross-cloud or S3-compatible storage for team-level BYOB
You can connect to a cloud-native storage bucket in another cloud or to an S3-compatible storage bucket like MinIO for team-level BYOB in your Dedicated Cloud or Self-Managed instance.
To enable the use of cross-cloud or S3-compatible storage, specify the storage bucket, including the relevant access key, in one of the following formats using the GORILLA_SUPPORTED_FILE_STORES environment variable for your W&B instance.
For S3-compatible storage, specify the path using the following format:
s3://<accessKey>:<secretAccessKey>@<url_endpoint>/<bucketName>?region=<region>
The region parameter is mandatory, except when your W&B instance is in AWS and the AWS_REGION configured on the W&B instance nodes matches the region configured for the S3-compatible storage.
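As a minimal sketch, the S3-compatible path and the rule for the region parameter could be assembled as follows. The function names, the example endpoint, and the cloud label "aws" are assumptions made for illustration. The source format does not say whether the keys must be URL-encoded here, so the sketch passes them through unchanged.

```python
from typing import Optional


def region_param_required(
    instance_cloud: str,
    instance_aws_region: Optional[str],
    storage_region: str,
) -> bool:
    """The region parameter may be omitted only when the W&B instance runs in
    AWS and its AWS_REGION matches the region of the S3-compatible storage."""
    return not (instance_cloud == "aws" and instance_aws_region == storage_region)


def s3_compatible_store(
    access_key: str,
    secret_access_key: str,
    url_endpoint: str,
    bucket_name: str,
    region: Optional[str] = None,
) -> str:
    """Build s3://<accessKey>:<secretAccessKey>@<url_endpoint>/<bucketName>?region=<region>.

    Keys are inserted as given (assumption: no encoding is applied here).
    """
    store = f"s3://{access_key}:{secret_access_key}@{url_endpoint}/{bucket_name}"
    if region:
        store += f"?region={region}"
    return store


# Placeholder credentials and a hypothetical MinIO endpoint.
print(
    s3_compatible_store(
        "<accessKey>", "<secretAccessKey>", "minio.example.com:9000",
        "wandb-artifacts", region="us-east-1",
    )
)
```

The resulting string is the value you would supply through GORILLA_SUPPORTED_FILE_STORES.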
For cross-cloud storage, specify the path in a format specific to the locations of your W&B instance and storage bucket:
From W&B instance in GCP or Azure to a bucket in AWS:
s3://<accessKey>:<secretAccessKey>@<s3_regional_url_endpoint>/<bucketName>
From W&B instance in GCP or AWS to a bucket in Azure:
az://:<urlEncodedAccessKey>@<storageAccountName>/<containerName>
From W&B instance in AWS or Azure to a bucket in GCP:
gs://<serviceAccountEmail>:<urlEncodedPrivateKey>@<bucketName>
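The Azure and GCP formats call for URL-encoded keys. A minimal sketch of building those two paths follows, assuming that "URL-encoded" means standard percent-encoding of every reserved character. The function names are hypothetical and the key values are made-up samples.

```python
from urllib.parse import quote


def azure_store(access_key: str, storage_account_name: str, container_name: str) -> str:
    """Build az://:<urlEncodedAccessKey>@<storageAccountName>/<containerName>.

    safe="" makes quote() also encode "/", which is common in base64 keys.
    """
    return f"az://:{quote(access_key, safe='')}@{storage_account_name}/{container_name}"


def gcs_store(service_account_email: str, private_key: str, bucket_name: str) -> str:
    """Build gs://<serviceAccountEmail>:<urlEncodedPrivateKey>@<bucketName>.

    Only the private key is encoded, as in the format above.
    """
    return f"gs://{service_account_email}:{quote(private_key, safe='')}@{bucket_name}"


# Sample base64-style key containing characters that need encoding.
print(azure_store("ab+/c==", "mystorageaccount", "wandb"))
# az://:ab%2B%2Fc%3D%3D@mystorageaccount/wandb
```

Encoding matters because characters such as /, +, = and newlines in a key would otherwise be misread as part of the path structure.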
Connectivity to S3-compatible storage for team-level BYOB is not available in SaaS Cloud. Also, connectivity to an AWS bucket for team-level BYOB is considered cross-cloud in SaaS Cloud, as that instance is hosted in GCP. That cross-cloud connectivity doesn't use the access key and environment variable based mechanism as outlined above for Dedicated Cloud and Self-Managed instances.
Reach out to W&B Support at support@wandb.com for more information.
Cloud storage in same cloud as W&B platform
Based on your use case, configure a storage bucket at the team or instance level. How a storage bucket is provisioned or configured is the same irrespective of the level it's configured at, except for the access mechanism in Azure.
- AWS
- GCP
- Azure
Provision the KMS Key
W&B requires you to provision a KMS Key, which is needed to encrypt and decrypt the data on the S3 bucket. The key usage type must be ENCRYPT_DECRYPT. Assign the following policy to the key:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid" : "Internal",
"Effect" : "Allow",
"Principal" : { "AWS" : "<Your_Account_Id>" },
"Action" : "kms:*",
"Resource" : "<aws_kms_key.key.arn>"
},
{
"Sid" : "External",
"Effect" : "Allow",
"Principal" : { "AWS" : "arn:aws:iam::<W&B_Platform_Account_Id>:root" },
"Action" : [
"kms:Decrypt",
"kms:Describe*",
"kms:Encrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*"
],
"Resource" : "<aws_kms_key.key.arn>"
}
]
}
Replace <Your_Account_Id>, <W&B_Platform_Account_Id> and <aws_kms_key.key.arn> accordingly.
This policy grants your AWS account full access to the key and also assigns the required permissions to the AWS account hosting the W&B Platform. Keep a record of the KMS Key ARN.
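If you prefer to generate the key policy rather than edit the JSON by hand, a minimal sketch could look like the following. The function name and the sample account ID and key ARN are placeholders for illustration. The statements mirror the policy above.

```python
import json
from typing import Any, Dict


def build_kms_key_policy(
    your_account_id: str,
    wandb_platform_account_id: str,
    kms_key_arn: str,
) -> Dict[str, Any]:
    """Fill the three placeholders of the KMS key policy shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Full access for your own AWS account.
                "Sid": "Internal",
                "Effect": "Allow",
                "Principal": {"AWS": your_account_id},
                "Action": "kms:*",
                "Resource": kms_key_arn,
            },
            {
                # Encrypt/decrypt access for the account hosting the W&B Platform.
                "Sid": "External",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{wandb_platform_account_id}:root"},
                "Action": [
                    "kms:Decrypt",
                    "kms:Describe*",
                    "kms:Encrypt",
                    "kms:ReEncrypt*",
                    "kms:GenerateDataKey*",
                ],
                "Resource": kms_key_arn,
            },
        ],
    }


# Placeholder values; substitute your own account ID, the W&B account ID, and key ARN.
policy = build_kms_key_policy(
    "111122223333",
    "<W&B_Platform_Account_Id>",
    "arn:aws:kms:us-east-1:111122223333:key/<key-id>",
)
print(json.dumps(policy, indent=2))
```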
Provision the S3 Bucket
Follow these steps to provision the S3 bucket in your AWS account:
- Create the S3 bucket with a name of your choice.
- Enable bucket versioning.
- Enable server side encryption, using the KMS key from the previous step.
- Configure CORS with the following policy:
[
{
"AllowedHeaders": [
"*"
],
"AllowedMethods": [
"GET",
"HEAD",
"PUT"
],
"AllowedOrigins": [
"*"
],
"ExposeHeaders": [
"ETag"
],
"MaxAgeSeconds": 3600
}
]
- Grant the required S3 permissions to the AWS account hosting the W&B Platform. These permissions are used to generate pre-signed URLs that AI workloads in your cloud infrastructure or user browsers utilize to access the bucket.
{
"Version": "2012-10-17",
"Id": "WandBAccess",
"Statement": [
{
"Sid": "WAndBAccountAccess",
"Effect": "Allow",
"Principal": { "AWS": "arn:aws:iam::830241207209:root" },
"Action" : [
"s3:GetObject*",
"s3:GetEncryptionConfiguration",
"s3:ListBucket",
"s3:ListBucketMultipartUploads",
"s3:ListBucketVersions",
"s3:AbortMultipartUpload",
"s3:DeleteObject",
"s3:PutObject",
"s3:GetBucketCORS",
"s3:GetBucketLocation",
"s3:GetBucketVersioning"
],
"Resource": [
"arn:aws:s3:::<wandb_bucket>",
"arn:aws:s3:::<wandb_bucket>/*"
]
}
]
}
Replace <wandb_bucket> accordingly. Keep a record of the bucket name.
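The bucket settings above can also be applied programmatically. The sketch below assumes an S3 client object that exposes the boto3 S3 operations put_bucket_versioning, put_bucket_encryption, put_bucket_cors and put_bucket_policy. The function name configure_bucket is hypothetical, and bucket creation itself is left out because it depends on your region.

```python
import json
from typing import Any, Dict

# CORS rules copied from the policy above.
CORS_RULES = [
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET", "HEAD", "PUT"],
        "AllowedOrigins": ["*"],
        "ExposeHeaders": ["ETag"],
        "MaxAgeSeconds": 3600,
    }
]


def configure_bucket(
    s3: Any,
    bucket: str,
    kms_key_arn: str,
    bucket_policy: Dict[str, Any],
) -> None:
    """Apply versioning, KMS encryption, CORS, and the bucket policy.

    `s3` is assumed to be a boto3 S3 client (or any object with the same methods).
    """
    # Enable bucket versioning.
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    # Enable server side encryption with the KMS key from the previous step.
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    }
                }
            ]
        },
    )
    # Configure CORS.
    s3.put_bucket_cors(Bucket=bucket, CORSConfiguration={"CORSRules": CORS_RULES})
    # Grant access to the AWS account hosting the W&B Platform.
    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(bucket_policy))
```

With boto3 installed, a call could look like configure_bucket(boto3.client("s3"), "<wandb_bucket>", "<aws_kms_key.key.arn>", policy), where policy is the bucket policy above loaded as a dictionary.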
Provision the GCS Bucket
Follow these steps to provision the GCS bucket in your GCP project:
- Create the GCS bucket with a name of your choice.
- Enable soft deletion.
- Enable object versioning.
- Set encryption type to Google-managed.
- Set the CORS policy with gsutil. This is not possible in the UI.
  - Create a file called cors-policy.json locally.
  - Copy the following CORS policy into the file and save it.
    [
    {
    "origin": ["*"],
    "responseHeader": ["Content-Type"],
    "exposeHeaders": ["ETag"],
    "method": ["GET", "HEAD", "PUT"],
    "maxAgeSeconds": 3600
    }
    ]
  - Replace <bucket_name> with the correct bucket name and run gsutil.
    gsutil cors set cors-policy.json gs://<bucket_name>
  - Verify the policy was attached to the bucket. Replace <bucket_name> with the correct bucket name.
    gsutil cors get gs://<bucket_name>
- Grant the Storage Admin role to the GCP service account linked to the W&B Platform. Reach out to your W&B team for the service account if your W&B Platform is on Dedicated Cloud.
Keep a record of the bucket name.
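The CORS steps for GCS can be scripted. The following is a minimal sketch that writes cors-policy.json and assembles the two gsutil commands from the steps above as argument lists. The function names are hypothetical, and the sketch only builds the commands without running them.

```python
import json
from pathlib import Path
from typing import List

# CORS policy copied from the steps above.
CORS_POLICY = [
    {
        "origin": ["*"],
        "responseHeader": ["Content-Type"],
        "exposeHeaders": ["ETag"],
        "method": ["GET", "HEAD", "PUT"],
        "maxAgeSeconds": 3600,
    }
]


def write_cors_policy(path: str = "cors-policy.json") -> Path:
    """Write the CORS policy to a local JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(CORS_POLICY, indent=2))
    return target


def gsutil_cors_commands(
    bucket_name: str,
    policy_path: str = "cors-policy.json",
) -> List[List[str]]:
    """Return the 'cors set' and 'cors get' commands for the given bucket."""
    return [
        ["gsutil", "cors", "set", policy_path, f"gs://{bucket_name}"],
        ["gsutil", "cors", "get", f"gs://{bucket_name}"],
    ]


for command in gsutil_cors_commands("<bucket_name>"):
    print(" ".join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) on a machine where gsutil is installed and authenticated.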
Provision the Azure Blob Storage
This section is only relevant for instance level BYOB. To configure team level BYOB for W&B platform in Azure, refer to this repository.
Follow these steps to provision the Azure Blob Storage in your Azure subscription:
- Create a bucket with a name of your choice.
- Enable blob and container soft deletion.
- Enable versioning.
- Configure the CORS policy on the bucket. To set the CORS policy through the UI, go to the blob storage, scroll down to Settings/Resource Sharing (CORS), and then set the following:

Parameter | Value |
---|---|
Allowed Origins | * |
Allowed Methods | GET, HEAD, PUT |
Allowed Headers | * |
Exposed Headers | * |
Max Age | 3600 |

- Generate a storage account access key, and keep a record of that along with the storage account name.
Configure BYOB in W&B
- Team level
- Instance level
If you're connecting to a cloud-native storage bucket in another cloud or to an S3-compatible storage bucket like MinIO for team-level BYOB in your Dedicated Cloud or Self-Managed instance, refer to Cross-cloud or S3-compatible storage for team-level BYOB. In such cases, you must specify the storage bucket using the GORILLA_SUPPORTED_FILE_STORES environment variable for your W&B instance before you configure it for a team using the instructions below.
Configure a storage bucket at the team level when you create a W&B Team:
- Provide a name for your team in the Team Name field.
- Choose the Company or Organization you want this team to belong to from the Company/Organization dropdown.
- Select External Storage for the Choose storage type option.
- Choose either New bucket from the dropdown or select an existing bucket.
Multiple W&B Teams can use the same cloud storage bucket. To enable this, select an existing cloud storage bucket from the dropdown.
- From the Cloud provider dropdown, select your cloud provider.
- Provide the name of your storage bucket for the Name field.
- (Optional if you use AWS) Provide the ARN of your encryption key for the KMS key ARN field.
- Press the Create Team button.
An error or warning appears at the bottom of the page if there are issues accessing the bucket or the bucket has invalid settings.
Reach out to W&B Support at support@wandb.com to configure instance level BYOB for your Dedicated Cloud or Self-managed instance.