Storage Locations in Synapse¶
Storage locations allow you to configure where files uploaded to Synapse are stored. By default, files are stored in Synapse's internal S3 storage, but you can configure projects or folders to use your own AWS S3 buckets, Google Cloud Storage buckets, or other external storage.
This tutorial demonstrates how to use the Python client to manage storage locations using the new object-oriented models.
Read more about Custom Storage Locations
Tutorial Purpose¶
In this tutorial you will:
- Create an external S3 storage location and assign it to a folder
- Create a Google Cloud Storage location and assign it to a folder
- Create an SFTP storage location and assign it to a folder
- Create an HTTPS storage location and assign it to a folder
- Create an External Object Store location and assign it to a folder
- Create a Proxy storage location, register a proxy file handle, and assign it to a folder
- Retrieve and inspect storage location settings
- Update a storage location (create a replacement and reassign)
- Index and migrate files to a new storage location
Prerequisites¶
- Make sure that you have completed the Installation and Authentication setup.
- You must have a Project created; replace the project name used in this tutorial with your own.
- An AWS S3 bucket properly configured for use with Synapse, including an owner.txt file. See Custom Storage Locations.
- (Optional) boto3 installed for STS credential examples.
- For SFTP: pysftp installed (pip install "synapseclient[pysftp]").
- For Object Store: AWS credentials configured in your environment.
- For Proxy: a running proxy server and its shared secret key.
Understanding Storage Location Types¶
Synapse supports several types of storage locations:
- SYNAPSE_S3: Synapse-managed S3 storage (default)
- EXTERNAL_S3: User-owned AWS S3 bucket, accessed by Synapse on your behalf. Synapse transfers the data for uploads and downloads. Requires an owner.txt file in the bucket to verify ownership.
- EXTERNAL_GOOGLE_CLOUD: User-owned Google Cloud Storage bucket
- EXTERNAL_SFTP: External SFTP server
- EXTERNAL_HTTPS: External HTTPS server (uploading via the client is not currently supported)
- EXTERNAL_OBJECT_STORE: An S3-compatible store (e.g., MinIO, OpenStack Swift) that Synapse does not access. The client transfers data directly to the object store using credentials configured in your environment; Synapse only stores the file metadata.
- PROXY: A proxy server that controls access to the underlying storage
Storage Location Settings¶
Each storage type exposes a different set of configuration fields on
StorageLocation. When you retrieve a stored location, only the fields
relevant to its type are populated:
| Type | Key fields |
|---|---|
| SYNAPSE_S3 | base_key, sts_enabled |
| EXTERNAL_S3 | bucket, base_key, sts_enabled, endpoint_url |
| EXTERNAL_GOOGLE_CLOUD | bucket, base_key |
| EXTERNAL_SFTP / EXTERNAL_HTTPS | url, supports_subfolders |
| EXTERNAL_OBJECT_STORE | bucket, endpoint_url |
| PROXY | proxy_url, secret_key, benefactor_id |
Common attributes are: concrete_type, storage_location_id, storage_type, upload_type, banner, description, etag, created_on, created_by
Data Migration Between Storage Locations¶
Files in a project or folder can be migrated from one storage location to another using
index_files_for_migration followed by migrate_indexed_files. Migration is
currently supported only between S3 storage locations (both Synapse-managed
SYNAPSE_S3 and external EXTERNAL_S3) that reside in the same AWS
region.
Migration is a two-phase process:
- Index — scan the project/folder and record every file that needs to move into a local SQLite database.
- Migrate — read the index database and move each file to the destination storage location.
Separating the phases lets you review what will be migrated before committing to the move.
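Before running the migrate phase, the index database can be inspected with the standard library's sqlite3 module. The `migrations` table and column names below are assumptions for illustration only (check the actual schema of your own migration.db with `.schema`); the snippet fabricates a tiny database so it runs standalone.

```python
import os
import sqlite3
import tempfile

# Simulate a tiny index database so the inspection query below has data.
# NOTE: the "migrations" table name and its columns are assumptions for
# illustration; inspect your real migration.db with `.schema`.
db_path = os.path.join(tempfile.mkdtemp(), "migration.db")
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE migrations (id TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO migrations VALUES (?, ?)",
        [("syn1", "INDEXED"), ("syn2", "INDEXED"), ("syn3", "ERRORED")],
    )

# Review what would be migrated before committing to the move.
with sqlite3.connect(db_path) as conn:
    counts = dict(
        conn.execute("SELECT status, COUNT(*) FROM migrations GROUP BY status")
    )
print(counts)
```

Reviewing counts per status this way (or via the CSV export shown later) lets you catch indexing errors before any data moves.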
1. Set up and get project¶
import asyncio
import hashlib
import json
import os
import synapseclient
from synapseclient.models import (
File,
Folder,
Project,
StorageLocation,
StorageLocationType,
)
syn = synapseclient.login()
2. Create an external S3 storage location¶
Create a storage location backed by your own S3 bucket. The bucket must be
properly configured with an owner.txt file. Synapse will transfer data
directly to and from this bucket on the user's behalf.
# Step 1: Retrieve the project
my_project = Project(name="My uniquely named project about Alzheimer's Disease").get()
# Step 2: Create an External S3 Storage Location that is in the same region as the current storage location
# Replace with your S3 bucket name (must have owner.txt configured)
MY_BUCKET_NAME = "my-synapse-bucket"
MY_BASE_KEY = "synapse-data"
external_s3_storage_location = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_S3,
bucket=MY_BUCKET_NAME,
base_key=MY_BASE_KEY,
description="External S3 storage location",
).store()
You'll notice the output looks like:
Created storage location: 12345
storage location type: StorageLocationType.EXTERNAL_S3
3. Set up a folder with external S3 storage¶
Create a folder and assign it the S3 storage location. All files uploaded into this folder will be stored in your S3 bucket.
external_s3_folder = Folder(name="my-folder-for-external-s3", parent_id=my_project.id)
external_s3_folder = external_s3_folder.store()
# Set the storage location for the folder
external_s3_folder.set_storage_location(
storage_location_id=external_s3_storage_location.storage_location_id
)
external_s3_folder_storage_location = external_s3_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
external_s3_folder_storage_location["locations"][0]
== external_s3_storage_location.storage_location_id
), "Folder storage location does not match the storage location"
4. Create a Google Cloud Storage location¶
Create a storage location backed by a Google Cloud Storage bucket and assign it to a folder.
MY_GCS_BUCKET = "my-gcs-bucket"
MY_GCS_BASE_KEY = "synapse-data"
gcs_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_GOOGLE_CLOUD,
bucket=MY_GCS_BUCKET,
base_key=MY_GCS_BASE_KEY,
description="External Google Cloud Storage location",
).store()
print(f"Created GCS storage location: {gcs_storage.storage_location_id}")
print(f"storage location type: {gcs_storage.storage_type}")
gcs_folder = Folder(name="my-folder-for-gcs", parent_id=my_project.id)
gcs_folder = gcs_folder.store()
# Set the storage location for the folder
gcs_folder.set_storage_location(storage_location_id=gcs_storage.storage_location_id)
gcs_folder_storage_location = gcs_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
gcs_folder_storage_location["locations"][0] == gcs_storage.storage_location_id
), "Folder storage location does not match the storage location"
5. Create an SFTP storage location¶
SFTP storage locations point to an external SFTP server where files are stored outside of Synapse. Synapse manages only the metadata; the Python client transfers the files to the SFTP server itself, which requires the pysftp package.
MY_SFTP_URL = "sftp://your-sftp-server.example.com/upload"
sftp_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_SFTP,
url=MY_SFTP_URL,
supports_subfolders=True,
description="External SFTP server",
).store()
print(f"Created SFTP storage location: {sftp_storage.storage_location_id}")
print(f"storage location type: {sftp_storage.storage_type}")
sftp_folder = Folder(name="my-folder-for-sftp", parent_id=my_project.id)
sftp_folder = sftp_folder.store()
# Set the storage location for the folder
sftp_folder.set_storage_location(storage_location_id=sftp_storage.storage_location_id)
sftp_folder_storage_location = sftp_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
sftp_folder_storage_location["locations"][0] == sftp_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the SFTP folder; requires the 'pysftp' package.
file = File(path="/path/to/your/file.csv", parent_id=sftp_folder.id)
file = file.store()
6. Create an HTTPS storage location¶
EXTERNAL_HTTPS uses the same underlying API type as EXTERNAL_SFTP but is used when the external server is accessed over HTTPS. Note that the Python client does not yet support uploading files directly to HTTPS storage locations. To add files, use the Synapse web UI or the REST API directly.
my_https_folder = Folder(name="my-folder-for-https", parent_id=my_project.id)
my_https_folder = my_https_folder.store()
my_https_url = "https://my-https-server.example.com"
https_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_HTTPS,
url=my_https_url,
description="External HTTPS server",
).store()
print(f"Created HTTPS storage location: {https_storage.storage_location_id}")
print(f"storage location type: {https_storage.storage_type}")
my_https_folder.set_storage_location(
storage_location_id=https_storage.storage_location_id
)
my_https_folder_storage_location = my_https_folder.get_project_setting()
assert (
my_https_folder_storage_location["locations"][0]
== https_storage.storage_location_id
), "Folder storage location does not match the storage location"
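Since the client cannot upload to HTTPS locations, files are registered by creating an ExternalFileHandle through the REST API (POST to /externalFileHandle on the file services endpoint) and then attaching the returned file handle id to a File entity, similar to the proxy example later in this tutorial. The sketch below builds the request body only; the storageLocationId value is a placeholder for your HTTPS location's id.

```python
import json

# Hedged sketch: build the request body for an ExternalFileHandle pointing at
# a file already hosted on the HTTPS server. This is not sent anywhere here.
body = json.dumps(
    {
        "concreteType": "org.sagebionetworks.repo.model.file.ExternalFileHandle",
        "externalURL": "https://my-https-server.example.com/data/test.csv",
        "fileName": "test.csv",
        "contentType": "text/csv",
        "storageLocationId": 12345,  # placeholder: your HTTPS location id
    }
)
print(body)
```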
7. Create an External Object Store storage location¶
Use EXTERNAL_OBJECT_STORE for S3-compatible stores that are not directly
accessed by Synapse. Unlike EXTERNAL_S3, the Python client transfers data
directly to the object store using locally configured AWS credentials —
Synapse is never involved in the data transfer, only in storing the metadata.
You can configure credentials for the object store in your ~/.synapseConfig. Add a section matching your endpoint + bucket URL:
[https://s3.us-east-1.amazonaws.com/test-external-object-store]
profile_name = my-s3-profile
Then define the referenced profile in ~/.aws/config with valid keys:
[my-s3-profile]
aws_access_key_id = ...
aws_secret_access_key = ...
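The two config entries above can be sketched with the standard library's configparser, which is handy for verifying the section names parse as expected. The section name in ~/.synapseConfig is the endpoint URL joined with the bucket name; all values here are placeholders.

```python
import configparser
import io

# Placeholder values matching the config snippets above.
endpoint_url = "https://s3.us-east-1.amazonaws.com"
bucket = "test-external-object-store"

# ~/.synapseConfig section: endpoint URL + bucket, pointing at an AWS profile.
synapse_cfg = configparser.ConfigParser()
synapse_cfg[f"{endpoint_url}/{bucket}"] = {"profile_name": "my-s3-profile"}

# ~/.aws/config profile holding the actual keys (placeholders here).
aws_cfg = configparser.ConfigParser()
aws_cfg["my-s3-profile"] = {
    "aws_access_key_id": "YOUR_ACCESS_KEY",
    "aws_secret_access_key": "YOUR_SECRET_KEY",
}

out = io.StringIO()
synapse_cfg.write(out)
print(out.getvalue())
```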
MY_OBJECT_STORE_BUCKET = "test-external-object-store"
MY_OBJECT_STORE_ENDPOINT_URL = "https://s3.us-east-1.amazonaws.com"
object_store_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_OBJECT_STORE,
bucket=MY_OBJECT_STORE_BUCKET,
endpoint_url=MY_OBJECT_STORE_ENDPOINT_URL,
description="External S3-compatible object store",
).store()
print(f"Created object store location: {object_store_storage.storage_location_id}")
print(f"storage location type: {object_store_storage.storage_type}")
# create a folder with the object store storage location
object_store_folder = Folder(name="my-folder-for-object-store", parent_id=my_project.id)
object_store_folder = object_store_folder.store()
object_store_folder.set_storage_location(
storage_location_id=object_store_storage.storage_location_id
)
object_store_folder_storage_location = object_store_folder.get_project_setting()
assert (
object_store_folder_storage_location["locations"][0]
== object_store_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the object store folder.
# Requires AWS credentials (access key and secret key) configured in your environment.
file = File(path="/path/to/your/file.csv", parent_id=object_store_folder.id)
file = file.store()
8. Create a Proxy storage location¶
Proxy storage locations delegate file access to a proxy server that controls authentication and access to the underlying storage. Files are registered by creating a ProxyFileHandle via the REST API; the returned file handle id is then attached to a File entity by calling store() with data_file_handle_id.
my_proxy_folder = Folder(name="my-folder-for-proxy", parent_id=my_project.id)
my_proxy_folder = my_proxy_folder.store()
MY_PROXY_URL = "https://my-proxy-server.example.com"
MY_PROXY_SECRET_KEY = "your-shared-secret-key"  # shared secret configured on your proxy server
proxy_storage = StorageLocation(
    storage_type=StorageLocationType.PROXY,
    proxy_url=MY_PROXY_URL,
    secret_key=MY_PROXY_SECRET_KEY,
benefactor_id=my_project.id,
description="Proxy-controlled storage",
).store()
print(f"Created proxy storage location: {proxy_storage.storage_location_id}")
print(f" Proxy URL: {proxy_storage.proxy_url}")
print(f" Benefactor ID: {proxy_storage.benefactor_id}")
my_proxy_folder.set_storage_location(
storage_location_id=proxy_storage.storage_location_id
)
my_proxy_folder_storage_location = my_proxy_folder.get_project_setting()
assert (
my_proxy_folder_storage_location["locations"][0]
== proxy_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the proxy folder; requires a proxy file handle id
# Create ProxyFileHandle via REST API
file_path = "/path/to/your/file.csv"
with open(file_path, "rb") as f:
content_md5 = hashlib.md5(f.read(), usedforsecurity=False).hexdigest()
file_size = os.path.getsize(file_path)
async def create_proxy_file_handle():
proxy_file_handle = await syn.rest_post_async(
"/externalFileHandle/proxy",
body=json.dumps(
{
"concreteType": "org.sagebionetworks.repo.model.file.ProxyFileHandle",
"storageLocationId": proxy_storage.storage_location_id,
"filePath": "test.csv", # relative path served by your proxy
"fileName": "test.csv",
"contentType": "text/csv",
"contentMd5": content_md5,
"contentSize": file_size,
}
),
endpoint=syn.fileHandleEndpoint,
)
print(f"File handle ID: {proxy_file_handle['id']}")
return proxy_file_handle["id"]
proxy_file_handle_id = asyncio.run(create_proxy_file_handle())
# Associate the ProxyFileHandle with a Synapse File entity
proxy_file = File(
parent_id=my_proxy_folder.id,
name="test.csv",
data_file_handle_id=proxy_file_handle_id,
).store()
9. Retrieve and inspect storage location settings¶
You can retrieve a storage location by ID. Only fields relevant to the storage type are populated.
# Only fields that belong to the storage type are populated after retrieval.
retrieved_storage = StorageLocation(
storage_location_id=external_s3_storage_location.storage_location_id
).get()
print(f"Retrieved storage location ID: {retrieved_storage.storage_location_id}")
print(f"Storage type: {retrieved_storage.storage_type}")
print(f"Bucket: {retrieved_storage.bucket}")
print(f"Base key: {retrieved_storage.base_key}")
You'll notice the output looks like:
Retrieved storage location ID: 12345
Storage type: StorageLocationType.EXTERNAL_S3
Bucket: my-synapse-bucket
Base key: synapse-data
10. Update a storage location¶
Storage locations are immutable — individual fields cannot be edited after creation. To "update" a storage location, create a new one with the desired settings and reassign it to the folder or project.
# Example: change the base key of the External S3 storage location used by
# external_s3_folder from MY_BASE_KEY to "synapse-data-v2".
updated_s3_storage_location = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_S3,
bucket=MY_BUCKET_NAME,
base_key="synapse-data-v2",
description="External S3 storage location (updated base key)",
).store()
print(f"New storage location ID: {updated_s3_storage_location.storage_location_id}")
# Reassign the folder to point at the new storage location
external_s3_folder.set_storage_location(
storage_location_id=updated_s3_storage_location.storage_location_id
)
updated_folder_setting = external_s3_folder.get_project_setting()
assert (
updated_folder_setting["locations"][0]
== updated_s3_storage_location.storage_location_id
), "Folder storage location was not updated"
print("Folder now uses the updated storage location.")
# Step 10b: Partial update — add a storage location without removing existing ones
#
# `set_storage_location` is a destructive replacement. To append a new location
# while keeping the ones already configured, read the current ProjectSetting,
# append to its `locations` list, and call store() on the setting directly.
setting = external_s3_folder.get_project_setting()
if setting is not None:
setting.locations.append(gcs_storage.storage_location_id)
setting.store()
print(f"Updated locations after partial update: {setting.locations}")
11. Index and migrate files to a new storage location¶
Warning: This will migrate files associated with the folder. Run against a test project first and review the index result before migrating production data.
Phase 1 indexes all files that need to move into a local SQLite database. index_files_for_migration returns a MigrationResult object; use as_csv to review the indexing status.
# Phase 1: Index files for migration
my_migration_folder = Folder(
name="my-data-migration-folder", parent_id=my_project.id
).get()
index_result = my_migration_folder.index_files_for_migration(
dest_storage_location_id=external_s3_storage_location.storage_location_id,
db_path="/path/to/your/migration.db",
include_table_files=False, # Set True if you also want table-attached files
)
index_result.as_csv("/path/to/your/index_results.csv")
print(f"Migration index database: {index_result.db_path}")
The indexing results can be reviewed in the generated CSV before migrating.

Phase 2 reads that database and performs the actual migration. migrate_indexed_files also returns a MigrationResult object; use as_csv to review the migration status and any errors.
# Phase 2: Migrate indexed files
migrate_result = my_migration_folder.migrate_indexed_files(
db_path="/path/to/your/migration.db",
continue_on_error=True,
force=True, # Skip interactive confirmation for tutorial purposes
)
migrate_result.as_csv("/path/to/your/migrate_results.csv")
if migrate_result is not None:
    print(f"Migrated counts by status: {migrate_result.counts_by_status}")
else:
    print("Migration was aborted (confirmation declined).")
Detailed tracebacks for any failed files are saved in the exception column of the CSV.
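The results CSV can be scanned for failed rows with the standard library's csv module. The "status" and "exception" column names below are assumptions for illustration (check the header row of your own migrate_results.csv); a small inline sample stands in for the real file so the snippet runs standalone.

```python
import csv
import io

# Inline stand-in for migrate_results.csv; replace with open("migrate_results.csv").
sample = (
    "id,status,exception\n"
    "syn1,MIGRATED,\n"
    "syn2,ERRORED,Traceback (most recent call last) ...\n"
)

# Collect rows whose status indicates a failed migration.
failed = [
    row for row in csv.DictReader(io.StringIO(sample)) if row["status"] == "ERRORED"
]
for row in failed:
    print(row["id"], row["exception"])
```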

Source code for this tutorial¶
Click to show me
"""
Tutorial code for the Storage Location and project settings.
"""
import asyncio
import hashlib
import json
import os
import synapseclient
from synapseclient.models import (
File,
Folder,
Project,
StorageLocation,
StorageLocationType,
)
syn = synapseclient.login()
# Step 1: Retrieve the project
my_project = Project(name="My uniquely named project about Alzheimer's Disease").get()
# Step 2: Create an External S3 Storage Location that is in the same region as the current storage location
# Replace with your S3 bucket name (must have owner.txt configured)
MY_BUCKET_NAME = "my-synapse-bucket"
MY_BASE_KEY = "synapse-data"
external_s3_storage_location = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_S3,
bucket=MY_BUCKET_NAME,
base_key=MY_BASE_KEY,
description="External S3 storage location",
).store()
print(f"Created storage location: {external_s3_storage_location.storage_location_id}")
print(f"storage location type: {external_s3_storage_location.storage_type}")
# Step 3. Create a Folder with the new storage location
external_s3_folder = Folder(name="my-folder-for-external-s3", parent_id=my_project.id)
external_s3_folder = external_s3_folder.store()
# Set the storage location for the folder
external_s3_folder.set_storage_location(
storage_location_id=external_s3_storage_location.storage_location_id
)
external_s3_folder_storage_location = external_s3_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
external_s3_folder_storage_location["locations"][0]
== external_s3_storage_location.storage_location_id
), "Folder storage location does not match the storage location"
# Step 4: Create a Google Cloud Storage location
MY_GCS_BUCKET = "my-gcs-bucket"
MY_GCS_BASE_KEY = "synapse-data"
gcs_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_GOOGLE_CLOUD,
bucket=MY_GCS_BUCKET,
base_key=MY_GCS_BASE_KEY,
description="External Google Cloud Storage location",
).store()
print(f"Created GCS storage location: {gcs_storage.storage_location_id}")
print(f"storage location type: {gcs_storage.storage_type}")
gcs_folder = Folder(name="my-folder-for-gcs", parent_id=my_project.id)
gcs_folder = gcs_folder.store()
# Set the storage location for the folder
gcs_folder.set_storage_location(storage_location_id=gcs_storage.storage_location_id)
gcs_folder_storage_location = gcs_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
gcs_folder_storage_location["locations"][0] == gcs_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Step 5: Create an SFTP storage location
MY_SFTP_URL = "sftp://your-sftp-server.example.com/upload"
sftp_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_SFTP,
url=MY_SFTP_URL,
supports_subfolders=True,
description="External SFTP server",
).store()
print(f"Created SFTP storage location: {sftp_storage.storage_location_id}")
print(f"storage location type: {sftp_storage.storage_type}")
sftp_folder = Folder(name="my-folder-for-sftp", parent_id=my_project.id)
sftp_folder = sftp_folder.store()
# Set the storage location for the folder
sftp_folder.set_storage_location(storage_location_id=sftp_storage.storage_location_id)
sftp_folder_storage_location = sftp_folder.get_project_setting()
# Verify the storage location is set correctly
assert (
sftp_folder_storage_location["locations"][0] == sftp_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the SFTP folder; requires the 'pysftp' package.
file = File(path="/path/to/your/file.csv", parent_id=sftp_folder.id)
file = file.store()
# Step 6: Create an HTTPS storage location
# EXTERNAL_HTTPS shares the same underlying API type as EXTERNAL_SFTP but is used
# when the external server is accessed over HTTPS rather than SFTP.
my_https_folder = Folder(name="my-folder-for-https", parent_id=my_project.id)
my_https_folder = my_https_folder.store()
my_https_url = "https://my-https-server.example.com"
https_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_HTTPS,
url=my_https_url,
description="External HTTPS server",
).store()
print(f"Created HTTPS storage location: {https_storage.storage_location_id}")
print(f"storage location type: {https_storage.storage_type}")
my_https_folder.set_storage_location(
storage_location_id=https_storage.storage_location_id
)
my_https_folder_storage_location = my_https_folder.get_project_setting()
assert (
my_https_folder_storage_location["locations"][0]
== https_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Note: The Python client does not support uploading files directly to HTTPS
# storage locations. To add files, use the Synapse web UI or REST API directly.
# Step 7: Create an External Object Store storage location
# Use this for S3-compatible stores (e.g. OpenStack Swift) not accessed by Synapse.
MY_OBJECT_STORE_BUCKET = "test-external-object-store"
MY_OBJECT_STORE_ENDPOINT_URL = "https://s3.us-east-1.amazonaws.com"
object_store_storage = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_OBJECT_STORE,
bucket=MY_OBJECT_STORE_BUCKET,
endpoint_url=MY_OBJECT_STORE_ENDPOINT_URL,
description="External S3-compatible object store",
).store()
print(f"Created object store location: {object_store_storage.storage_location_id}")
print(f"storage location type: {object_store_storage.storage_type}")
# create a folder with the object store storage location
object_store_folder = Folder(name="my-folder-for-object-store", parent_id=my_project.id)
object_store_folder = object_store_folder.store()
object_store_folder.set_storage_location(
storage_location_id=object_store_storage.storage_location_id
)
object_store_folder_storage_location = object_store_folder.get_project_setting()
assert (
object_store_folder_storage_location["locations"][0]
== object_store_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the object store folder.
# Requires AWS credentials (access key and secret key) configured in your environment.
file = File(path="/path/to/your/file.csv", parent_id=object_store_folder.id)
file = file.store()
# Step 8: Create a Proxy storage location
# Use this when a proxy server controls access to the underlying storage.
my_proxy_folder = Folder(name="my-folder-for-proxy", parent_id=my_project.id)
my_proxy_folder = my_proxy_folder.store()
MY_PROXY_URL = "https://my-proxy-server.example.com"
MY_PROXY_SECRET_KEY = "your-shared-secret-key"  # shared secret configured on your proxy server
proxy_storage = StorageLocation(
    storage_type=StorageLocationType.PROXY,
    proxy_url=MY_PROXY_URL,
    secret_key=MY_PROXY_SECRET_KEY,
benefactor_id=my_project.id,
description="Proxy-controlled storage",
).store()
print(f"Created proxy storage location: {proxy_storage.storage_location_id}")
print(f" Proxy URL: {proxy_storage.proxy_url}")
print(f" Benefactor ID: {proxy_storage.benefactor_id}")
my_proxy_folder.set_storage_location(
storage_location_id=proxy_storage.storage_location_id
)
my_proxy_folder_storage_location = my_proxy_folder.get_project_setting()
assert (
my_proxy_folder_storage_location["locations"][0]
== proxy_storage.storage_location_id
), "Folder storage location does not match the storage location"
# Add a file to the proxy folder; requires a proxy file handle id
# Create ProxyFileHandle via REST API
file_path = "/path/to/your/file.csv"
with open(file_path, "rb") as f:
content_md5 = hashlib.md5(f.read(), usedforsecurity=False).hexdigest()
file_size = os.path.getsize(file_path)
async def create_proxy_file_handle():
proxy_file_handle = await syn.rest_post_async(
"/externalFileHandle/proxy",
body=json.dumps(
{
"concreteType": "org.sagebionetworks.repo.model.file.ProxyFileHandle",
"storageLocationId": proxy_storage.storage_location_id,
"filePath": "test.csv", # relative path served by your proxy
"fileName": "test.csv",
"contentType": "text/csv",
"contentMd5": content_md5,
"contentSize": file_size,
}
),
endpoint=syn.fileHandleEndpoint,
)
print(f"File handle ID: {proxy_file_handle['id']}")
return proxy_file_handle["id"]
proxy_file_handle_id = asyncio.run(create_proxy_file_handle())
# Associate the ProxyFileHandle with a Synapse File entity
proxy_file = File(
parent_id=my_proxy_folder.id,
name="test.csv",
data_file_handle_id=proxy_file_handle_id,
).store()
print(f"Synapse entity: {proxy_file.id}")
# Step 9: Retrieve and inspect storage location settings
# Only fields that belong to the storage type are populated after retrieval.
retrieved_storage = StorageLocation(
storage_location_id=external_s3_storage_location.storage_location_id
).get()
print(f"Retrieved storage location ID: {retrieved_storage.storage_location_id}")
print(f"Storage type: {retrieved_storage.storage_type}")
print(f"Bucket: {retrieved_storage.bucket}")
print(f"Base key: {retrieved_storage.base_key}")
# Step 10: Update a storage location
#
# Storage locations are immutable in Synapse — individual fields cannot be edited
# after creation. To "update" a storage location, create a new one with the desired
# settings and reassign it to the folder or project.
#
# Example: change the base key of the External S3 storage location used by
# external_s3_folder from MY_BASE_KEY to "synapse-data-v2".
updated_s3_storage_location = StorageLocation(
storage_type=StorageLocationType.EXTERNAL_S3,
bucket=MY_BUCKET_NAME,
base_key="synapse-data-v2",
description="External S3 storage location (updated base key)",
).store()
print(f"New storage location ID: {updated_s3_storage_location.storage_location_id}")
# Reassign the folder to point at the new storage location
external_s3_folder.set_storage_location(
storage_location_id=updated_s3_storage_location.storage_location_id
)
updated_folder_setting = external_s3_folder.get_project_setting()
assert (
updated_folder_setting["locations"][0]
== updated_s3_storage_location.storage_location_id
), "Folder storage location was not updated"
print("Folder now uses the updated storage location.")
# Step 10b: Partial update — add a storage location without removing existing ones
#
# `set_storage_location` is a destructive replacement. To append a new location
# while keeping the ones already configured, read the current ProjectSetting,
# append to its `locations` list, and call store() on the setting directly.
setting = external_s3_folder.get_project_setting()
if setting is not None:
setting.locations.append(gcs_storage.storage_location_id)
setting.store()
print(f"Updated locations after partial update: {setting.locations}")
# Step 11: Index and migrate files to the new storage location
#
# WARNING: This will actually migrate files associated with the project/folder.
# Run against a test project first and review the index (MigrationResult) before
# migrating production data.
# Phase 1: Index files for migration
my_migration_folder = Folder(
name="my-data-migration-folder", parent_id=my_project.id
).get()
index_result = my_migration_folder.index_files_for_migration(
dest_storage_location_id=external_s3_storage_location.storage_location_id,
db_path="/path/to/your/migration.db",
include_table_files=False, # Set True if you also want table-attached files
)
index_result.as_csv("/path/to/your/index_results.csv")
print(f"Migration index database: {index_result.db_path}")
print(f"Indexed counts by status: {index_result.counts_by_status}")
# Phase 2: Migrate indexed files
migrate_result = my_migration_folder.migrate_indexed_files(
db_path="/path/to/your/migration.db",
continue_on_error=True,
force=True, # Skip interactive confirmation for tutorial purposes
)
migrate_result.as_csv("/path/to/your/migrate_results.csv")
if migrate_result is not None:
print(f"Migrated counts by status: {migrate_result.counts_by_status}")
else:
print("Migration was aborted (confirmation declined).")
References used in this tutorial¶
- StorageLocation
- StorageLocationType
- Folder
- File
- Project
- syn.login
- Custom Storage Locations Documentation
See also¶
- Storage Location Architecture - In-depth architecture diagrams and design documentation