Does my data have to live in a Google bucket to be accessible to FireCloud?

IMPORTANT: This is the legacy GATK documentation. This information is only valid until Dec 31st 2019. For the latest documentation and forum, click here.

created by Tiffany_at_Broad

on 2017-12-14

Technically yes, although it does not need to live in the Google bucket owned by the workspace you are working in. Data copies (both explicit ones and those made through workspace cloning) are shallow: the references are copied, and no new instances of the data objects are created. As a result, a workspace can contain references to data in buckets external to that workspace.

Note, however, that the results of any analysis run within a workspace will be written to the bucket attached to the workspace.
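
To make the shallow-copy behavior concrete, here is a minimal Python sketch. It is only a toy model, not FireCloud code: the Workspace class, the clone function, and the bucket names are all hypothetical, and the real service may track references differently. It shows that a cloned workspace ends up holding references into the original bucket, while its own results would be written to the bucket attached to the clone.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Workspace:
    """Hypothetical stand-in for a workspace: one bucket plus a table of data references."""
    name: str
    bucket: str
    references: dict = field(default_factory=dict)

    def external_references(self):
        """Return the references that point outside this workspace's own bucket."""
        return {
            key: uri
            for key, uri in self.references.items()
            if urlparse(uri).netloc != self.bucket
        }

    def output_uri(self, filename):
        """Build the location of an analysis result in this workspace's bucket."""
        return f"gs://{self.bucket}/{filename}"


def clone(source, new_name, new_bucket):
    """Shallow clone (assumed model): copy the gs:// references, not the objects."""
    return Workspace(new_name, new_bucket, dict(source.references))


# Bucket names below are made up for illustration.
original = Workspace(
    name="original",
    bucket="bucket-original",
    references={"sample1_bam": "gs://bucket-original/sample1.bam"},
)
cloned = clone(original, "cloned", "bucket-cloned")

print(cloned.external_references())
print(cloned.output_uri("results/sample1.vcf"))
```

In this sketch the clone's only reference still points into bucket-original, so it is reported as external, whereas the output location is built from bucket-cloned.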

From ilya_at_vivid on 2019-04-12

I want to explore this topic a little bit more. I have my own bucket in a GCP organization.

https://cloud.google.com/resource-manager/docs/cloud-platform-resource-hierarchy#organizations

Is my understanding correct that there is no way to mount a bucket (or part of one) to a FireCloud/Terra workspace?

Since we want to keep our data inside the organization's bucket, is the recommended best practice to use gsutil to move the data from the workspace bucket (outside the organization) to our bucket inside the organization?

Are there any plans to enable the ability to mount any valid GCP bucket to a workspace?

Thanks,

Ilya