Created by Geraldine_VdAuwera on 2017-09-30
To run a workflow on a set of data entities, you just need to do two things: define your set, and use an expression in the Launch dialog to tell FireCloud how to handle the contents of the set. Note that the example below uses a sample set because that's a very common use case, but it can be trivially adapted to any other kind of set.
You define your sample set in the Data tab by importing what is essentially a table of samples and the sample set they belong to. For an example of what that looks like, see this public workspace's Data tab: https://portal.firecloud.org/#workspaces/help-firecloud/FireCloud101-Basics/data
If you click on "Download 'sample_set' metadata", you'll get a zip archive containing two files: `sample_set_entity.tsv` and `sample_set_membership.tsv`. Disregard the former; you'll see the latter describes the set of samples by listing, on each line, a sample set ID and a sample that belongs to it. It looks like this:

| membership:samplesetid | sample |
|:--------------------------------|:---------|
| CEUTriowgs20 | NA12878wgs20 |
| CEUTriowgs20 | NA12877wgs20 |
| CEUTriowgs20 | NA12882wgs20 |

All you need to do to define your sample set is modify this file (or generate one like it) with your sample set and sample IDs. The sample set ID can be any arbitrary name; the sample IDs must be IDs of samples you have already imported into the workspace. You can define multiple sample sets within the same file, and a sample can belong to multiple sample sets, so, for example, you can do this:

| membership:samplesetid | sample |
|:--------------------------------|:---------|
| CEUTriowgs20 | NA12878wgs20 |
| CEUTriowgs20 | NA12877wgs20 |
| CEUTriowgs20 | NA12882wgs20 |
| CEUTriotest | NA12877wgs20 |
| CEUTriotest | NA12882wgs20 |

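If you would rather generate the membership file than edit it by hand, something like the Python sketch below could do it. Treat it as an illustration only: the function name `membership_tsv` and its parameters are hypothetical, the column headers are copied exactly as they appear in the table above (check them against the file you downloaded before importing), and the `known_samples` check merely mimics the rule that samples must already exist in the workspace. It does not talk to FireCloud.

```python
import csv
import io


def membership_tsv(sample_sets, known_samples,
                   set_header="membership:samplesetid",
                   member_header="sample"):
    """Return membership TSV text for a mapping of set ID -> list of sample IDs.

    Hypothetical helper: header names are taken from the table shown in this
    article and should be verified against a downloaded membership file.
    """
    known = set(known_samples)
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow([set_header, member_header])
    for set_id, samples in sample_sets.items():
        for sample_id in samples:
            # Samples must already have been imported into the workspace.
            if sample_id not in known:
                raise ValueError(f"unknown sample {sample_id!r} in set {set_id!r}")
            # One line per (sample set, sample) pair.
            writer.writerow([set_id, sample_id])
    return buf.getvalue()


# The two sets from the example table; a sample may appear in more than one set.
known_samples = ["NA12878wgs20", "NA12877wgs20", "NA12882wgs20"]
sample_sets = {
    "CEUTriowgs20": ["NA12878wgs20", "NA12877wgs20", "NA12882wgs20"],
    "CEUTriotest": ["NA12877wgs20", "NA12882wgs20"],
}
tsv_text = membership_tsv(sample_sets, known_samples)
print(tsv_text, end="")
```

Writing `tsv_text` to a file (for example with `open("sample_set_membership.tsv", "w")`) would give you something to select in the import dialog described next.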
Once you've made your TSV file describing your sample set(s), you import it by clicking the "Import Metadata..." button (still in your workspace's Data tab). This opens a dialog; follow the instructions to select the TSV file you created or modified, and assuming you don't hit any errors, once you close the dialog you'll see there is now a "sample_set" tab next to "participant" and "sample". If you click on it you can verify that your sample set has been created correctly.
To run a method on your newly created sample set, you don't need to change your method configuration. After clicking "Launch Analysis...", the dialog opens on the `sample` list; just switch to the `sample_set` list that should now appear, and select your sample set.
At this point, the trick is that you can't just hit "Launch" right away; first you need to use an expression to tell FireCloud how to deal with the fact that, instead of the single sample it's expecting based on the method config, you're giving it a list of samples. In this case, the expression is `this.samples`. Then you can hit "Launch".
Note the plural in the expression; if you leave out the `s` it won't work. Yes, it's annoying, and no, it's not well documented yet... this is something we're working on improving.
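To get an intuition for why the plural matters, here is a toy model in Python of what an expression like `this.samples` does. This is not FireCloud's actual implementation; the `resolve` function and the dictionary layout are assumptions made purely for illustration. The idea is that `this` stands for the selected sample set, and `samples` names the attribute holding its member samples.

```python
def resolve(expression, entity):
    """Follow a dotted expression such as 'this.samples' from a selected entity.

    Toy model only: 'this' refers to the entity picked in the Launch dialog,
    and each following name is looked up as an attribute of the current value.
    """
    parts = expression.split(".")
    if parts[0] != "this":
        raise ValueError("expression must start with 'this'")
    value = entity
    for name in parts[1:]:
        if name not in value:
            raise KeyError(f"no attribute named {name!r}")
        value = value[name]
    return value


# Hypothetical in-memory picture of the sample set defined earlier.
sample_set = {
    "name": "CEUTriowgs20",
    "samples": ["NA12878wgs20", "NA12877wgs20", "NA12882wgs20"],
}

# The plural attribute exists and yields the list of member samples.
members = resolve("this.samples", sample_set)
print(members)

# The singular form names an attribute the set does not have.
try:
    resolve("this.sample", sample_set)
except KeyError as err:
    print("lookup failed:", err)
```

In this picture, each entry in `members` plays the role of the single sample the method configuration expects, which is why the set has to be unpacked with the expression before launching.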
Updated on 2017-09-30