Probe set mask or inclusion file

In the “Analysis/Open Group/Other information” dialog, we can specify an Affymetrix “probe set mask file” (a *.msk file, or a tab-delimited text file in which the first column of each line is the probe set name) to exclude some probe sets from the analysis. These probe sets are handled as if they do not exist on the chip, by marking their CELs as “QC” (quality control). Thus their CELs are not used for the CEL-level normalization, and they do not enter any downstream analysis. Make sure to check “Ignore existing DCP file” and “Ignore existing cdf.bin file” so that the CEL and CDF files are re-extracted whenever you specify a probe set mask file or change its content. If no TXT files exist, dChip will also re-compute the present/absent calls, excluding the masked probes.

When using a probe set mask file, be sure to extract the DCP files anew from the CEL and TXT files (by checking “Ignore existing DCP file” and unchecking “Data file type/DCP file”). This is because an existing DCP file has a format corresponding to the original CDF file, and thus cannot be used with the new CDF file in which some probe sets are masked. By the same token, do not combine in the same group DCP files made with the original CDF file and DCP files made with the masked CDF file.

[V2/23/05+] The probe set mask file can also list individual probes to mask out of the CDF file, for example probes found by other means to cross-hybridize. An example file is HG-U133A mask file.txt (edited in Excel but saved as a tab-delimited text file). In its 2nd column, “all”, an empty entry, or “1-x” (x is the maximal number of probe pairs in the probe set) masks out the whole probe set, and probe numbers delimited by “|” (such as “|10|”) mask out individual probes. The probe numbers start from 1 and follow the same order as in the CDF files. Make sure a line contains no spaces (e.g. use “1|2|5” instead of “1 | 2 | 5”). Alternatively, a custom CDF file may be used for similar purposes.
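
As a rough illustration of the 2nd-column rules above, the Python sketch below checks one line of a mask file before the file is given to dChip. It is not part of dChip: the function name parse_mask_line is hypothetical, the probe set names in the usage note are only illustrative, and the sketch assumes that any “1-x” entry covers the whole probe set, since x cannot be verified without the CDF file.

```python
import re


def parse_mask_line(line):
    """Parse one line of a tab-delimited probe set mask file.

    Returns (probe_set_name, probes). probes is None when the whole
    probe set is masked, otherwise a sorted list of 1-based probe numbers.
    This is a hypothetical helper, not dChip's own parser.
    """
    line = line.rstrip("\r\n")
    # The manual asks for no spaces anywhere in a line.
    if " " in line:
        raise ValueError("spaces are not allowed in a mask line: %r" % line)

    fields = line.split("\t")
    name = fields[0]
    if not name:
        raise ValueError("missing probe set name")
    spec = fields[1] if len(fields) > 1 else ""

    # "all", an empty entry, or "1-x" masks the whole probe set.
    # Assumption: x is taken on trust; it is not checked against the CDF.
    if spec == "" or spec == "all" or re.fullmatch(r"1-\d+", spec):
        return name, None

    # Otherwise expect probe numbers delimited by "|", e.g. "|10|" or "1|2|5".
    numbers = [int(token) for token in spec.split("|") if token]
    if not numbers or any(n < 1 for n in numbers):
        raise ValueError("probe numbers must start from 1: %r" % spec)
    return name, sorted(set(numbers))
```

For example, parse_mask_line("1053_at\t|10|") gives ("1053_at", [10]), while a line holding only a probe set name gives None for the probe list, meaning the whole probe set is masked.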

[V2/28/07+] If the probe set mask file has "Including" in the first line, the file is regarded as a probe set inclusion file. Only the probe sets in this file are included in the cdf.bin file and used in downstream analysis, including normalization. This is useful if only a subset of the probe sets have targets in the hybridization cocktail.
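
A minimal Python sketch of preparing such an inclusion file from a list of probe set names follows. The function names build_inclusion_text and is_inclusion_text are hypothetical, and the sketch assumes that a first line consisting of the single word "Including" is sufficient; the exact layout dChip expects on that line is not specified here.

```python
def build_inclusion_text(probe_set_names):
    """Return the text of a probe set inclusion file.

    Assumption: the first line holds only the word "Including", and each
    following line holds one probe set name in the first column.
    """
    lines = ["Including"]
    for name in probe_set_names:
        # Names go in a tab-delimited first column, so they must not
        # contain tabs or spaces themselves.
        if not name or " " in name or "\t" in name:
            raise ValueError("invalid probe set name: %r" % name)
        lines.append(name)
    return "\n".join(lines) + "\n"


def is_inclusion_text(text):
    """Return True if the first line of the file text has "Including"."""
    lines = text.splitlines()
    return bool(lines) and "Including" in lines[0]
```

The returned text can be saved with an ordinary file write, for example open("inclusion.txt", "w").write(build_inclusion_text(names)), where inclusion.txt is a placeholder file name.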