Before the section of homework:
Here you can find Kuei-Yang's and Ya-Chi's homepages via the links above, as well as a blog for discussing the learning of R, which describes in detail how we approach the projects/homeworks.
Section of homework begins here:
1. Explain in 1 or 2 sentences what the numbers (i.e., mRNA expression intensities) in the matrix are measuring at the molecular biology level. Include 'mRNA', 'gene', and 'cell' in the answer.
Each number is an intensity measured by the microarray for one gene; it reflects the level of mRNA transcribed from that gene in the cells of the sample.
2.1) Generate diagnostic postscript plots for a qualitative assessment of array quality
Because we could not resolve problems generating postscript files with the native maQualityPlots(,dev="postscript"), we output the files in JPEG format, as follows:
| diagPlot.6Hs.166.jpeg | diagPlot.6Hs.168.jpeg | diagPlot.6Hs.187.jpe |
| diagPlot.6Hs.194.jpeg | diagPlot.6Hs.195.jpeg | diagPlot.6Hs.243.jpeg |
2.2) Perform some normalization of the data
We used maNorm() to normalize the data; the MA plots after normalization are generated by maQualityPlots(), which outputs a whole set of diagnostic plots.
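As a rough illustration of what normalization does to the M values, the sketch below centers the log-ratios of each print-tip group on zero by subtracting the group median. This is a deliberately simplified stand-in written in plain Python, not the loess-based fit available in the marray package; the function name `median_center_by_group` and the toy data are hypothetical.

```python
from statistics import median

def median_center_by_group(m_values, groups):
    """Subtract each group's median M from the M values in that group."""
    by_group = {}
    for m, g in zip(m_values, groups):
        by_group.setdefault(g, []).append(m)
    medians = {g: median(vals) for g, vals in by_group.items()}
    return [m - medians[g] for m, g in zip(m_values, groups)]

# Toy data (hypothetical): two print-tip groups, both shifted below zero.
m_raw = [-0.5, -0.3, -0.4, -1.0, -0.8, -0.9]
tip_group = [1, 1, 1, 2, 2, 2]
m_norm = median_center_by_group(m_raw, tip_group)
```

After centering, the median M of every group is zero, which is the simplest version of "removing the shift" seen in a raw MA plot. A loess fit additionally lets the correction vary with the intensity A.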
2.3) Output your normalized gene expression to a text file
The write.marray() command then dumps the normalized data to a text file (check here).
2.4) Set-up a website and deposit all of your plots and text file
Here is the website you reached!
Interpretation & conclusion:
We use the data from one of the arrays to walk through the meaning of each chart. The function maQualityPlots() produces several plots for evaluating the quality of the hybridization/microarray. For a two-channel array, the first and most obvious way to examine the data is the MA plot, in which M (the log-ratio of the red and green intensities) is plotted against A (their average log-intensity). In an ideal situation without any bias, samples labeled with either the green or the red dye should give comparable levels on the array (1:1), which makes M close to zero at every value of A. However, things are rarely so perfect; the two dyes differ in their properties, so one channel usually shows a somewhat stronger intensity than the other. In the MA plot below, the down-shifted loess curves and spot distribution reflect the unequal binding/labeling of the two dyes. The shape of the loess curves is usually an indicator of the amount of normalization that needs to be performed.
| Fig. 1 MA-plot of raw array data |
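The two quantities in an MA plot can be computed per spot as sketched below, using the usual definitions M = log2(R/G) and A = (log2 R + log2 G)/2. The function name `ma_values` and the intensities are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from its red and green intensities."""
    m = math.log2(red / green)                      # log-ratio of the two channels
    a = 0.5 * (math.log2(red) + math.log2(green))   # average log-intensity
    return m, a

# Equal intensities in both channels: M is zero, A is the log-intensity.
m_equal, a_equal = ma_values(1024.0, 1024.0)

# Red at half the green intensity: M is negative (spot lies below M = 0).
m_green, a_green = ma_values(512.0, 1024.0)
```

A spot with equal red and green signal lands on the M = 0 line wherever it sits along A, while a spot whose red signal is weaker than its green signal falls below that line.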
Normalization corrects this dye bias. In the density plot of the normalized data (Fig. 2), light yellow marks regions with a high density of spots, while blue marks regions of lower density. The plot therefore shows where the bulk of the data lies along the intensity axis (low or high signal).
| Fig. 2 MA-plot of normalized data density |
The spatial plot of the ranks of raw M values, without background subtraction (Fig. 3), is a convenient way to visualize uneven hybridization and missing spots. Each spot is colored according to the rank of its M value. The program uses a blue-to-yellow color scale; blue represents the higher ranks (say, the 1st) and yellow the lower ones. Missing spots are shown as white squares. Likewise, the spatial plot of the ranks of normalized M values (Fig. 4) visualizes the effect of normalization and helps to evaluate whether normalization removed any spatial effects. By default, print-tip loess normalization is used in maQualityPlots(). The yellow/blue scale and the white squares have the same meaning as in Fig. 3. In addition, flagged spots are highlighted by a black square.
| Fig. 3 & 4 |
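The ranking behind such a spatial plot can be sketched as follows. The grid layout, the function name `rank_m_values`, and the convention that rank 1 goes to the largest M are assumptions of this example, not details taken from maQualityPlots().

```python
def rank_m_values(m_grid):
    """Rank the M values of a spot grid; None marks a missing spot.

    Rank 1 is given to the largest M (an assumption of this sketch).
    """
    present = sorted(
        (m for row in m_grid for m in row if m is not None), reverse=True
    )
    rank_of = {}
    for i, m in enumerate(present, start=1):
        rank_of.setdefault(m, i)  # tied values share the best rank
    return [[None if m is None else rank_of[m] for m in row] for row in m_grid]

# Toy 2 x 3 array (hypothetical values) with one missing spot.
grid = [[0.2, -1.5, None],
        [0.9, 0.0, -0.3]]
ranks = rank_m_values(grid)
```

Coloring by rank rather than by the M value itself spreads the color scale evenly over the spots, so a few extreme ratios cannot wash out a spatial pattern.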
In the spatial plot of raw A values (Fig. 5), the color indicates the signal intensity; the darker the blue, the stronger the signal. Missing spots are shown in white. Like the spatial plot of raw M values (Fig. 3), this plot gives an idea of uneven hybridization and missing spots, though less dramatically than the spatial plot of raw M values.
| Fig.5 Spatial plot of raw A values |
Histograms of the signal-to-noise log-ratio (SNR) for the Cy5 and Cy3 channels are shown in Fig. 6 and 7, respectively; the mean and the variance of the signal are printed on top of each histogram. Density curves of the SNR, stratified by control type (status), are overlaid, with the color scheme listed in Table 1. The SNR is a good indicator for diagnosing dye problems. The density curves of the negative and empty controls should be very close, almost superimposed.
| Table 1 | |
| Spot type | Color |
| Positive control | Red |
| Empty control | Blue |
| Negative control | Navy Blue |
| Probes | Green |
| Missing spots | White |
| Fig. 6 & 7 |
The dot plot of the controls' normalized M values is shown in Fig. 8. Controls with more than 3 replicates are listed on the Y-axis, and the color scheme is the one in Table 1. The M values of the controls should be tightly clustered and close to zero.
The dot plot of the controls' A values (Fig. 9) is organized like Fig. 8 but shows A values without background subtraction. The positive controls should lie in the high-intensity region, and the negative and empty controls in the low-intensity region. The ranges of the positive controls and of the negative/empty controls should be well separated.
| Fig. 8 & 9 |
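The separation criterion for the control A values can be written as a one-line check, sketched below. The function name `ranges_separated` and all intensity values are hypothetical; real data would come from the control spots of the array.

```python
def ranges_separated(positive_a, background_a):
    """True if every positive-control A exceeds every negative/empty-control A."""
    return min(positive_a) > max(background_a)

# Hypothetical A values for a well-behaved array.
positive = [13.1, 12.4, 14.0]
neg_empty = [7.2, 6.8, 8.1]
separated = ranges_separated(positive, neg_empty)

# Hypothetical A values where the two ranges overlap.
overlapping = ranges_separated([9.0, 12.0], [7.0, 9.5])
```

A strict min/max comparison is the harshest form of the test; on real arrays one might compare quantiles instead, so that a single outlying control does not decide the outcome.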





