In cluster analysis of microarray data, if Xi is the log odds value for gene X at time i, then for two genes X and Y and N observations, a similarity score S(X,Y) is calculated. S(X,Y) is also known as the Pearson correlation coefficient. Xoffset and Yoffset can be the mean of the observations on X or Y, respectively, in which case the normalizing term Φ in the denominator is the standard deviation; alternatively, Xoffset and Yoffset can be set to zero when a reference state is used. Which of the following best represents S(X,Y)?
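(a) \(S(X,Y) = \frac{1}{N-2} \sum_{i=1}^{N} \frac{(X_i - X_{\text{offset}})(Y_i + Y_{\text{offset}})}{\Phi_X \Phi_Y}\)
(b) \(S(X,Y) = \frac{1}{N} \sum_{i=1}^{N} \frac{(X_i - X_{\text{offset}})(Y_i - Y_{\text{offset}})}{\Phi_X \Phi_Y}\)
(c) \(S(X,Y) = \frac{1}{N-1} \sum_{i=1}^{N} \frac{(X_i + X_{\text{offset}})(Y_i + Y_{\text{offset}})}{\Phi_X \Phi_Y}\)
(d) \(S(X,Y) = \frac{1}{N} \sum_{i=1}^{N+2} \frac{(X_i + X_{\text{offset}})(Y_i - Y_{\text{offset}})}{\Phi_X \Phi_Y}\)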
The question was asked in an internship interview.
This question comes from the topic Global Gene Regulation, in the Genome Analysis section of Bioinformatics.
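To make the quantities in the question concrete, here is a minimal Python sketch of a similarity score of this kind. It assumes the standard Pearson form: each offset is subtracted from its observation, the products are averaged over N, and Φ is the root-mean-square deviation from the offset (the population standard deviation when the offset is the mean). The function name `similarity` and its parameters are illustrative and do not come from the question.

```python
import math


def similarity(x, y, x_offset=None, y_offset=None):
    """Similarity score between two expression profiles x and y.

    If an offset is None, the mean of that profile is used, which
    gives the Pearson correlation coefficient. Passing 0 as the
    offset corresponds to measuring against a reference state.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)

    # Default offsets: the mean of each profile (assumption stated above).
    if x_offset is None:
        x_offset = sum(x) / n
    if y_offset is None:
        y_offset = sum(y) / n

    # Phi: root-mean-square deviation of each profile from its offset.
    phi_x = math.sqrt(sum((xi - x_offset) ** 2 for xi in x) / n)
    phi_y = math.sqrt(sum((yi - y_offset) ** 2 for yi in y) / n)
    if phi_x == 0 or phi_y == 0:
        raise ValueError("score is undefined when a profile equals its offset")

    # Average of the normalized products over the N observations.
    total = sum((xi - x_offset) * (yi - y_offset) for xi, yi in zip(x, y))
    return total / (n * phi_x * phi_y)


if __name__ == "__main__":
    gene_x = [0.5, 1.0, 1.5, 2.0]
    gene_y = [1.0, 2.0, 3.0, 4.0]
    print(similarity(gene_x, gene_y))        # mean offsets (Pearson)
    print(similarity(gene_x, gene_y, 0, 0))  # zero offsets (reference state)
```

With mean offsets, two profiles that rise and fall together score close to 1 and profiles that move in opposite directions score close to -1, which is the behavior clustering relies on when grouping co-expressed genes.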