A MATHEMATICAL MODEL FOR AFFYMETRIX GENECHIP PROBE LEVEL DATA
The Affymetrix GeneChip (PM, MM) probe pair is designed to measure non-specific binding. The design rationale is that the PM probe should give a larger value than the MM probe, yet actual data contain many exceptions. We provide a mathematical explanation for this inconsistency based on the functional state of a gene, ON or OFF, where PM and MM values are assumed to follow the same distribution when a gene is in the OFF state. Under this assumption, the probability that MM > PM equals the probability that MM < PM for OFF genes. The probability that a gene is ON or OFF can therefore be estimated from the number of probe pairs per probe set in which MM > PM, which helps discriminate specific binding from non-specific binding. Our assumption about GeneChip probe level data is validated by inter-platform comparisons using targets common to three different platforms.
Affymetrix GeneChip, binomial distribution, mathematical modeling, oligonucleotide microarray, ON/OFF hypothesis for gene expression.
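As a minimal illustration of the counting argument in the abstract: if PM and MM share one distribution for an OFF gene, the number of probe pairs with MM > PM in a probe set of n pairs would follow a Binomial(n, 1/2) distribution, provided the pairs are treated as independent and ties are ignored. Both of those conditions are assumptions made here for the sketch. The function names and the use of a lower-tail probability are likewise hypothetical and are not claimed to be the estimator used in the paper.

```python
from math import comb


def count_mm_greater(pm, mm):
    """Count the probe pairs in one probe set where MM > PM (ties not counted)."""
    if len(pm) != len(mm):
        raise ValueError("pm and mm must have the same length")
    return sum(1 for p, m in zip(pm, mm) if m > p)


def off_tail_probability(k, n):
    """Return P(K <= k) for K ~ Binomial(n, 1/2).

    Under the OFF hypothesis (assumed: independent probe pairs, no ties),
    a small value means that observing so few MM > PM pairs would be
    unlikely for an OFF gene.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


if __name__ == "__main__":
    # Hypothetical intensities for a probe set of 11 pairs (made-up numbers).
    pm = [820, 640, 910, 300, 770, 560, 480, 990, 610, 720, 530]
    mm = [210, 190, 400, 350, 260, 180, 150, 310, 200, 240, 170]
    k = count_mm_greater(pm, mm)
    print(k, off_tail_probability(k, len(pm)))
```

A small tail probability argues against the OFF state for that probe set. Turning it into a probability of being ON would additionally require a model of the ON state, which is not specified in this section.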