Hierarchical Cluster Analysis (HCA) and Principal Component Analysis (PCA) are powerful data-exploration tools extracted from ArrayTrack™, a microarray database, data analysis, and interpretation tool developed by the FDA's National Center for Toxicological Research (NCTR). ArrayTrack™ is MIAME (Minimum Information About A Microarray Experiment)-supportive: it stores both microarray data and the experiment parameters associated with a pharmacogenomics or toxicogenomics study. The primary emphasis of ArrayTrack™ is the direct linking of analysis results with functional information to facilitate the interaction between the choice of analysis methods and the biological relevance of analysis results.
Hierarchical Cluster Analysis (HCA) – two-way
HCA lets you investigate how genes (or any data elements) group by the similarity of their expression profiles across samples, and how samples group by the similarity of their gene expression patterns. The primary purpose of two-way HCA is to present the data so that genes (or any data elements) with similar expression levels across the samples are clustered along one axis, while samples with similar gene expression patterns are grouped together along the other axis. Because genes in the same cluster are likely to share similar functions, this analysis can reveal relationships between molecular functions (genes) and phenotypes (samples).
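The two-way idea can be sketched in a few lines of Python. This is only an illustration and not ArrayTrack's implementation: the Euclidean distance, the average-linkage rule, the function names (`cluster_order`, `two_way_hca`), and the small expression matrix are all assumptions chosen for the example. The sketch clusters the rows (genes) and the columns (samples) independently, then reorders the matrix so that similar genes and similar samples sit next to each other, as they would along the two axes of a heat map.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_order(vectors):
    """Agglomerative clustering with average linkage (assumed linkage rule).

    Returns the original indices in dendrogram leaf order, so that
    profiles merged early end up adjacent to each other.
    """
    # Each cluster is a list of original indices, kept in leaf order.
    clusters = [[i] for i in range(len(vectors))]

    def average_distance(c1, c2):
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]


def two_way_hca(matrix):
    """Cluster rows (genes) and columns (samples) and reorder the matrix."""
    row_order = cluster_order(matrix)
    columns = [list(col) for col in zip(*matrix)]
    col_order = cluster_order(columns)
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered


# Hypothetical data: 4 genes (rows) measured in 4 samples (columns).
expression = [
    [9.0, 1.0, 9.1, 1.2],
    [1.0, 8.0, 1.1, 8.2],
    [9.2, 1.1, 9.0, 1.0],
    [1.2, 8.1, 1.0, 8.0],
]
row_order, col_order, reordered = two_way_hca(expression)
print("gene order:", row_order)
print("sample order:", col_order)
for row in reordered:
    print(row)
```

In the reordered matrix, genes with similar profiles occupy neighbouring rows and samples with similar patterns occupy neighbouring columns. The exhaustive pair search is adequate for a small illustration but would be too slow for a full microarray data set.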
Principal Component Analysis (PCA)
PCA generates linear combinations of the genes (or any data elements), called principal components, using a mathematical transformation. The algorithm ensures that the first principal component explains the maximal amount of variance in the data. The second principal component explains the maximal remaining variance subject to being orthogonal to the first, and so on, such that all principal components taken together explain all the variance of the original data. A plot of the first three principal components, which usually explain the majority of the variance in the data, is a powerful data-exploration tool. The PCA standalone tool offers both 2D and 3D views of the PCA results, along with the loading tables.
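The transformation described above can be illustrated with a short Python sketch using NumPy. It is not the code behind the ArrayTrack tool: the function name `pca`, the samples-by-genes layout, and the example values are assumptions made for illustration. The sketch centers each gene, takes a singular value decomposition, and returns the sample scores (the coordinates one would plot in 2D or 3D), the loadings (the weight of each gene in each component), and the fraction of variance each component explains.

```python
import numpy as np


def pca(data, n_components=3):
    """PCA via singular value decomposition of the mean-centered data.

    data: array-like with samples in rows and genes in columns (assumed layout).
    Returns (scores, loadings, explained_ratio).
    """
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)  # remove each gene's mean

    # Rows of vt are orthonormal directions, sorted by decreasing variance.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)

    variance = s ** 2 / (x.shape[0] - 1)        # variance along each component
    explained_ratio = variance / variance.sum()  # fractions sum to 1

    k = min(n_components, len(s))
    scores = u[:, :k] * s[:k]  # coordinates of each sample on the components
    loadings = vt[:k].T        # contribution of each gene to each component
    return scores, loadings, explained_ratio


# Hypothetical data: 4 samples (rows) by 3 genes (columns).
samples = [
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 1.2],
    [4.0, 7.9, 0.9],
    [5.0, 10.1, 1.1],
]
scores, loadings, explained_ratio = pca(samples, n_components=2)
print("explained variance ratio:", explained_ratio)
print("scores:", scores)
print("loadings:", loadings)
```

Because the components come out of the decomposition in order of decreasing variance and are mutually orthogonal, the explained-variance fractions are non-increasing and sum to one, which matches the properties stated above.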
- Windows operating system with Java 6 or higher installed.
- The two batch script files, ArrayTrack_HCA.bat and ArrayTrack_PCA.bat, and the folder "library" must be placed in the same directory.
Please contact NCTRBioinformaticsSupport@fda.hhs.gov if you need assistance with accessibility of any documents contained inside the zip file.
(Please cite the articles below if you find the tools useful.)
- ArrayTrack -- Supporting Toxicogenomic Research at the U.S. Food and Drug Administration National Center for Toxicological Research.
Tong W., Cao X., Harris S., Sun H., et al.
Environ Health Perspect. 2003, 111 (15): 1819-1826. 10.1289/ehp.6497.
- ArrayTrack: An FDA and Public Genomic Tool.
Fang H., Harris S.C., Su Z., Chen M., et al.
Methods Mol Biol. 2009, 563 (3): 379-398.
- ArrayTrack: a Free FDA Bioinformatics Tool to Support Emerging Biomedical Research--An Update.
Xu J., Kelly R., Fang H., and Tong W.
Human Genomics. 2010 Aug 1;4(6):428.
Questions, suggestions, or assistance: NCTRBioinformaticsSupport@fda.hhs.gov
NCTR Bioinformatics Support
Food and Drug Administration
3900 NCTR Rd
Jefferson, AR 72079