last modified: 03-FEB-1999

ESTS0763 DENDRO.

DENDRO, Cluster Analysis of Experimental Data with Tree Plot

1. NAME OR DESIGNATION OF PROGRAM:  DENDRO.
2. COMPUTERS
Program name   Package id    Status    Status date
DENDRO         ESTS0763/01   Arrived   03-FEB-1999

Machines used:

Package ID Orig. computer Test computer
ESTS0763/01 IBM 360 series
3. DESCRIPTION OF PROBLEM OR FUNCTION

DENDRO performs hierarchical cluster analysis of experimental data with options for data normalization, metrics, and clustering criteria. A dendrogram, or tree plot, can be generated on a line printer or plotter illustrating which clusters are combined at what distance.
4. METHOD OF SOLUTION

Any normalization is done first. Three types of normalization are available: one based entirely on the values of the input data, another based on scaled values resulting from user-defined weighting, and a third based on class stratification. Next, distances between objects are determined according to the metric chosen. Ten choices of metric are provided, including Euclidean, Pearson and Spearman correlation, and the Tukey-Fisher distance metric. Then agglomerative hierarchical cluster analysis is performed according to the clustering criterion selected, which may be single linkage, complete linkage, group or weighted average, centroid, median, Ward's method, or variance.
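
The three stages described above (normalize, compute distances, merge clusters) can be illustrated with a minimal Python sketch. It is not the DENDRO FORTRAN source. The names autoscale, euclidean and single_linkage are hypothetical, and the choice of zero-mean, unit-variance scaling for the normalization step is an assumption, since the abstract does not give the formulas. Only one metric (Euclidean) and one criterion (single linkage) are shown.

```python
import math

def autoscale(rows):
    """Scale each column to zero mean and unit standard deviation.

    This is an assumed normalization; the abstract does not give DENDRO's formula.
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Sample standard deviation; fall back to 1.0 for a constant column.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two objects."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(rows):
    """Return the merge history as (members_a, members_b, distance) tuples."""
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(euclidean(rows[p], rows[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical sample data: four objects, two measured variables each.
data = [[1.0, 2.0], [1.1, 2.1], [5.0, 8.0], [5.2, 7.9]]
for a, b, d in single_linkage(autoscale(data)):
    print(a, b, round(d, 3))
```

Each tuple in the merge history records which two clusters were combined and at what distance, which is the information a dendrogram displays. This sketch recomputes distances on every pass for brevity rather than storing them.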
5. RESTRICTIONS ON THE COMPLEXITY OF THE PROBLEM

A pairwise distance matrix is stored. This matrix requires N(N-1)/2 storage locations, where N is the number of objects.
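
The N(N-1)/2 figure corresponds to storing only one triangle of the symmetric distance matrix, without the zero diagonal. The Python sketch below shows one common way to map a pair of objects to a position in such a packed array. The row-major ordering and the names condensed_size and condensed_index are assumptions for illustration; the abstract does not say how DENDRO orders the matrix.

```python
def condensed_size(n):
    """Number of storage locations for n objects: n(n-1)/2."""
    return n * (n - 1) // 2

def condensed_index(i, j, n):
    """Position of the pair (i, j) in a packed upper triangle (0-based, row-major).

    The ordering is an assumed layout, not necessarily DENDRO's.
    """
    if i == j:
        raise ValueError("the diagonal is not stored")
    if i > j:
        i, j = j, i  # the matrix is symmetric, so order does not matter
    return i * n - i * (i + 1) // 2 + (j - i - 1)

n = 4
for i in range(n):
    for j in range(i + 1, n):
        print((i, j), condensed_index(i, j, n))

print(condensed_size(200))
```

For 200 objects this gives 19900 locations, compared with 40000 for the full square matrix.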
6. TYPICAL RUNNING TIME

Dendrograms of reasonable size can be produced in less than 3 minutes. NESC compilation of the DENDRO source and execution of the sample problem required 10 CPU seconds on an IBM 360/195.
7. UNUSUAL FEATURES OF THE PROGRAM

The program is designed to provide the user flexibility in selecting normalization options and clustering criteria.
8. RELATED AND AUXILIARY PROGRAMS

Related programs include RECOG-ORNL, SPEKEN (NESC Abstract 781), and PCTEST (NESC Abstract 769).
9. STATUS
Package ID    Status date   Status
ESTS0763/01   03-FEB-1999   Masterfiled Arrived
10. REFERENCES

C.L. Begovich and N.M. Larson,
A User's Manual for the Pattern Recognition Code RECOG-ORNL,
ORNL/CDC/TM-21, May 1977.
11. MACHINE REQUIREMENTS

Storage required depends on problem size: small problems with 100 to 200 objects can be executed in 260K bytes of memory.
12. PROGRAMMING LANGUAGE(S) USED
Package ID    Computer language
ESTS0763/01   FORTRAN
13. OPERATING SYSTEM UNDER WHICH PROGRAM IS EXECUTED:  OS/360,370.
14. OTHER PROGRAMMING OR OPERATING INFORMATION OR RESTRICTIONS

Plots are produced using the proprietary Integrated Software Systems Corporation DISSPLA software. The proprietary International Mathematical and Statistical Libraries (IMSL) routine MDNRIS, for the inverse standard normal (Gaussian) probability distribution function, is also used.
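
For readers without access to IMSL, the quantity named above, the inverse of the standard normal distribution function, can be illustrated with Python's standard library. This sketch only demonstrates the mathematical function. It is not the IMSL algorithm and says nothing about how DENDRO calls MDNRIS; the wrapper name inverse_standard_normal is hypothetical.

```python
from statistics import NormalDist

def inverse_standard_normal(p):
    """Return z such that the standard normal probability of a value <= z is p (0 < p < 1)."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(p)

# The median of the standard normal is 0; the 97.5th percentile is about 1.96.
print(inverse_standard_normal(0.5))
print(round(inverse_standard_normal(0.975), 4))
```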
15. NAME AND ESTABLISHMENT OF AUTHOR

             N.M. Larson and C.L. Begovich*
             Oak Ridge National Laboratory
             P.O. Box X
             Oak Ridge, Tennessee 37830
* Contact
16. MATERIAL AVAILABLE
ESTS0763/01
source program   mag tape   DENDRO.FILE1 Source code                   SRCTP
source program   mag tape   DENDRO.FILE2 Source code                   SRCTP
source program   mag tape   DENDRO.FILE3 Source code                   SRCTP
source program   mag tape   DENDRO.FILE4 Source code                   SRCTP
17. CATEGORIES
  • P. General Mathematical and Computing System Routines

Keywords: data analysis, experimental data, metrics, pattern recognition, statistics.