NAG Logo
Numerical Algorithms Group
graphic

A Further Example of the NAG Statistical Add-Ins for Excel - Cluster Analysis

The NAG Statistical Add-Ins give you a range of statistical functions to help you make the most out of your data within Excel.  The example below illustrates how to perform a hierarchical cluster on some multivariate data. The data used refers to various details of cars for sale in 1997 where the individual variables refer to:

Variable Name Description
car! Name of car
price Price in £ sterling
power Engine horse power
ncylind! Number of cylinders in car engine
max_mph Maximum speed in
miles per hour
zero_60 Time in seconds to accelerate from 0 to 60 mph.
Ins_gp! Insurance grouping of vehicle
ave_mpg Average number of miles per gallon of fuel

At any time you can view the screenshots at full size by clicking on the image.

The first thing to do is to place the cursor in an empty cell where we wish the output of the function to be displayed. For this example we created a new worksheet for the results and placed the cursor in cell B1. Next we select the required NAG function. As we wish to produce a cluster analysis of our data, after clicking on the function wizard button (highlighted in red), we either select the All or Statistical function category and then scroll down to select the CLUSTER function from the Function name: list.


Figure 1: Selecting the function (click image to view)

Once we have selected the required function, the specific function wizard will appear. The CLUSTER function wizard has a range of fieldboxes that require information to be supplied before the analysis can be performed. Fieldboxes whose names appear in bold text, e.g. Method, indicate that these parameters are compulsory as opposed to optional i.e., they must be supplied before the analysis can be carried out. For this example, only the Method parameter is compulsory.

If, for any NAG function, you are unsure as to the choices available for each parameter, online help is available. To obtain this help, click on the question mark at the bottom left hand corner of the CLUSTER function wizard (highlighted in red), and ask for help on the specific function. The NAG help will then be displayed.


Figure 2: Asking for help (click image to view)

We must now tell Excel which data we wish to be analysed. From above however, we noted that only the Method parameter was compulsory. This is because the data for the analysis can be supplied as either a distance matrix or as a matrix of raw data. For this example we have opted for the complete link method with squared Euclidean distances to be applied to the matrix of raw data, contained in cells B2:H34 of our data worksheet. To supply the names of our data objects (car names), we have to scroll down the list of available parameters (Figure 3b, scrollbar highlighted in red)


Figure 3a: Selecting the data (click image to view)

Figure 3a: Naming the objects (click image to view)

Returning to our results worksheet, the <List of 237> message will appear in cell B1, indicating that the calculations have been successfully completed. Now all we need to do is to display the results themselves.

To obtain the results of an Excel function, follow the following steps:

  1. Place the cursor in the original output cell.
  2. Holding down the left mouse button, drag the cursor down until the number of cells stated in the <list of ...> message have been highlighted. In this case we have to highlight 237 cells in order to display all the function output.
  3. Place the cursor in the Formula Bar and, whilst holding down the Ctrl and Shift keys, press the Enter key.

Figure 4: Displaying the results (click image to view)

The function output will now be displayed in cells B1 to B237 of our output worksheet.


:
:

Figure 5: Output from the cluster analysis (click image to view)

As the cluster output stands, all of the information appears in a single column that may be difficult to read. To help you reorder the function output, the NAG Add-Ins come complete with a utility called NAGExtract which will take the output from a NAG function and rearrange it for you in a legible format, in the destination of your choosing.

To do this for our cluster analysis example, we first of all select NAGExtract from the NAG menu.


Figure 6: The NAGExtract utility (click image to view)

As with a NAG function, we have to provide the location of the data we wish to work on. For our example, the cluster output is listed in cells B1 to B237. We first of all select where we wish the reordered output to be displayed, in this case, cell C1 and then place the cursor in the Location of raw NAG results fieldbox before highlighting cells B1 to B237. Once we have selected the raw output, click on the Search button (highlighted in red) to instruct NAGExtract to display details of the available output.


Figure 7: Using the NAGExtract utility (click image to view)

For our example, NAGExtract informs us that the NAG output that can be rearranged includes details of the cluster steps which will range over 4 variables, each of length 33, and the information required to construct a dendrogram which ranges over 3 variables each of length 34. For our example, we simply wish to look at the clustering information so we select this option and click on the Apply button (highlighted in red).


Figure 8: Available NAG output (click image to view)

The ordered output, consisting of the allocation of the cars to the clusters at each step and the associated distances, are displayed in cells C1 to F33. We would repeat this process for any additional sections of NAG output that we would like to reorder.


Figure 9: Reordered cluster output (click image to view)

All of the NAG Statistical Add-Ins for Excel operate in the same manner; select the function, access the function help if necessary, provide appropriate parameters for the function wizard fieldboxes, highlight the required number of output cells, press Ctrl+Shift+Enter to display the output and then use the NAGExtract utility to reorder the output of interest. By following these steps, many simple and complex statistical analyses alike can readily be performed, helping you make the most of your data in an environment statistical newsletter

© The Numerical Algorithms Group 2008
Privacy Policy | Trademarks

© Numerical Algorithms Group

Visit NAG on the web at:

www.nag.co.uk (Europe and ROW)
www.nag.com (North America)
www.nag-j.co.jp (Japan)

http://www.nag.co.uk/stats/ae/examples/ae_example_cluster.asp