Thomas C. Skalak

National Research Council: Data-Based Assessment of Research Doctorate Programs

Sample Rating Calculation

[updated August 2010]

Please note: rather than combining the coefficients derived from the direct- and regression-based weights (as described below), the NRC will now report the direct-based (survey or S-based) ranking and the regression-based (R-based) ranking separately. In addition, the NRC will release rankings for three "dimensional" measures: (1) research activity, (2) student support and outcomes, and (3) diversity of the academic environment. Moreover, the ranges of rankings will be reported at a 90 percent confidence interval rather than the interquartile range (i.e., a 50 percent confidence interval). The NRC will release a revised methodology guide when the rankings are released on Sept. 28, 2010.

[original text from June 2009]

The NRC has provided a sample rating calculation to give institutions some sense of what they will see upon release of the assessment. 

The NRC identified 20 key variables that, in its judgment, are indicative of the quality of a Ph.D. program. Values for these variables were obtained from the questionnaires and existing data sources. Two exercises were conducted to calculate the relative importance (i.e., weight) of each variable. First, participating faculty were asked to select the variables that contribute most to program quality. Second, a sample of faculty was asked to rate a sample of programs in their field. The NRC then used statistical methods (i.e., a regression model) to relate these reputational ratings to the 20 key variables. The results of the two exercises were summed to calculate a weight for each of the 20 key variables that is specific to each discipline. The 20 variables are:

1a. Publications per faculty member (non-humanities)
1b. Published books and articles per faculty member (humanities)
2. Average citations per publication (non-humanities)
3. Percent of faculty with grants
4. Percent interdisciplinary
5. Percent of non-Asian minority faculty
6. Percent of female faculty
7. Honors & awards per faculty member
8. Average GRE score (humanities – verbal; non-humanities – quantitative)
9. Percent of students with full support in first year of study
10. Percent of first-year students with external funding
11. Percent of non-Asian minority students
12. Percent of female students
13. Percent of international students
14. Average annual PhDs graduated
15. Average percent of a cohort completing in eight years (humanities), six years (non-humanities)
16. Median time-to-degree for full- and part-time students
17. Percent of PhDs with definite plans for an academic position
18. Individual workspace for students
19. Provision of health insurance
20. Provision of student support services


The example below, consisting of three charts, is for an anonymous program in economics. 

[The following text is excerpted from chapter 5 of A Guide to the Methodology of the National Research Council Assessment of Research Doctorate Programs.  Individuals well versed in statistics may also wish to review “Appendix A: A Technical Discussion of the Process of Rating and Ranking Programs in a Field.”]

Shortly before the assessment is released, each institutional coordinator will receive three tables for each program that was ranked. These will reflect the following: (1) the values for each of the 20 variables, either as submitted by the program or as calculated from its data, with their corresponding standardized values; (2) a pair of combined coefficients (plus and minus one standard deviation from the average value) used in weighting the variables (see Table 5-1 below); and (3) the standardized program values and the actual combined coefficients that were used to calculate the rating corresponding to each endpoint of the inter-quartile range of rankings for that program, as well as the program ranking corresponding to those ratings (see Tables 5-2a and 5-2b below). Examples of these tables for an economics program are presented and discussed below.

Table 5-1 shows the values submitted by an unidentified program in economics and the range of combined coefficients for the entire field. Columns 1 and 2 name and label the variables. Column 3 gives the program value for each of the 20 variables used in the overall rating (see Appendix E for a description of these variables). Column 4 presents the standardized value of each variable in column 3; scores are standardized across all programs in the field to a mean of 0 and a variance of 1. Thus, the relative strengths and weaknesses of a program (in terms of these 20 variables) can be seen by comparing the standardized values in column 4.
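The standardization in column 4 can be illustrated with a minimal Python sketch. The function name and the sample values are hypothetical, and the use of the population standard deviation is an assumption (the text states only that the result has a mean of 0 and a variance of 1).

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (value - field mean) / field standard deviation."""
    mu = mean(values)
    # Assumption: population SD, so the z-scores have variance exactly 1.
    sigma = pstdev(values)
    if sigma == 0:
        # Every program reported the same value; no program stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one variable across five programs in a field.
raw_values = [0.4, 0.9, 1.1, 1.6, 2.0]
z_scores = standardize(raw_values)
```

A positive standardized value means the program is above the field average on that variable, and a negative value means it is below.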

Columns 5 and 6 give the pairs of combined coefficients (weights) assigned to each variable used in rating all economics programs.[1] Each coefficient is a combination of both the direct and regression-based weights, the derivation of which is described in detail in Appendix A. In economics, variables V1, V2, and V14 (publications per allocated faculty, cites per publication, and average number of Ph.D.’s) were assigned the largest weights. Although it would be relatively easy to calculate a single rating for the program using the data in Table 5-1, the result could be misleading, because it would not reflect the variability (i.e., uncertainty) in each of the program measures or the variability in the estimation of the weights.

The process for taking into account these sources of variability is described in detail in Appendix A.
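To make the "single rating" mentioned above concrete, here is a hedged sketch of such a point estimate: a weighted sum of standardized values, using the midpoint of each coefficient pair (the pair is plus and minus one standard deviation from the average, so the midpoint is the average). The function names and the three-variable data are hypothetical, and this is exactly the simplified calculation the guide warns can mislead because it ignores variability.

```python
def midpoint_weights(coefficient_pairs):
    """Collapse each (low, high) coefficient pair to its midpoint."""
    return [(low + high) / 2 for low, high in coefficient_pairs]


def single_rating(standardized_values, weights):
    """Weighted sum of standardized program values (a point estimate only)."""
    if len(standardized_values) != len(weights):
        raise ValueError("need exactly one weight per variable")
    return sum(w * z for w, z in zip(weights, standardized_values))


# Hypothetical three-variable program; the real tables have 20 variables.
z_values = [1.0, -0.5, 2.0]
pairs = [(0.08, 0.12), (0.04, 0.06), (0.0, 0.0)]  # third weight set to 0
point_rating = single_rating(z_values, midpoint_weights(pairs))
```

Note that the third variable contributes nothing here, however strong the program is on it, because its weight is 0.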

Table 5-1: Data and Coefficient Table for a Program in Economics

Tables 5-2a and 5-2b show the calculations of the first and third quartile rankings, respectively, for a particular program.[2] First, a randomly sampled set of regression coefficients and direct weights is used to obtain a set of 20 combined weights (column 5). These weights are multiplied by a sampled set of standardized program values (column 4) to generate a program rating (sum of column 6). This process is repeated another 499 times, generating 500 ratings for each of the 117 economics programs. Each of these 500 ratings for the program is ranked by comparing it with the ratings for the other 116 economics programs, based on the same selection of weights. The 500 rankings for the program are then ordered from best to worst, with the 125th being the Quartile 3 ranking (45) and the 375th being the Quartile 1 ranking (56). These values determine the inter-quartile range of rankings for the program. Half of the 500 randomly generated rankings for the program fall within this range. The ratings that produced these first and third quartile rankings are -0.054 and 0.085, as shown in Tables 5-2a and 5-2b.[3]
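The resampling procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the NRC's implementation: Gaussian noise stands in for the guide's actual sampling of weights and program values (see Appendix A), and every name and number in the example is hypothetical.

```python
import random


def simulate_rankings(program_values, value_sd, weight_means, weight_sds,
                      n_draws=500, seed=0):
    """Return {program: rankings sorted best (1) to worst} over n_draws."""
    rng = random.Random(seed)
    names = list(program_values)
    rankings = {name: [] for name in names}
    for _ in range(n_draws):
        # One sampled set of combined weights, shared by all programs.
        weights = [rng.gauss(m, s) for m, s in zip(weight_means, weight_sds)]
        ratings = {}
        for name in names:
            # One sampled set of standardized values for this program.
            sampled = [rng.gauss(v, value_sd) for v in program_values[name]]
            ratings[name] = sum(w * v for w, v in zip(weights, sampled))
        # Rank on this draw: the highest rating receives rank 1.
        ordered = sorted(names, key=lambda n: ratings[n], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            rankings[name].append(rank)
    return {name: sorted(ranks) for name, ranks in rankings.items()}


def interquartile_rankings(sorted_rankings):
    """Return the rankings a quarter and three quarters of the way down the
    best-to-worst list (the 125th and 375th of 500)."""
    n = len(sorted_rankings)
    return sorted_rankings[n // 4 - 1], sorted_rankings[3 * n // 4 - 1]


# Hypothetical field: four programs, three variables (the real case: 117 and 20).
programs = {
    "A": [3.0, 3.0, 3.0],
    "B": [1.0, 1.0, 1.0],
    "C": [0.9, 1.1, 1.0],
    "D": [-3.0, -3.0, -3.0],
}
results = simulate_rankings(programs, value_sd=0.1,
                            weight_means=[0.1, 0.1, 0.1],
                            weight_sds=[0.01, 0.01, 0.01])
iq_ranges = {name: interquartile_rankings(r) for name, r in results.items()}
```

In this toy field, programs B and C have nearly identical values, so their ranges of rankings overlap, which is the situation in which a single point-estimate rank would overstate the difference between two programs.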

Table 5-2a: Sample First Quartile Ranking Calculation

Table 5-2b: Sample Third Quartile Ranking Calculation

In interpreting the range of rankings a program received, the first thing to note is which variables have the highest coefficients. These variables can be determined by examining the combined coefficients and identifying the largest ones. In the case of economics, the important variables are citations per publication, publications per allocated faculty, average Ph.D.’s in 2002-2006, and average GRE-Q, each of which has a combined coefficient value of 0.089 or greater. The rest of the variables are less heavily weighted, and a number of the variables don’t enter into the determination of the overall rating at all because their coefficients were not statistically different from 0.[4]
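Reading off the most heavily weighted variables can be sketched as below. The variable names and coefficient values are placeholders, not figures from Table 5-1, and comparing by absolute value is an assumption made so that a large negative coefficient would also count as influential.

```python
def top_variables(coefficients, threshold):
    """Return (name, coefficient) pairs with |coefficient| >= threshold,
    ordered from largest to smallest magnitude."""
    kept = [(name, c) for name, c in coefficients.items()
            if abs(c) >= threshold]
    return sorted(kept, key=lambda pair: abs(pair[1]), reverse=True)


def inactive_variables(coefficients):
    """Return the variables whose coefficient was set to 0."""
    return [name for name, c in coefficients.items() if c == 0]


# Placeholder coefficients for illustration only.
example_coefficients = {
    "var_a": 0.120,
    "var_b": 0.095,
    "var_c": 0.030,
    "var_d": -0.100,
    "var_e": 0.0,
}
heaviest = top_variables(example_coefficients, threshold=0.089)
unused = inactive_variables(example_coefficients)
```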

1. Five hundred regressions are run using half of the raters each time and 500 draws are made from randomly selected halves of the pool of direct ratings in order to construct the combined coefficients. The values presented show the range encompassed by plus or minus one standard deviation for each coefficient. See Appendix A for details. The first quartile ranking is the highest value of the lowest quarter of rankings. The third quartile ranking is the highest value of the third quarter of rankings.

2. Use of the inter-quartile range means that we “throw away” half of the possible rankings for the program. The tails of the distribution can be very long, however, and the inter-quartile range is useful in making meaningful comparisons, while illustrating the point that any point estimate of a ranking is inexact.

3. We do not show the 117 x 500 matrix of all the ordered ratings for all the economics programs, although it will be available when the final report is released. However, the ranking is obtained from that table.

4. The procedure for setting nonsignificant coefficients to 0 is discussed in Appendix A.