The Hamilton Rating Scale for Depression (Ham-D)1 is a commonly used instrument in depression assessment and research. In clinical practice, this instrument is of great value for assessing the severity of illness in patients suffering from depression without somatic comorbidity.
Recently, we evaluated the validity of the Ham-D in patients suffering from Parkinson's disease (PD).2 The validity of the Ham-D has not yet been established in patients with depressive symptoms and stroke, although depression rating scales are frequently used in all kinds of research into the prevalence and course of poststroke depression. There are studies in which the validity of the Ham-D has been established in patients with Alzheimer's disease (AD).3 Other investigators have used the Ham-D to assess the severity of depressive symptoms in stroke and AD, irrespective of the presence of major depression. It has also been used as a diagnostic scale to assess the presence of major depression in stroke and AD patients, and to dichotomize study populations into depressed and nondepressed patients, sometimes for the purpose of excluding patients from clinical trials.4 Such use of the Ham-D is justified only if there is high concurrent validity with the criteria of the DSM-IV for depressive disorders.5 In addition, the Ham-D score should have high positive and negative predictive value for major depression in these disorders.
In our recent study of patients with PD, the optimal cutoff score for diagnostic purposes was 16/17, meaning that a score of 16 or less indicates the absence of depression and a score of 17 or higher is indicative of the presence of major depression; for screening purposes, the cutoff was 11/12.2 The mean score of nondepressed PD patients was 8.4, which indicates that there might be a disease-specific threshold in this group of patients. In patients with AD, the optimal cutoff score was found at 7/8 for diagnostic purposes.3
In order to assess whether disease-specific cutoff scores should be applied for the Ham-D in various neurological disorders, in the present study we compared the concurrent validity of the Ham-D in relation to DSM-IV criteria for major depressive disorder in patients with stroke, AD, and PD. Furthermore, we compared the psychometric performance of the Ham-D as a screening and a diagnostic instrument in these three neurological disorders.
Data on the Ham-D scores from patients of three different study groups were compared. These groups consisted of patients with stroke, Alzheimer's dementia, and Parkinson's disease.
First, Ham-D results of 44 stroke patients who were included in the Dutch Vascular Factors in Dementia Study were analyzed. All were inpatients on a neurological ward. Details of the design and characteristics of the study are outlined elsewhere.6 Patients with transient ischemic attack, cerebral infarction, or intracerebral hemorrhage were included in the study. Patients with impairment of consciousness or severe aphasia that could preclude valid judgment of cognitive or affective disturbances were excluded. When possible, the demented patients and a random sample of nondemented patients were subjected to a psychiatric evaluation. Only those patients for whom results on the Ham-D were available were included in the present analysis. Cognitive function was estimated with the Mini-Mental State Examination (MMSE).7 Cognitive function was further assessed between 3 and 9 months after stroke. The diagnosis of dementia was based on the results of an extensive neuropsychological examination, clinical presentation, and information from a close relative. A diagnostic panel consisting of two neurologists, a neuropsychologist, and a trained physician made a final judgment. For the diagnosis of dementia, the criteria of the DSM-III-R8 were used. Psychiatric examination was also performed between 3 and 9 months after the stroke. All evaluations were performed in the afternoon to prevent results from being influenced by diurnal variations in mood. A DSM-IV diagnosis of major depressive disorder was made by using the Schedule for Affective Disorders and Schizophrenia (SADS).9 This semistructured interview yielded the "gold standard" diagnosis of major depression in this group. In addition, all patients, whether depressed or not, underwent a semistructured interview in which the Ham-D was administered.
The second study group comprised 85 patients with PD. They were consecutive referrals to the movement disorder clinic of Maastricht University Hospital. They all met the criteria for PD as defined by the United Kingdom PD Society Brain Bank.10 Cognitive function was assessed by the MMSE. Physical disability was rated according to the Hoehn and Yahr staging system.11 Every patient underwent a protocolized mental status examination. A DSM-IV diagnosis of major depressive disorder was made in a structured interview with the aid of the Structured Clinical Interview for DSM-III-R (SCID-D).12 This diagnosis was considered the "gold standard" for major depression in this population. All the patients, depressed and nondepressed, underwent a semistructured interview to obtain a score on the Ham-D. The PD patients who met the DSM-IV criteria for dementia were excluded from further analysis.
The third group comprised 274 patients with AD. These were consecutive referrals to the Maastricht Memory Clinic of Maastricht University Hospital. Dementia was diagnosed according to DSM-IV criteria. Alzheimer's disease was diagnosed according to the criteria of the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA).13 The Ham-D was used as a symptom checklist to diagnose major depression according to the DSM-IV criteria. The cognitive status of 239 of the patients was also assessed by using the MMSE.
To determine and compare the sensitivity and specificity of the Ham-D as an instrument for diagnosing major depression in stroke, AD, and PD patients, and to obtain optimal cutoff scores, we plotted receiver operating characteristic (ROC) curves for the Ham-D scores from all three disorders.14 These curves plot sensitivity against 1 minus specificity for every possible cutoff point. The optimal cutoff point was determined visually by assessing which score combined maximum sensitivity with optimal specificity. The area under the curve (AUC) indicates the ability of the scale to distinguish between depressed and nondepressed patients. Optimal cutoff scores were determined for each group of patients, including cutoff scores that could be used for purposes of screening, diagnosis, and dichotomization (i.e., discriminating depressed from nondepressed patients). To determine whether the Ham-D could be used as a predictive test for these three groups of patients, positive predictive values (PPV) and negative predictive values (NPV) were calculated for different cutoff scores in the central range of the scale. The three ROC curves were compared in pairs with the Hosmer-Lemeshow goodness-of-fit test.15 All analyses were performed with the Stata software package, release 5 (StataCorp 1997, College Station, TX, USA).
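The quantities named above can be illustrated with a minimal Python sketch. It is not the Stata analysis used in the study. The scores and diagnoses are hypothetical, and the function names and the `threshold` convention (the lowest score counted as positive, i.e., the number after the slash in a cutoff such as 16/17) are assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical data, not the study data.

def diagnostic_metrics(scores, depressed, threshold):
    """Sensitivity, specificity, PPV, and NPV when score >= threshold is 'positive'."""
    pairs = list(zip(scores, depressed))
    tp = sum(1 for s, d in pairs if s >= threshold and d)
    fp = sum(1 for s, d in pairs if s >= threshold and not d)
    fn = sum(1 for s, d in pairs if s < threshold and d)
    tn = sum(1 for s, d in pairs if s < threshold and not d)

    def ratio(a, b):
        # Undefined ratios (e.g., PPV when nobody tests positive) become NaN.
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

def roc_points(scores, depressed):
    """(1 - specificity, sensitivity) for every threshold, from strictest to laxest."""
    points = []
    for threshold in range(max(scores) + 1, min(scores) - 1, -1):
        m = diagnostic_metrics(scores, depressed, threshold)
        points.append((1 - m["specificity"], m["sensitivity"]))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical Ham-D scores and gold-standard diagnoses (True = major depression).
scores = [2, 4, 5, 7, 8, 9, 12, 14, 17, 20]
depressed = [False, False, False, False, True, False, True, True, True, True]

metrics_at_8 = diagnostic_metrics(scores, depressed, threshold=8)  # cutoff 7/8
area = auc(roc_points(scores, depressed))
print(metrics_at_8, round(area, 2))
```

Unlike sensitivity and specificity, the PPV and NPV computed this way depend on the proportion of depressed patients in the sample.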
The average age in the PD group was lower than that in the other two groups: stroke patients 70.3±16.2 years, AD patients 71.1±8.8 years, and PD patients 67.3±10.2 years (mean±standard deviation). Although this difference reached statistical significance, we do not consider it clinically relevant. The AD group contained a significantly higher proportion of women than the other two groups: stroke patients 36.4%, AD patients 59.5%, and PD patients 40.0% (χ²=15.278, df=2, P<0.001). Mean scores on the MMSE for stroke patients, AD patients, and PD patients were 19.18±6.87 (range 9–30), 18.96±5.80 (range 0–29), and 27.76±1.82 (range 23–30), respectively. Twenty-three of the stroke patients (52%) fulfilled criteria for dementia. The PD patients had the following Hoehn and Yahr classifications: 8 stage I, 56 stage II, 17 stage III, 4 stage IV, and none stage V.
The prevalence of major depressive disorder was highest in the stroke group: 34.1%. In the AD group the prevalence was 22.6%, and in the PD group it was 23.5%. This difference did not reach statistical significance (χ²=2.743, df=2, P=0.254). Mean Ham-D scores, ranges, and standard deviations are listed in Table 1 for the depressed and the nondepressed subgroups in each disorder. Frequency distribution analysis showed that there were no missing scores in the central score range of the Ham-D in the three groups of patients.
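The comparison of prevalences can be checked with a short Python sketch of a Pearson chi-square test on a 2×3 table. The depressed counts (15, 62, and 20) are not stated in the text; they are an assumption, inferred from the reported percentages and group sizes (44, 274, and 85). The function names are illustrative.

```python
import math

def chi_square_2xk(positives, totals):
    """Pearson chi-square (no continuity correction) for a 2 x k table of counts."""
    overall = sum(positives) / sum(totals)
    chi2 = 0.0
    for pos, tot in zip(positives, totals):
        cells = ((pos, tot * overall), (tot - pos, tot * (1 - overall)))
        for observed, expected in cells:
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(totals) - 1

def p_value_df2(chi2):
    """For df = 2 only, the chi-square upper-tail probability is exactly exp(-x/2)."""
    return math.exp(-chi2 / 2)

group_sizes = [44, 274, 85]        # stroke, AD, PD (from the text)
depressed_counts = [15, 62, 20]    # ASSUMED: inferred from 34.1%, 22.6%, 23.5%

chi2, df = chi_square_2xk(depressed_counts, group_sizes)
p = p_value_df2(chi2)
print(round(chi2, 3), df, round(p, 3))
```

Under these assumed counts, the result agrees with the reported χ²=2.743, df=2, and P=0.254.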
Sensitivity, specificity, PPVs, and NPVs for different cutoff scores of the Ham-D are shown in Table 2 for the patients with each disorder. ROC curves for the three groups of patients are shown in Figure 1.
The cutoff score for maximum discrimination (dichotomization) between depressed and nondepressed patients could be determined visually from the ROC curves as the point at which the highest sum of sensitivity and specificity was reached. In the stroke group, this point was reached at a cutoff score of 5/6 (sensitivity 1.00, specificity 0.93); in the AD group, at 9/10 (sensitivity 0.86, specificity 0.84); and in the PD group, at 12/13 (sensitivity 0.80, specificity 0.92). The AUC was large in all three groups of patients, which indicates that the Ham-D score has high concurrent validity with the DSM-IV criteria for major depressive disorder in these diseases.
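The criterion of the highest sum of sensitivity and specificity (equivalent to maximizing Youden's index, sensitivity + specificity − 1) can also be applied numerically rather than visually. The following Python sketch uses hypothetical scores, not the study data; the function name and the restriction of candidate thresholds to observed scores are assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical data, not the study data.

def best_dichotomization_threshold(scores, depressed):
    """Return (threshold, sensitivity, specificity) maximizing sensitivity + specificity.

    A score >= threshold counts as 'depressed'. Only observed scores are tried
    as thresholds; ties are resolved in favor of the lowest threshold.
    """
    n_pos = sum(1 for d in depressed if d)
    n_neg = len(depressed) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, depressed) if s >= threshold and d)
        tn = sum(1 for s, d in zip(scores, depressed) if s < threshold and not d)
        sensitivity, specificity = tp / n_pos, tn / n_neg
        if best is None or sensitivity + specificity > best[1] + best[2]:
            best = (threshold, sensitivity, specificity)
    return best

# Hypothetical Ham-D scores; True = major depression on the gold standard.
scores = [1, 3, 4, 6, 7, 10, 9, 11, 13, 15, 18]
depressed = [False] * 6 + [True] * 5

threshold, sensitivity, specificity = best_dichotomization_threshold(scores, depressed)
# In the notation of this article, the cutoff is written (threshold - 1)/threshold.
print(f"cutoff {threshold - 1}/{threshold}", sensitivity, round(specificity, 2))
```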
To be useful for diagnostic purposes, an instrument needs a combination of high specificity and high PPV, which yields higher cutoff points than those for dichotomization. Once again there were differences between the three groups: optimal cutoff for diagnostic purposes was found in the stroke group at 10/11 (specificity 1.00, PPV 1.00); in the AD group, at 13/14 (specificity 0.96, PPV 0.76); and in the PD group, at 15/16 (specificity 0.99, PPV 0.93).
To be useful for screening purposes, an instrument needs a combination of high sensitivity and high NPV. These requirements were met in all three groups of patients at low cutoff scores. This cutoff point was lowest in the stroke group: 5/6 (sensitivity 1.00, NPV 1.00). In the AD group, the optimal cutoff point was found at 6/7 (sensitivity 1.00, NPV 1.00), and in the PD group, at 9/10 (sensitivity 0.95, NPV 0.98). Further goodness-of-fit analyses of the ROC curves revealed statistically significant differences between the three groups.
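Because PPV and NPV depend on prevalence as well as on sensitivity and specificity, the relationship can be sketched with Bayes' rule. The Python example below uses the PD dichotomization figures reported above (sensitivity 0.80, specificity 0.92, prevalence 23.5%). The resulting predictive values are computed from these rounded figures and are not taken from Table 2, and the 10% prevalence in the second call is a hypothetical value chosen for contrast.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# PD group at cutoff 12/13, using the rounded values reported in the text.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.92, prevalence=0.235)

# Same test characteristics at a HYPOTHETICAL lower prevalence of 10%.
ppv_low, npv_low = predictive_values(sensitivity=0.80, specificity=0.92, prevalence=0.10)

print(round(ppv, 3), round(npv, 3))
print(round(ppv_low, 3), round(npv_low, 3))
```

With unchanged sensitivity and specificity, a lower prevalence reduces the PPV and raises the NPV, which is one reason cutoff scores validated in one population may perform differently in another.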
In this study we showed that the concurrent validity of the Ham-D for DSM-IV major depression in stroke, AD, and PD was high. However, optimal performance requires that different cutoff points be taken into consideration for each organic disorder.
The prevalence of major depression in our three groups was comparable with that reported by others.16–18 In agreement with our previous study on patients with PD,2 in the present study the nondepressed patients with stroke and AD had higher scores than nondepressed "normal" individuals, and the PD patients in our analyses had the highest scores.
At low disease-specific cutoff points, the Ham-D can be used as a screening tool in all three disorders. However, self-report questionnaires that are easier to administer may be more practical for this purpose because they do not require trained personnel. At higher cutoff scores, the Ham-D also proved to be useful as a diagnostic tool, with an optimum cutoff point of 10/11 for stroke, 13/14 for AD, and 15/16 for PD. There was no significant difference in Ham-D performance between the three organic disorders when the disease-specific cutoff scores were used. As we described in the Methods section, the stroke group consisted of "cases" with dementia and nondemented "controls." Dementia was therefore present in half of these stroke patients, whereas in stroke populations, the prevalence of dementia has been reported to be approximately 25%.6,19
Nevertheless, our findings indicate that different organic disorders are associated with different profiles of clinical mood syndromes. This is important, because the previously established cutoff scores for "healthy individuals" did not apply to our three groups of patients. When, for example, a clinical trial on the efficacy of antidepressants is conducted on patients with these disorders, adjusted cutoffs should be taken into account when selecting endpoints to determine improvement during or after therapy. It is probably because of the somatic and psychomotor items that the floor score of the Ham-D is higher in depressed patients with an organic disorder than in those without.
A criticism could be made that we used data from three different studies and obtained the psychiatric diagnosis in a different way in each of these studies. Although this is true, we think that applying DSM-IV criteria to all three groups in a uniform way has minimized the influence of this methodological shortcoming. Next, some concerns could be expressed about the problem of diagnosing depression in a group of patients with dementia. Others have recently shown that depressive symptoms can be reliably scored with the Ham-D in patients with AD and that the symptoms are significantly related to an underlying depressed mood.20 Another problem in studies such as this is that there is no solid "gold standard" for psychiatric disorders. The psychomotor and autonomic symptoms that accompany organic brain disease may coincide with DSM criteria for major depression and thus make a diagnosis of major depression more likely in these patients; however, until now there has been no suitable alternative for a gold-standard diagnosis of depression. In theory, patients with an organic brain disorder could have a high score on the Ham-D without showing any of the core symptoms of major depression, but the high concurrent validity of the Ham-D with the DSM-IV criteria for major depression indicates that this is not a major clinical concern. Finally, our findings need to be validated in other cohorts of patients with stroke, AD, and PD.
The concurrent validity of the Ham-D with the DSM-IV diagnosis of major depressive disorder is high for patients with stroke, AD, and PD. However, optimal performance requires the use of disease-specific cutoff points for screening, diagnostic, and dichotomization purposes. Both in clinical practice and in research design, it is essential to take these disease-specific qualities of the Ham-D into account. These disease-specific qualities should be established in other and preferably larger cohorts.