Laboratory tests are an essential part of the practice of modern medicine. Laboratory tests can be used to confirm a diagnosis, provide supportive evidence for one diagnosis versus another, or rule out a specific disorder. The last 30 years of biological research into the pathophysiology of psychiatric disorders have yielded a number of highly replicable abnormalities. These abnormalities have the potential for being developed into clinically useful diagnostic tests. While psychiatrists do use lab tests to rule out general medical conditions as causes for mental disorders, there is no tradition for using laboratory tests in differentiating among primary psychiatric disorders. As a field, psychiatry has lagged behind in developing lab tests according to well-defined epidemiological principles.
Laboratory tests in psychiatry tend to either not be developed into diagnostic tools (e.g., P300 evoked response in schizophrenia) or to be disseminated before their validity is fully documented (e.g., quantified electroencephalogram [EEG]). The premature release of such tests could lead to disappointment in the medical community and premature abandonment of the test. Moreover, when tests are used out of context, they may hinder the diagnostic and treatment process and increase the cost of management unnecessarily.1
The development of ancillary diagnostic procedures is important to help the field move forward, as diagnosis in psychiatry remains the major limiting step in biological research and treatment studies.2 In order to promote a standard approach, we have recently proposed a four-step process for developing laboratory-based diagnostic tests for use in aiding the diagnostic process in psychiatry.3 The four-step approach is based on the guidelines for deciding the clinical usefulness of diagnostic tests published by Sackett et al.4 and the more recently published criteria specified by the Standards for Reporting of Diagnostic Accuracy (STARD).5,6
In Step 1, a biological variable is observed to deviate in a particular patient population relative to healthy comparison subjects. The demonstration of test-retest reliability of the finding using blinding procedures is an essential component of this early step. Replication of the finding by the same or collaborating groups is important, but confirmation by independent groups is essential for this particular test to move into the next step of development.
Step 2 is the demonstration of potential clinical usefulness of the specific finding. A key objective at this step is to demonstrate a difference between the target patient population and appropriate comparison groups (these should be groups of patients with diagnoses that commonly appear on the differential diagnostic lists of the target disorder). This is an important point, as a biological abnormality may be common to two disorders that hardly ever appear on the same differential diagnostic list (e.g., schizophrenia and dementia in a young adult). While such a finding would be of considerable scientific interest, it would not particularly decrease the diagnostic potential of the finding. On the other hand, an abnormality that is equally common to disorders that frequently need to be differentiated from one another (e.g., bipolar disorder and schizophrenia) is not likely to be useful clinically. Abnormalities with significantly different prevalence among the disorders to be differentiated are likely to contribute significantly to the differential diagnostic process and should progress to Step 3. Estimation of the effect size of the finding could be a reasonable guide to which findings should be considered good candidates for Step 3 studies.
During Step 3 the performance characteristics of the test should be established. Specifically, the sensitivity, specificity, positive and negative predictive values of the biological marker should be examined. These data should allow the estimation of the added diagnostic value resulting from incorporating the test into the work-up of a particular patient. The choice of the "gold standard" or reference test is an essential component of this step. This is the standard against which the test being developed will be measured. The currently accepted gold standard in psychiatric diagnosis is the "Best Estimate Diagnosis."7 Best Estimate Diagnosis is the agreement among a number of experts and with a standardized scale with demonstrated validity and reliability. At this step, the clinical characteristics of the patient group identified by the test are usually further delineated. Due to the heterogeneous nature of psychiatric disorders, it would be naive to expect any one biological test to be able to identify all patients who are classified into a certain DSM-based category (e.g., schizophrenia). It is much more likely that a particular test will be able to identify one or more subgroups within these categories. Defining the clinical characteristics of the subgroup that is identifiable by a particular test would be very important for the test to be considered for clinical use. Factors such as effects of illness duration and severity and the effects of medications should also be defined during Step 3. At this step, the test would be considered "promising" for development as a diagnostic test.8
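The four performance characteristics named above can be illustrated with a short calculation. The following is a minimal sketch, assuming a simple two-by-two comparison of the test result against the reference diagnosis; the class name, function name, and all counts are hypothetical and are not drawn from any study reviewed here.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Test result cross-tabulated against the reference ("gold standard") diagnosis."""
    true_pos: int   # test positive, disorder present
    false_pos: int  # test positive, disorder absent
    false_neg: int  # test negative, disorder present
    true_neg: int   # test negative, disorder absent


def performance(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and positive/negative predictive values."""
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
counts = ConfusionCounts(true_pos=40, false_pos=5, false_neg=10, true_neg=95)
metrics = performance(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the two predictive values depend on how common the disorder is in the sample tested, which is one reason the choice of comparison groups matters at this step.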
Step 4 defines the clinical application of the test and helps standardize the technique used in large and multicenter clinical trials. Multicenter trials should pave the road toward standardization of laboratory procedures used to conduct the test as well as providing data regarding cost effectiveness and impact on both short-term and long-term clinical outcomes. Studies in earlier steps depend on smaller samples of comparison subjects that are usually locally formed. On the other hand, Step 4 studies should begin to develop larger normative databases that can eventually be used to examine an individual’s data. Development of such databases can be challenging and will require collaboration among research groups concerned with the specific test being developed. Table 1 summarizes the design, purposes, and desired outcomes of each of the four proposed steps.
The object of this study is to demonstrate the utility of the four-step approach. We chose one biological finding that has been replicated and appears to have promise for developing as a diagnostic tool (i.e., increased EEG theta activity in patients with attention deficit or attention deficit hyperactivity disorders) to assess whether published studies have fulfilled the eight guides provided by Sackett et al.4 and to ascertain whether or not this particular finding can be classified based on the proposed four-step approach. An additional goal of this study is to ascertain whether future research recommendations can be made based on the four-step approach that would help propel this test toward clinical utility.
ADHD is a prevalent disorder among children and adolescents as well as adults. ADHD is defined by DSM-IV as involving pervasive symptoms of inattention and/or hyperactivity-impulsivity, which are observed in 3%–5% of children prior to the age of 7.9 A number of psychiatric disorders share similar symptomatology. These disorders include learning disabilities, conduct disorders, and affective disorders. It is also possible that a behavioral syndrome similar to ADHD could result from the lack of discipline or from improper rearing practices in the home environment.10 As a result of such symptom overlap, it is possible that ADHD could be overdiagnosed. Much of the difficulty facing clinicians seeking to diagnose ADHD stems from the absence of any laboratory tests specific for attentional disorders. The continuous performance test (CPT) has been used to probe and diagnose ADHD.11 The CPT is an objective measure of attentional problems, and its scoring and interpretation are automated, thus avoiding evaluator biases. Nonetheless, the CPT is susceptible to motivational and distracting factors.
The quantified EEG is a multichannel recording of the background EEG activity. Following visual inspection to remove artifacts, artifact-free tracings of 2–3 minutes are subjected to analysis using the fast Fourier transform (FFT) to quantify the power at each frequency of the EEG averaged across the entire sample. The EEG is composed of four classical frequency ranges: delta (0.5–3.5 Hz), theta (4–7.5 Hz), alpha (8–13 Hz), and beta (above 13 Hz). Based on the FFT, the absolute power in each of these frequency ranges can be calculated (EEG spectral analysis). The relative power in each of the four spectral ranges can be calculated in relation to the total power in all four frequency ranges.
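The calculation of absolute and relative band power can be sketched as follows. This is an illustration only, not the procedure of any reviewed study: it uses a single synthetic epoch rather than averaged artifact-free tracings, a small textbook radix-2 FFT written with the standard library, unnormalized power in arbitrary units, and an assumed 30 Hz ceiling for the beta range (the text above specifies only "above 13 Hz"). The sampling rate and signal amplitudes are likewise hypothetical.

```python
import cmath
import math

# Band limits in Hz, following the ranges given in the text.
# The 30 Hz upper limit for beta is an assumption made for this sketch.
BANDS = {
    "delta": (0.5, 3.5),
    "theta": (4.0, 7.5),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def band_of(freq):
    """Return the first band containing freq, so 13.0 Hz counts as alpha."""
    for name, (lo, hi) in BANDS.items():
        if lo <= freq <= hi:
            return name
    return None


def band_powers(signal, fs):
    """Absolute power per band and power relative to the four-band total."""
    n = len(signal)
    spectrum = fft(signal)
    absolute = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):  # positive frequencies below Nyquist
        name = band_of(k * fs / n)
        if name is not None:
            absolute[name] += abs(spectrum[k]) ** 2
    total = sum(absolute.values())
    relative = {name: p / total for name, p in absolute.items()}
    return absolute, relative


# Synthetic 4-second epoch at 128 Hz: a 6 Hz (theta) component with twice
# the amplitude of a 10 Hz (alpha) component.
fs, n = 128, 512
epoch = [2 * math.sin(2 * math.pi * 6 * t / fs) + math.sin(2 * math.pi * 10 * t / fs)
         for t in range(n)]
absolute, relative = band_powers(epoch, fs)
print({name: round(value, 3) for name, value in relative.items()})
```

Because power scales with the square of amplitude, the theta component in this synthetic epoch carries four times the alpha power, so relative theta comes out at 0.8 and relative alpha at 0.2.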
Evidence for EEG deviations in children with "minimal brain dysfunction" has been appearing in the literature over the last four decades. Abnormalities of the visually inspected clinical EEG (e.g., increased epileptiform activity or slow waves) were repeatedly observed, with prevalence ranging from 15% to 30%.12,13 These rates of abnormalities, as well as the varying types of abnormalities, do not allow the use of the routine EEG as a diagnostic test for ADD or ADHD.
The rapid evolution of computer capabilities was associated with the appearance of a number of reports over the past 20 years providing evidence for EEG deviations in patients (predominantly children and adolescents) with ADD or ADHD.14 A number of EEG-based abnormalities were described, including significant increase or decrease of the power in one of the EEG components (e.g., increased theta or decreased beta activity), and evoked response abnormalities (e.g., abnormal P300 event-related potentials). Among all the above-mentioned deviations, abnormal EEG spectra (particularly increased theta activity) seem to be the most consistently reported EEG deviation in this group. Monastra et al.9 showed that EEG spectral analysis differentiated ADHD participants from nonclinical comparison groups at a significance level of p<0.001, with a reported sensitivity of 90% and specificity of 94%. More recently, the increased theta activity abnormality was also found in adult ADHD patients.15
Lubar14 reviewed the then-existing literature regarding spectral EEG and evoked potential abnormalities reported in association with ADHD. He listed excessive theta activity as the most consistent finding in this group. In 1990 the same group examined ADHD patients without a confounding learning disability16 and reported the same finding of increased theta activity, mainly in the frontal-temporal regions. This same finding was then independently replicated a number of times. While Defrance et al.17 confirmed the increased theta, they were not able to identify a specific neuroanatomical locus for the abnormality. Ucles and Lorente found the increase in theta activity to be more prominent in the occipital region of a similar group of subjects.18 In this study they used a theta/alpha index to compare groups. They also found the theta/alpha index to be borderline abnormal in the frontal regions. Lazzaro et al.19 further independently confirmed the finding in 26 male combined-ADHD patients and again later in 54 similarly diagnosed subjects.20 It is not clear whether subjects from the earlier study were part of the sample reported in the later study. The abnormality was maximally observed in the anterior cerebral regions. Another independent replication, which appeared in 1999,21 examined the effects of age on the reported spectral EEG abnormalities. The authors concluded that the theta excess is consistent and resistant to age effects, while beta abnormalities tended to decrease with age. More recently, Clarke et al.22 further confirmed the abnormality in a group of 40 boys and 40 girls. They suggested that examination of the theta to beta ratio is a stronger measure of the abnormality. This ratio measure combines the abnormalities of the theta rhythm with abnormalities of beta activity (usually a decrease of beta that is less consistently reported) for a higher detection power.
They nonetheless showed that the theta excess was statistically demonstrable without using the ratio measure (using absolute theta power),23 further confirming their earlier similar finding.24 Based on the above, excess EEG theta seemed to be a candidate for development as a diagnostic test for ADD/ADHD. On the other hand, Loo et al.25 suggested that the presence of decreased (not increased) theta in the frontal region predicted a favorable response to methylphenidate therapy, which tended to normalize the EEG abnormality. This study highlights the need to report not only the specific EEG deviation but also its topographical distribution.
We undertook this review in order to determine the status of the finding of increased theta activity in ADHD as per our proposed progression described above. Furthermore, we wanted to ascertain if our proposed four-step schema can be helpful in defining the stage of development of a particular finding as well as pointing out the research that remains necessary in order to establish clinical utility. Second, we performed the meta-analysis in order to provide an estimate of the strength and consistency (i.e., effect size) of the finding of increased theta activity in ADHD patients. Based on the effect size it may be possible to determine whether a particular finding may be suitable for developing as a diagnostic test.
We began with a search for all papers, published in the English language, that were cross-referenced for EEG and ADD or ADHD. The search included MEDLINE, PsycINFO, and Current Contents. PsycINFO (including PsycARTICLES) yielded 119 citations. Current Contents yielded 85 citations and MEDLINE yielded 53 citations. All MEDLINE and most Current Contents citations were included among the PsycINFO citations. The first level of screening was based on study titles. This step was mainly for the exclusion of irrelevant topics and methodologies. Papers not examining the clinical entity of ADD/ADHD or using methodology other than quantified EEG (i.e., routine visual analysis of the EEG, evoked potentials, or polysomnography) were excluded. Papers addressing attentional issues in nonclinical populations were also excluded. Studies that did not specifically address ADD or ADHD but included subjects with "minimal brain dysfunction" were also excluded, as this category tended to include a more heterogeneous sample than just ADD or ADHD.
Abstracts of the remaining citations were then reviewed to determine the papers that specifically examined the spectral analysis of the EEG in ADD/ADHD populations. Papers were excluded for the following reasons: using quantified EEG to examine laterality deviations, examining EEG coherence abnormalities, not including an ADHD/ADD study group, or not including a healthy comparison group. The remaining studies were then reviewed by two of the authors (NB and AF) to define the articles that specifically examined the presence or absence and the degree of increased EEG theta activity during the resting condition. This was an important exclusion, as the activating procedures varied widely among studies. Studies that examined combined measures, such as relating theta to other activities (theta to beta ratio), without specifically reporting theta activity were also excluded. This is also an important exclusion, as the number of peer-reviewed reports using these compound measures is too small to allow a meta-analysis to be performed. All papers meeting all criteria were included in the review (Table 2). It should be noted that in all studies reported here, subjects were examined while off psychostimulants for at least 24 hours.
Studies were excluded from the meta-analysis if there was not sufficient information to calculate or estimate an effect size specifically for theta activity (e.g., the reporting of means without standard deviations or results of tests of statistical significance of the difference). Finally, in cases where results from the same sample were reported in different papers, duplicate reports were excluded. Seventeen papers met all our inclusion criteria and formed the basis for the review (Table 2), and 12 of these met the meta-analysis criteria (Table 2).
The 17 included studies were then reviewed for the criteria proposed by Sackett et al.4 as well as the four-step approach. Table 2 lists the eight Sackett et al. criteria as well as their corresponding steps of the proposed four-step approach. Studies were assigned to a step based on the goals of the study. Studies aiming at demonstrating differences between patients and healthy comparison subjects were considered Step 1 studies. Studies incorporating appropriate patient comparison groups were considered Step 2 studies. Studies examining the performance characteristics of the test (and thus addressing its clinical utility) were classified as Step 3 studies. Finally, multicenter studies incorporating appropriate patient comparison groups were classified as Step 4 studies.
The effect sizes were also calculated in order to determine the strength of the finding. Effect sizes for the differences between clinical and comparison groups were expressed in the metric d, defined as the difference between the group means divided by the pooled within-group standard deviation.26 When the group difference was in the expected direction (i.e., the clinical group showed higher theta activity than comparison subjects), the d had a positive sign, and negative values were assigned to effect sizes in the "wrong" direction. Effect sizes were calculated directly from means and standard deviations when these descriptive statistics were reported. Otherwise, effect sizes were derived from t ratios, F values, or p values, as described in Hedges and Olkin.26 If the authors reported only that the group difference in theta was nonsignificant, that study was assigned an effect size of 0.00. Some studies reported data from more than one appropriate clinical and/or comparison group, and the data were pooled so that the effect size for each study always reflected the difference between a single comparison group and a single clinical group.
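The definition of d given above can be made concrete with a brief sketch. The function names and all numbers below are hypothetical; the conversion from a t ratio uses the standard relation for two independent groups and is shown as one example of deriving d when means and standard deviations are not reported, not as the exact procedure applied to any particular study.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled within-group standard deviation of two independent groups."""
    return math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2))


def d_from_means(mean_clin, sd_clin, n_clin, mean_comp, sd_comp, n_comp):
    """d is positive when the clinical group has the higher mean (expected direction)."""
    return (mean_clin - mean_comp) / pooled_sd(sd_clin, n_clin, sd_comp, n_comp)


def d_from_t(t, n_clin, n_comp):
    """Recover d from an independent-samples t ratio."""
    return t * math.sqrt(1 / n_clin + 1 / n_comp)


# Hypothetical theta power values (arbitrary units), 20 subjects per group.
d = d_from_means(30.0, 10.0, 20, 24.0, 10.0, 20)
t = d / math.sqrt(1 / 20 + 1 / 20)  # the t ratio these same data would yield
d_back = d_from_t(t, 20, 20)
print(round(d, 3), round(d_back, 3))
```

In this example the clinical group mean lies 0.6 pooled standard deviations above the comparison mean, and the same value is recovered from the corresponding t ratio.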
All effect sizes were based on measurements previously classified as reflecting absolute theta or relative theta. Thus, two effect sizes were computed for each study that measured both types of theta. Because of the assumption of independence of effects used in a meta-analysis, the mean of the two effect sizes from each of these studies was used in the primary meta-analysis.
The main method of meta-analysis was calculation of the weighted mean effect sizes using procedures outlined in Hedges and Olkin,26 which give more weight to larger studies than to smaller studies. However, the unweighted mean effect sizes, which give equal weight to all studies, were also computed. A meta-analysis was conducted for each effect category. The primary effect category used all 12 studies in the analysis, with the effect sizes using relative theta for studies that examined only relative theta, absolute theta for studies using only absolute theta, and the average of the absolute theta and relative theta ds for studies that measured both types of theta. To determine whether the effect sizes varied by type of theta, separate meta-analyses were conducted using effect sizes for absolute theta and relative theta. (Studies that used both measures contributed to both meta-analyses.) Finally, to eliminate confounding of the differences between absolute and relative theta effects with differences among studies, meta-analyses of relative and absolute theta effect sizes were also conducted separately in the subsample of studies that yielded effect sizes for both theta measures.
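The difference between weighted and unweighted means can be sketched as follows. The weighting shown is the common inverse-variance scheme with a large-sample approximation to the variance of d; it is offered as an assumption about the general form of such weighting, not as a reproduction of the exact computations performed here, and the two studies listed are hypothetical.

```python
def variance_of_d(d, n_clin, n_comp):
    """Large-sample approximation to the sampling variance of d."""
    n_total = n_clin + n_comp
    return n_total / (n_clin * n_comp) + d ** 2 / (2 * n_total)


def weighted_mean_d(studies):
    """Inverse-variance weighted mean; larger studies receive more weight."""
    weights = [1 / variance_of_d(d, n1, n2) for d, n1, n2 in studies]
    return sum(w * d for w, (d, _, _) in zip(weights, studies)) / sum(weights)


def unweighted_mean_d(studies):
    """Simple mean; every study counts equally."""
    return sum(d for d, _, _ in studies) / len(studies)


# Hypothetical studies: (d, n in clinical group, n in comparison group).
studies = [(0.5, 20, 20), (1.0, 80, 80)]
weighted = weighted_mean_d(studies)
unweighted = unweighted_mean_d(studies)
print(round(weighted, 3), round(unweighted, 3))
```

Here the larger study pulls the weighted mean (about 0.89) toward its own effect size, whereas the unweighted mean sits midway at 0.75. When the two means nearly coincide, as reported for the present data, study size is unlikely to be driving the pooled estimate.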
Table 2 lists the 17 studies included in this review and which of the eight criteria proposed by Sackett et al.4 were met. The table also shows which of the four steps each of the eight Sackett criteria corresponds with.
Among the 17 studies included, there is clear evidence of scatter along the proposed four-step continuum. We found one multicenter study, which qualifies as a Step 4 study.10 Five studies addressed the specific clinical utility of the test, thus qualifying for Step 3. Two of these studies did not fulfill criteria for Step 2.15,16 The 11 remaining studies were classified in Step 1. None of the included studies provided test-retest reliability or interrater reliability data, a criterion proposed for Step 1. Studies meeting criteria for Step 2 tended to also meet criteria for Step 3.
Table 3 lists the studies included in the meta-analysis, along with the effect sizes for differences between clinical and comparison groups in absolute and relative theta activity. The primary meta-analysis, which used all 12 studies, confirmed the prediction that the clinical groups would exhibit higher levels of theta activity than comparison subjects, with a weighted mean effect size of 0.68. For absolute theta (using only the subsample of 10 studies that measured it), the weighted mean effect size was 0.59. For relative theta (using only the subsample of 11 studies that measured it), the weighted mean effect size was 0.91. Finally, in the meta-analysis that used only the nine studies that examined both types of theta activity, the weighted mean effect sizes for absolute and relative theta were 0.70 and 1.07, respectively. The unweighted mean effect sizes were virtually identical to their respective weighted results.
Two main conclusions can be reached based on the above data. First, the proposed four-step schema was able to classify a representative biological finding into a specific step of development as a diagnostic tool. Second, based on a meta-analysis, we conclude that this particular finding is promising and should be further developed as a diagnostic test.
Based on our categorization schema outlined above, increased EEG theta remains a highly promising finding (i.e., Step 2). The lack of well-characterized test-retest and intrarater reliability data is a serious deficiency and should be remedied prior to planning further Step 3 or 4 studies. Similarly, the lack of blinding procedures is a serious problem. EEG analysis is open to investigator bias, particularly during the process of deartifacting the record (removing artifacts to prepare the record for analysis) and choosing the EEG epochs (segments) to be entered into the analysis.
The literature reflects the lack of a general guiding system that could lead to the timely development of promising biological findings (like increased theta in ADD/ADHD) into useful and standardized diagnostic aids. For example, Chabot et al.28 developed a discriminant function based on EEG spectral deviations. Excessive theta activity centrally was one of the more powerful contributors to the discriminant. They reported being able to identify 76.1% of healthy comparison children, 88.7% of ADD/ADHD, and 69% of learning disabled (without attentional problems) children. The additional value of incorporation of this discriminant function into a standardized EEG evaluation of individuals suspected of having ADD/ADHD needs to be examined.
Similarly, Kovatchev et al.29 proposed that an EEG consistency index, derived from EEG-spectral data, can further add to diagnostic ability of the EEG in this population. Moreover, early work suggests that the increased theta abnormality frontally may predict response to psychostimulants.30 This finding contradicts the later report by Loo et al.25 suggesting that decreased theta frontally is a better predictor of response to psychostimulants. Other spectral profiles were suggestive of alternative therapies. Finally, very few studies examined the spectral EEG profile in adult ADHD patients.15 Our analysis suggests that the increased theta noted in children may be also detectable in adult patients.
Very few large and multicenter studies are available. Matsuura et al.10 conducted a multinational (WHO-sponsored) study comparing children with healthy behavior, ADD/ADHD, and deviant behavior (non-ADHD or learning disabled) from Japan, Korea and China. They reported EEG spectra to be similar among healthy children from all three national groups. The EEGs of ADHD children exhibited the increased slow wave activity (both theta and delta activity in this study) while normal or behaviorally deviant children did not. A recent review attests to the potential for EEG spectral analysis studies to be useful in differentiating ADHD children from other clinical populations that are likely to be misdiagnosed as ADHD, like children with learning disabilities and children with learned behavioral disturbances.31 Finally, cost-effectiveness studies designed to examine the usefulness of the observation that psychostimulants can normalize either the EEG or event-related potentials (ERP) abnormalities in guiding medication administration should be encouraged.32
It should be noted that in two studies an increase (instead of a decrease) in beta activity was found in similar groups of patients.33,34 Whether a distinct subgroup of ADHD patients with increased beta activity exists awaits replication of the finding and identification of the clinical correlates of this subgroup. Lazzaro et al.,35 while reporting the same finding, suggest that combined EEG and ERP examinations may be more powerful than EEG or ERP examinations alone. The finding of a possible ADHD group with excess beta highlights the need to describe each reported abnormality individually at first before combining these abnormalities to increase the power of the diagnostic tests. An increased beta/theta ratio could result from either increased beta or decreased theta. Reliance on the ratio thus could blur the distinction between these two subgroups.
The findings from this meta-analysis suggest that the increased theta activity in the EEGs of patients with ADHD is a robust enough finding to warrant further development as a diagnostic test for ADHD. The data also suggest that the relative power of theta activity (theta power as a proportion of the entire EEG power) may be a stronger or more reliable indicator of deviation in this group. Only a small number of studies examined this finding in patients who would be included in the differential diagnostic process (i.e., children with learning disability or with learned deviant behavior). Monastra et al.36 reported sensitivity of 86% and specificity of 98% when the above profiles are used to diagnose ADHD. Chabot et al.28 reported similarly high sensitivity and specificity values.
A significant problem facing laboratory-based diagnostic procedures is a strong tendency for tests to be released for clinical practice before all (or even the majority) of the variables outlined above that are necessary for clinical usefulness have been worked out. The premature release of such tests could lead to disappointment in the medical community and to premature abandonment of the test, as may have been the case with the dexamethasone suppression test (DST).37 It could also lead to a severe backlash when limitations of the test become apparent, particularly when it is applied in a noncontrolled setting (e.g., in private clinical practice), severely delaying the proper development and evaluation of the test. A good example of the latter was the premature marketing of quantified EEG (QEEG) to physicians with no training in electrophysiological methodology.38 However, during the last decade, more than 500 EEG and QEEG papers have reported well-designed studies, and an overview of this literature reveals numerous consistent QEEG findings among psychiatric patients within the same DSM diagnostic categories.39
Finally, the issue of the "gold standard" is particularly problematic in psychiatry. Whereas in most medical conditions tissue diagnosis is possible, allowing the investigators to have a high degree of certainty regarding the presence or absence of the disorder being tested, such a standard does not exist in psychiatry. Instead, psychiatry relies on the "best estimate diagnosis." This standard is based on agreement among a number of "experts." Data by Roy et al.40 and earlier by Leckman et al.41 have shown that diagnoses based solely on clinical data and using the best available expertise and a multilevel evaluation to arrive at a consensus best estimate diagnosis can only reach a kappa for agreement of 0.69. It seems likely that as genetic and other biological variables are discovered and are associated with clinical phenomena, our diagnostic systems will be redefined. Nonetheless, any potentially diagnostic biological finding should be developed through a systematized approach as is proposed here. In order for a test to be of clinical usefulness, it will have to demonstrate that it will incrementally improve the diagnostic ability of the clinician. The degree of improvement necessary for a test to be considered cost-effective is also a complicated question that depends on the particular disorder and the specific clinical situation. Finally, methodology for demonstrating the ability of a test to improve diagnostic accuracy in psychiatry has not been well developed.
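To illustrate what a kappa of this magnitude represents, the sketch below computes Cohen's kappa for two raters making a present/absent judgment. The text above does not specify which kappa variant the cited studies used, so the two-rater form is an assumption, and the counts are hypothetical values chosen to give a kappa close to the figure quoted.

```python
def cohens_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Chance-corrected agreement between two raters on a yes/no diagnosis."""
    total = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / total
    # Agreement expected by chance, from each rater's marginal rates.
    a_yes = (both_yes + a_yes_b_no) / total
    b_yes = (both_yes + a_no_b_yes) / total
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 100 patients by two diagnosticians.
kappa = cohens_kappa(both_yes=40, a_yes_b_no=5, a_no_b_yes=10, both_no=45)
print(round(kappa, 2))
```

In this example the raters agree on 85 of 100 patients, yet kappa is only 0.70 because half of that agreement would be expected by chance alone. A reference standard with this level of agreement limits how accurate any new test can appear when measured against it.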
Future studies reporting on the clinical applications of laboratory tests should adhere to the criteria specified by the STARD.5,6 Adherence to such standards is crucial, as evidence of design-related bias in studies of diagnostic tests has been reported.42 Interested researchers are referred to a body of work by Somoza et al.43 for an overview.
Two further conclusions can be drawn from the above. First is that in the absence of a guiding systematized approach to test development, promising research findings may not be translated into clinical utility in a timely manner. Conversely, such findings could also be prematurely disseminated for wide clinical use. Both outcomes are undesirable given the necessary balance between patient care needs and economic concerns. Second, it seems reasonable to suggest, based on our review and meta-analysis and available literature, that spectral analysis of the resting EEG can be used to develop a number of indices that can be utilized to aid the diagnosis of ADD/ADHD. A standardized method for subject recruitment, diagnostic evaluation, EEG recording, and EEG analysis (including the derivation of indices of deviance) could be developed and utilized in a number of large multicenter studies. Data from such studies should then help define the clinical application of this methodology. Subsequent refining of the "test" may include expanding the battery to include activated EEG or the addition of evoked response measures. As is the case with most psychiatric disorders, ADD/ADHD can have different etiologies, ranging from genetics to mild head trauma.44 If the identified abnormality detects all cases of the target disorder, this suggests that the identified abnormality reflects a final common pathway for the various etiologies. If, on the other hand, the abnormality is only detected in a particular subgroup (which should be accomplished in Step 3), then the abnormality may be more closely linked to the pathogenic process.
A number of drawbacks decrease the generalizability of the proposed approach. First, in this article the proposed four-step approach was applied to only one representative biological finding. Second, unpublished negative data could not be included; thus conclusions may be biased in a positive direction. Moreover, we did not attempt to deal with any controversies that exist within this literature that did not directly relate to the diagnostic value of increased theta (e.g., the value of increased or decreased theta in predicting the clinical response to psychostimulants). Finally, it is possible that the review did not include every published paper that would have met inclusion criteria. The above notwithstanding, the conclusions reached seem reasonable and suggest that further exploration of the four-step approach for developing laboratory-based diagnostic tests is warranted.
This study was presented in part at the American Psychiatric Association 157th Annual Meeting, New York, NY, May 1–6, 2004.
This study was partially supported by the VA-Connecticut Healthcare System.