Among the benefits of atypical antipsychotic medications is their apparent ability to reduce cognitive impairments in patients with schizophrenia.1 Consistent moderate effects have been detected, although some of the studies that found these benefits could be challenged on methodological grounds.2 One of the limitations of prior literature on this topic is the short duration of many of the studies. Most of these studies were acute efficacy trials of 6 to 8 weeks. While these studies were long enough to detect benefits relative to baseline treatments or conventional antipsychotic comparators, it remains unclear whether improvement would continue, or at least be sustained, in longer studies.
One reason for evaluating the long-term benefits of atypical antipsychotic treatments is that short-term treatments appear to induce only moderate cognitive benefits. As reviewed previously, the average effect size of most improvements in these short-term studies was moderate at most and variable across cognitive domains. This moderate improvement is in marked contrast to the typical levels of impairment seen in patients with schizophrenia, which are often severe to very severe.3 For example, with impairment in episodic memory functions averaging 1.5 to 2.5 SD below normative standards,4 a moderate improvement averaging 0.4 SD5,6 would not leave the typical patient even close to the lower limit of normal functioning.
Previous studies that examined patients treated with atypical medications over longer periods of time were either not double-blind7 or limited in sample size. For example, there has been only one double-blind study that lasted 26 weeks or more,8 and that study began with 65 patients, with only 55 contributing prospective cognitive data. Among those 55, most did not complete the study, and the sample for the assessments beyond 30 weeks included seven and 12 patients, respectively, for the two atypical medications in the study. Despite the dropout rate, improvements for patients receiving treatment with olanzapine were close to a full SD at endpoint. A recent randomized, open-label study9 comparing haloperidol and quetiapine for 6 months also had a relatively small sample size, but also found improvements in quetiapine-treated patients that were a full SD from baseline at the endpoint. Thus, in contrast to short-term studies, studies that lasted over 6 months have found considerably larger cognitive improvements. The question that arises from these two previous studies, however, is whether these larger improvements are influenced by methodological factors.
Regardless of the size of the improvement, one of the most important issues in cognitive enhancement is the endpoint level of functioning. While improvements of a full SD for the group of treated patients as a whole are clearly large, a group mean change of this magnitude could mean that some patients fail to improve and also that even patients who improved by a full SD could still have considerable remaining cognitive impairment. Further, large improvements on the part of a few patients could inflate the group average change, particularly in smaller studies. Finally, some evidence has suggested that patients with less impairment are more likely to have greater improvements,5,10 suggesting that many patients with greater levels of impairment may not improve substantially. It is important to be able to establish the extent to which individual patients improve with treatment, in order to estimate the likelihood that an individual patient would receive a meaningful benefit from treatment.
This report describes a study that addresses many of the previous methodological concerns. First, this was a 6-month randomized double-blind clinical trial, which was an extension of a shorter 6-week, double-blind, comparative acute treatment study.11 Second, the patients’ level of baseline impairment across several different cognitive measures was determined through the use of normative standards for the tests, and the level of improvement of these patients was evaluated in terms of putative "normalization" of cognitive performance on individual measures. Thus, improvements could be examined in several different ways: in terms of the statistical significance of global improvements for the group as a whole, effect sizes for individual tests based on normative standards for the group as a whole, and in terms of the likelihood of individual subjects experiencing treatment-related changes in their performance to a level consistent with performance in the normal range.
In this study, patients who had responded to 6 weeks of double-blind treatment with ziprasidone or olanzapine, with response defined as a Clinical Global Impression improvement of at least 2 points and at least a 20% reduction in Positive and Negative Syndrome Scale (PANSS)12 scores, were offered the opportunity to continue their double-blind treatment for a subsequent 6-month period. Cognitive assessments were performed at baseline, at the end of the 6-week clinical trial, and at the end of the 6-month extension (or early termination). Using published norms and the extensive information available on the patients in the trial, normative standards based on age, education, gender, and ethnicity were developed. All patients were characterized as to whether they manifested impairment on each of the different cognitive measures at baseline, with impairments defined according to the suggestions of Taylor and Heaton,13 by setting impairment on any individual measure as a standardized (z) normatively derived score of less than −1.0 on that measure. Then, using these same standards, patients’ performance at endpoint was characterized. Patients were considered to have "normalized" cognitive function for any of the measures if they met a two-part criterion: a) impaired at baseline and unimpaired at endpoint, and b) performance on that measure improved from baseline by at least 0.5 SD. Thus, putative normalization of a cognitive parameter required both an improvement in test scores and endpoint performance at a level that is at approximately the 17th percentile of the normal distribution or greater.
This was a 6-month extension of a 6-week, multicenter, double-blind, parallel-designed, randomized, controlled study initiated in hospitalized patients. All aspects of the study design have been presented in the report on the first 6 weeks of double-blind treatment.11 The following section reviews those methodological details in order to characterize the initial sample of subjects.
Inclusion and Exclusion Criteria
The trial included patients between 18 and 55 years of age. Patients were required to have a primary diagnosis of schizophrenia or schizoaffective disorder (any subtype, other than partial or full remission) as defined in DSM-IV (295.X, 295.70) and persistent psychotic symptoms for the week before hospital admission. At screening for the acute treatment study, patients were required to have scores ≥ 4 on the Clinical Global Impression Severity Scale (CGI-S), and scores ≥ 4 on at least one of the following PANSS Positive Symptom Scale Items: delusions, conceptual disorganization, or hallucinatory behavior. Patients were required to have normal laboratory test results and electrocardiograms (ECGs), and to have a negative urine drug screen at study entry. The study was approved by the appropriate institutional review boards at every research site, and all patients or their legal guardians provided written informed consent before patients entered the study. Patients who had failed to respond to two adequate treatment trials with antipsychotic medications in the past year were excluded, as were patients judged by the investigator as being at significant risk of suicide, violent behavior, or homicide. Patients with > 14 days total lifetime exposure to ziprasidone or olanzapine, and those who had discontinued use of either drug due to lack of efficacy or an adverse event, were excluded. These exclusion criteria were intended to help ensure that subjects were not partially treatment resistant or intolerant to either agent; this was particularly important given that olanzapine had been available for several years at the time the study was conducted.
Of the patients randomized to ziprasidone, 52% (n = 70) completed the 6-week study; a similar proportion of patients randomized to olanzapine completed the protocol (63%, n = 84). In order to qualify for the extension, the patient’s CGI-C score had to have improved by at least 2 points compared to baseline and their total CGI had to be less than 4, while their PANSS total scores had to decrease by 20% or more. Of the 70 ziprasidone completers, 62 met eligibility criteria for the extension, 50 agreed to participate, and 39 provided complete data. Of the 84 olanzapine completers, 71 met entry criteria for the extension, 55 agreed to participate, and 33 provided complete data. Descriptive information on the patients who provided complete data at the 6-month extension is presented in Table 1.
Changes from baseline were assessed in all patients who met entry criteria and had a cognitive assessment at least 3 months after the end of the short-term study. Cognitive outcome variables were selected for their relevance to functional outcomes in schizophrenia and included change from baseline in scores on tests of memory, executive function, and verbal fluency. The authors are presenting data only from standardized neuropsychological tests with extensive normative data. The authors performed additional experimental tests (the Continuous Performance Test14 and the Digit Span Distraction Test15) that do not have similar normative standards, and as a consequence, those results are not reported.
Trail Making Test (TMT) Part A. In this test of visuomotor speed,16 the time required to complete this condition was the dependent variable.
Verbal Memory: The Rey Auditory Verbal Learning Test. This test17 has been used extensively in neuropsychological studies of patients with schizophrenia. A 15-item list of words is presented to the patient in each of 5 separate learning trials. Following this, a distractor list (list B) is read to the subjects, followed by short-delay free recall. A delayed free-recall test is performed after a 20-minute period, followed by a choice recognition procedure. Dependent variables selected for analysis are the total number of words recalled in the five learning trials and at delayed recall, as well as recognition discrimination.
Wisconsin Card Sorting Test (WCST).18 This test assesses executive functioning, cognitive flexibility, maintenance of a cognitive set, and working memory. The critical dependent variables for this trial are the number of categories completed, the total number of errors, and the number of perseverative errors.
Trail Making Test Part B. This second part of the TMT evaluates both visuomotor speed and the ability to alternate between sets. The dependent variable is the time required to complete the task.
Verbal Fluency Examination. Two verbal fluency tests were administered: category and letter fluency.17 Category fluency involves naming animals, fruits, and vegetables, while letter fluency examines naming of words starting with the letters F, A, and S. Total scores for category and letter fluency are the dependent variables.
Overall Assessment Methods
Research assistants who had been trained in the administration of these tests by the first author performed all of the assessments. Training sessions occurred in small group settings at the local sites, as well as in regional meetings, and the raters’ performance was evaluated at a site visit performed by an advanced graduate student in clinical psychology. All were required to perform valid testing, as certified by case record form review on site, before any patients were examined. A central monitoring facility evaluated case record forms, and all forms with errors were returned for correction. If the problems were the result of errors in administration, the patient involved was excluded from further analysis of the data.
The main clinical assessment was the PANSS. PANSS was administered at baseline, weekly to the end of the 6-week acute treatment study, and at the study endpoint.
Characterization of neuropsychological performance status.
All raw scores were standardized into z-scores (mean = 0, SD = 1) using published age- and/or education-adjusted normative standards.18-20 All z-scores were adjusted so that negative scores reflect impairment, even for variables that are reverse scored (i.e., Trail Making Test, WCST Total Errors, and WCST Perseverative Errors). Each subject’s corrected score at all time points, on each variable, was then categorized by level of impairment. Using the criteria described by Taylor and Heaton, the authors used a criterion of z < −1.0 as the cutoff for the lower boundary of normal neuropsychological performance. This criterion is based on the empirical finding that this cutoff maximizes both specificity and sensitivity of detection of individuals with cognitive impairments. For characterizing improvement, the authors used changes in normatively derived standard scores as the index of change and viewed changes in performance from baseline to endpoint of at least 0.5 z-score units as reflecting "potentially meaningful" change. The authors characterized any participant whose baseline score on any measure was z < −1.0 and whose endpoint score was z ≥ −1.0, with a difference of at least 0.5 between the two, as a subject with putative "normalized" performance. Thus, individuals could only be considered to be "normalized" if they met both parts of the two-part criterion at their final assessment.
Statistical tests on all data were performed at the 5% two-tailed significance level. Multivariate tests for differences between the treatment groups on the set of cognitive battery measures were conducted using Multivariate Analysis of Variance (MANOVA). Raw score and overall effect size results at baseline, week 6, and endpoint are reported. A repeated-measures ANOVA model, with time of assessment (baseline, week 6, and endpoint) and treatment condition (ziprasidone, olanzapine) as factors, was employed. One of the issues affecting the interpretation of studies of cognitive change with atypical antipsychotic treatment is that of practice effects. While practice effects appear reduced in schizophrenia,21 they are still possible. Practice effects on memory tests primarily occur at the first reassessment of patients with schizophrenia,22 with changes from the first to second reassessment nonsignificant. In order to separate potential practice effects (effects of retesting at the first reassessment) from putative continued improvement (improvements at the next reassessment), the authors examined the relative size of the change scores from baseline to week 6 and from week 6 to the endpoint. In order to perform this comparison, the authors used a repeated-measures ANOVA for each variable, examining changes during each of the two periods as a repeated measure and treatment as a fixed factor. Correlations between change scores for the PANSS total and positive and negative subscales, as well as several safety measures, the Barnes Akathisia Scale (BAS),23 the Extrapyramidal Symptom Rating Scale (ESRS),24 and the Abnormal Involuntary Movements Scale (AIMS),25 and change scores for all of the cognitive measures were calculated with Pearson correlations. Finally, neuropsychological normalization was compared across the two treatment conditions with the chi-square test.
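The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual scoring procedure: the function names, the `higher_is_worse` flag, and the example normative values are assumptions introduced here, while the two thresholds (z < −1.0 for impairment and a gain of at least 0.5) come from the text.

```python
IMPAIRMENT_CUTOFF = -1.0  # z below this value counts as impaired (cutoff stated in the text)
MIN_IMPROVEMENT = 0.5     # minimum z-score gain treated as potentially meaningful


def to_z(raw: float, norm_mean: float, norm_sd: float,
         higher_is_worse: bool = False) -> float:
    """Standardize a raw score against normative values.

    The sign is flipped for reverse-scored measures (completion times,
    error counts) so that negative z always reflects impairment.
    """
    z = (raw - norm_mean) / norm_sd
    return -z if higher_is_worse else z


def is_impaired(z: float) -> bool:
    """Apply the z < -1.0 impairment cutoff."""
    return z < IMPAIRMENT_CUTOFF


def is_normalized(z_baseline: float, z_endpoint: float) -> bool:
    """Two-part criterion: impaired at baseline and unimpaired at endpoint,
    with a gain of at least 0.5 z-score units."""
    crossed = is_impaired(z_baseline) and not is_impaired(z_endpoint)
    improved = (z_endpoint - z_baseline) >= MIN_IMPROVEMENT
    return crossed and improved


# Hypothetical normative values for a timed test (not taken from the study).
z_time = to_z(raw=60.0, norm_mean=30.0, norm_sd=10.0, higher_is_worse=True)
print(z_time, is_impaired(z_time))
print(is_normalized(-1.8, -0.9))  # crosses the cutoff with a large gain
print(is_normalized(-1.2, -0.9))  # crosses the cutoff, but the gain is too small
print(is_normalized(-2.5, -1.4))  # large gain, but still impaired at endpoint
```

Requiring both parts means that a patient who starts just below the cutoff and drifts just above it is not counted, and neither is a patient who improves substantially but remains in the impaired range.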
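The last comparison named above, a chi-square test of normalization rates across the two treatments, can be illustrated with a small sketch. The counts below are hypothetical and are not taken from Table 3, the function name is illustrative, and the use of the Pearson statistic without a continuity correction is an assumption, since the text does not say which variant was applied.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for one measure, among patients impaired at baseline:
# rows are treatments, columns are (normalized, not normalized).
stat, p = chi_square_2x2(10, 15, 8, 14)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

With cell counts as small as those in an extension sample, an exact test might be preferred in practice; the sketch only shows the arithmetic of the test the text names.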
Between-group differences in cognitive improvement.
The first analyses compared patients’ cognitive test scores across the subjects who did not enter the extension study and those who entered the study. T tests were used to compare scores on each of the cognitive variables at baseline and at the end of the acute treatment study across these two groups. None of the t tests were significant for either baseline (all t < 0.86, p > 0.05) or the end of the acute treatment period (all t < 1.24, p > 0.05), indicating that patients who entered the study were similar in their cognitive performance to those who declined to enter. Similar analyses were used to compare cognitive performance at baseline and the end of acute treatment across patients who entered the extension study and either did or did not contribute endpoint assessment data. Similar to the previous analyses, there were no significant differences between patients who provided endpoint data and those who did not (all t < 1.64, p > 0.05). Performance score results for all of the cognitive dependent measures are presented in Table 2. These are last-observation-carried-forward (LOCF) scores, but 34/39 ziprasidone patients and 30/33 olanzapine patients completed the full 6 months. These scores are presented as normatively adjusted scores (raw score data are available from the first author). The first analysis on change scores (baseline to endpoint, LOCF analysis) was performed with a Multivariate Analysis of Variance (MANOVA) on all cognitive variables compared across the two medication treatment conditions. The results of this analysis were not statistically significant (Wilks’ λ = 0.61; Pillai’s approximate F = 1.43, df = 10, 62, p = 0.31), reflecting no significant overall differences between the medications in the extent to which ziprasidone or olanzapine improved cognitive functioning.
Cognitive changes over time with treatment.
Scores for cognitive performance and the results of the t tests examining change from baseline within each treatment group are presented in Table 2. Effect sizes are presented as Cohen’s d. These effect sizes are computed on the basis of the normative adjustments, similar to the z-scores that are presented in the table. The repeated-measures ANOVAs examining time effects and the condition × time interaction yielded similar results across all of the variables. The time effects were all statistically significant (F > 3.47, df = 2, 69, p < 0.038), reflecting improvement in every variable from baseline to endpoint as well as the composite score, while no interactions of treatment × time of assessment were significant (F < 1.69, df = 2, 69, p > 0.19), reflecting no treatment-related differences in the rate of change over the 6-month treatment protocol.
Continued Change from end of the acute treatment study to the end of the extension study.
The results of the repeated-measures ANOVA on changes during each of the two treatment periods × drug treatment again found no significant treatment group × time interactions (F < 1.52, df = 1, 70, p > 0.20). Composite performance manifested a significant improvement effect from the end of the acute treatment study to the end of the extension study (F = 6.92, df = 1, 70, p = 0.014) and no significant group × time interaction (F = 1.49, df = 1, 70, p = 0.23). For the individual cognitive variables, changes from week 6 to endpoint were significant (df = 1, 70) for WCST categories (F = 6.43, p = 0.014) and total errors (F = 6.82, p = 0.012) and for total learning (F = 14.47, p < 0.001) and delayed recall (F = 6.85, p = 0.012) on the RAVLT.
Normalization of cognitive performance.
Table 3 presents each of the cognitive variables in terms of the proportion of cases who were unimpaired and impaired at baseline and the extent to which these variables changed with treatment. As can be seen in the table, a substantial number of the patients experienced a normalization of their cognitive performance according to the criteria described above. Based on the fact that 17% of normative cases would be expected to manifest "impaired" performance at baseline (using a cutoff of z < −1.0 as impaired), tests of the significance of the difference between proportions were performed for each variable in each of the two groups. These tests compared the observed level of impairment at baseline and at endpoint to the 17% base-rate prevalence of impairment expected in a normative sample. For both the olanzapine and ziprasidone patients at baseline, the proportion of cases who were impaired was significantly greater than the base-rate expectation on every variable (all z > 2.81, all p < 0.005). At endpoint, the proportion of impaired cases on all variables other than category fluency (z = 1.55, p = 0.17) and Trail Making Part B (z = 1.67, p = 0.10) was significantly greater than the 17% base-rate prevalence (z > 1.98, p < 0.05).
In these analyses, change scores for each of the cognitive variables described above were correlated with change scores for the safety variables and the PANSS positive subscale, PANSS negative subscale, and PANSS total scores. These noncognitive outcome measures were collected at the same time points as the cognitive assessments as noted above. There were statistically significant improvements for all of these clinical variables for patients receiving both study medications. All correlations were computed within each treatment group separately.
For the PANSS positive and negative symptoms, there were no statistically significant correlations between improvements in symptoms and cognitive improvements, with a total of 20 correlations computed per treatment. For PANSS total changes, similar results were found, with one of 20 correlations found to be statistically significant: change in Trail Making Test Part B and change in total PANSS scores for olanzapine patients (r = 0.36, p < 0.05). This correlation would not have survived Bonferroni correction. For the ESRS total score, AIMS total score, and BAS total scores, there were no significant correlations between baseline to endpoint changes and any of the cognitive change scores.
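The base-rate comparison described above can be sketched as a one-sample test of a proportion against the fixed 17% expectation. The text does not give the exact form of the test, so the formula below (standard error computed from the base rate) is an assumption, and the counts in the usage lines are hypothetical rather than values from Table 3.

```python
import math
from typing import Tuple

BASE_RATE = 0.17  # share of a normative sample expected below the cutoff, as stated in the text


def proportion_vs_base_rate(n_impaired: int, n_total: int,
                            base_rate: float = BASE_RATE) -> Tuple[float, float]:
    """One-sample z-test of an observed proportion against a fixed base rate.

    Returns (z, two-sided p-value).
    """
    if n_total <= 0 or not 0 <= n_impaired <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_impaired <= n_total")
    observed = n_impaired / n_total
    # Standard error under the null hypothesis that the true rate is base_rate.
    se = math.sqrt(base_rate * (1.0 - base_rate) / n_total)
    z = (observed - base_rate) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Hypothetical group of 39 patients on a single measure.
for label, n_impaired in (("baseline", 20), ("endpoint", 9)):
    z, p = proportion_vs_base_rate(n_impaired, 39)
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

In this reading, a nonsignificant result at endpoint means the observed impairment rate can no longer be distinguished from the rate expected in a normative sample, which is a weaker statement than showing the two rates are equal.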
The results of this 6-month double-blind randomized extension trial are consistent with previous reports of the long-term effects of atypical antipsychotic treatment, although this study has several methodological advantages relative to those studies. Consistent with previous studies of the acute effects of olanzapine, risperidone, quetiapine, and clozapine, long-term treatment is associated with substantial cognitive benefits. The size of these changes is relatively large compared to short-term treatment studies. For example, for several of the variables examined, the level of improvement in performance for the patient group during the 6-month study was close to double that seen in typical short-term studies. The magnitude of short-term change in the previously published 6-week study was quite consistent with that reported in other large-scale double-blind studies comparing atypical antipsychotic medications.5,6 While some of the changes in the present study seem quite large, it is important to note that the aggregate level of change across measures is quite consistent with the change scores reported by Velligan et al. (d = 1.0)9 in their open study of 6 months of treatment with quetiapine using a smaller assessment battery. In fact, if verbal fluency scores were not considered, the aggregate level of change would be over 1.0 SD for both treatment groups. Also consistent with many previous studies,5,6,10,11 there was little correlation between clinical improvement and cognitive changes, suggesting independent dimensions of change. When comparing the size of the cognitive change effects in the current study to previous research, it is important to consider the similarity of the cognitive assessments.
Most of the previous double-blind studies finding smaller effects of treatment used essentially identical assessments of processing speed (the Trail Making Test), verbal fluency, verbal learning, and executive functioning (see Keefe et al.1 and Harvey and Keefe2 for estimates of the effect sizes for previous changes). Thus, the greater effects in this study are not due to differences in instrumentation. For the majority of the cognitive variables for both treatment groups, it was found that the increases in the proportion of patients performing in the normal range at the end of the study were statistically significant. Further, using conservative criteria for normalization of performance, as many as 41% of the patients (depending on the measure) manifested a change in their performance that took them from outside the normal range of functioning to inside the normal range. When global cognitive functioning was examined, the overall change with treatment was quite substantial, especially when considering that verbal fluency performance was not markedly affected by either treatment. The number needed to treat (NNT) statistic for changing from impaired to unimpaired functioning across a variety of different cognitive measures ranges from a high of 18 (normalization of letter fluency with ziprasidone treatment) to a low of 2.5 (normalization of delayed recall with ziprasidone treatment). The average NNT across the different domains of cognitive impairment and the two treatments is approximately 4. This is a very low NNT by the typical standards of evidence-based medicine and compares very favorably with the NNT for treatment with clozapine to prevent a significant suicide attempt in schizophrenia (NNT = 13)26 and is also considerably better than the benefits of aspirin treatment to reduce cardiac events in individuals with a history of MI.
Although these data indicate that olanzapine and ziprasidone have relatively similar cognitive effects, this finding does not mean that these drugs would be expected to have equivalent benefits for all patients. For instance, patients switched from risperidone, olanzapine, or conventional antipsychotic medications because of lack of efficacy or side effects experienced improvement in several cognitive domains.27 Patients switched from olanzapine to ziprasidone in that study experienced improvements in variables associated with metabolic syndrome as well.28 These data suggest that medications that appear equivalent on a group-mean basis may not be the optimal treatment for every patient with schizophrenia and that changing medications may lead to changes in clinical, cognitive, and safety variables. There are some limitations of this design that require mention. First, only patients who experienced a clinical benefit from atypical antipsychotic treatment were entered into the extension, clearly influencing the sample of patients who were candidates for the study. This procedure may not be markedly different from clinical treatment algorithms, however, in that patients who fail to respond with clinically apparent reduction in their symptoms to 6 weeks of atypical antipsychotic treatment are often considered for a change in medications. An additional 20% of the patients did not provide endpoint cognitive data. Second, the sample size for the extension study is relatively smaller than that of the earlier acute treatment study. While this reduces the likelihood of finding significant between-groups treatment effects, the sample size was adequate to identify statistically significant changes in cognitive test performance from baseline across the two different treatment periods. Third, because there have been few successes in improving cognition in patients with schizophrenia, there are no consensus criteria for normalization of functioning.
Our criteria may not be the only way to define neuropsychological normalization. Finally, it is possible that these changes are at least partially due to retesting effects. This is an important issue in many clinical trials, because use of placebo conditions is problematic and the result is often a parallel study comparing two approved treatments. This issue does require careful consideration. First, Hawkins and Wexler22 found that practice effects on the CVLT were twice as large from the baseline to first reassessment as they were from the first to second reassessments, with those changes not significant on a two-tailed basis. Further, they found no practice effects on the Trail Making Test. Also, the patients in the Hawkins and Wexler study were involved in concurrent cognitive remediation programs and performed much better on the memory assessment at baseline than the patients in this study. Second, recent data have suggested that patients with schizophrenia have more modest practice effects than those expected in the healthy population. In fact, the authors29 have recently shown that older patients with schizophrenia treated with conventional antipsychotic treatments did not improve to a statistically significant extent on any test in a 21-test cognitive evaluation. Focusing on memory performance, the change from the baseline assessment to retest was not statistically significant, and an effect size of d = 0.08 was reported, which is very small compared to the effect size reported in the Hawkins and Wexler study.
Finally, a systematic study of practice effects on attentional performance indicated that patients with schizophrenia treated with low doses of conventional medications failed to manifest any appreciable improvement in performance with over 8000 practice trials.30 That said, increasing the ability of patients with schizophrenia to improve substantially in cognitive functioning with practice and exposure is a goal of many interventions for patients with schizophrenia.31 For instance, the goal of many cognitive remediation efforts is improvement in cognitive functioning, often as the result of extensive practice on cognitive measures.32 Meta-analyses of the results of these interventions have often suggested that these practice effects are not substantial.33 However, a recent study34 has indicated that extensive cognitive training may induce normalization of memory functioning for some proportion of patients with schizophrenia, although there were no data regarding the pharmacological status of the patients presented in that paper. Thus, pharmacological interventions that facilitate behavioral or cognitive practice-related learning are a desirable treatment goal. This study does not demonstrate that these treatment-related changes are associated with improvements in functional outcome. In fact, the Velligan et al. study found that quality of life indices improved more substantially with long-term atypical treatments than scores on a functional status rating scale. It should be noted that the rating scale used by Velligan et al., the Heinrichs-Carpenter Quality of Life Scale (QLS),35 has elements of both subjective QoL and objective indices of functional skills performance such as frequency of interpersonal interaction. At the same time, in order for these improvements in cognition to have functional significance, it must be demonstrated that the changes are related to objective indicators of improved community adjustment.
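The NNT figures quoted above can be related to normalization rates with a short calculation. The text does not state how the NNT values were derived; the sketch below assumes the simplest reading, in which NNT is the reciprocal of the proportion of patients normalized, with no comparison-arm rate subtracted. The function name and the optional `p_comparison` parameter are illustrative.

```python
def number_needed_to_treat(p_benefit: float, p_comparison: float = 0.0) -> float:
    """Reciprocal of the absolute difference in benefit rates.

    With p_comparison left at 0.0, this reduces to 1 / p_benefit, which is
    the assumed reading of the NNT values in the text.
    """
    if not (0.0 <= p_comparison <= 1.0 and 0.0 <= p_benefit <= 1.0):
        raise ValueError("rates must lie between 0 and 1")
    difference = p_benefit - p_comparison
    if difference <= 0.0:
        raise ValueError("benefit rate must exceed the comparison rate")
    return 1.0 / difference


def rate_implied_by_nnt(nnt: float) -> float:
    """Inverse mapping: the normalization rate that a given NNT would imply."""
    if nnt < 1.0:
        raise ValueError("NNT cannot be below 1")
    return 1.0 / nnt


print(round(number_needed_to_treat(0.41), 2))  # highest normalization rate cited in the text
print(round(rate_implied_by_nnt(2.5), 3))      # rate implied by the lowest NNT cited
print(round(rate_implied_by_nnt(18.0), 3))     # rate implied by the highest NNT cited
```

Under this reading, the lowest NNT of 2.5 corresponds to roughly 40% of patients normalizing, in line with the 41% upper figure reported. Because no untreated or placebo rate is subtracted, any improvement attributable to retesting alone would make these values look more favorable than a controlled NNT.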
Consistent with earlier reports from open-label and smaller-scale double-blind studies, long-term treatment with atypical antipsychotic medications is associated with sustained improvement over time and relatively larger improvements in cognitive functioning than those detected in short-term studies. In terms of the clinical relevance of these findings, the majority of patients had cognitive performance within the normal range of functioning across all measures at the endpoint, and this finding was due to "normalization" of cognitive functioning on the part of a significant subset of those with cognitive impairments at baseline. These results must be viewed as preliminary because of our inability to discriminate practice effects from the direct effects of treatment. Subsequent research will be required in order to substantiate the functional importance of these enhancements in cognitive functioning, as well as to determine predictors of improvement in cognitive functioning.
This study was supported by Pfizer, Inc. Antony Loebel, M.D., is a full-time employee of Pfizer, Inc. Dr. Harvey has served as a consultant for and received a research grant from Pfizer, Inc.
TABLE 1. Descriptive Characteristics of the Sample
TABLE 2. Cognitive Performance From Baseline to 6-Month Endpoint on Normatively Derived Scores
TABLE 3. Neuropsychological Impairment and Normalization With Treatment