DSM-5: Finding a Middle Ground
This year's American Psychiatric Association (APA) annual meeting was probably the last before the publication of the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5), scheduled for May of next year. Hence, there was a sense of tense uncertainty in the many sessions addressing potential DSM-5 revisions.
DSM-5 Task Force Vice Chair Darrel Regier headed a symposium reviewing results of field trials on the reliability of proposed DSM-5 criteria. The trials were meant to assess whether clinicians can use the proposed criteria consistently and provided kappa values for the individual proposals.
Kappa values reflect the agreement between 2 different raters on the same rating, after correction for chance agreement. From a statistical perspective, kappa values greater than 0.5 are generally considered good. As an example, 70% agreement between raters translates to a kappa value of 0.4 when agreement by chance alone would be 50%.
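For readers who want to see the arithmetic, here is a minimal sketch of the standard Cohen's kappa calculation, kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function names and the 10-patient ratings below are hypothetical, invented for illustration only; they are not data from the DSM-5 field trials, and the field trials may have used a different kappa estimator.

```python
from collections import Counter


def cohen_kappa(observed_agreement, chance_agreement):
    """Cohen's kappa from two proportions: observed and chance agreement."""
    if not 0.0 <= chance_agreement < 1.0:
        raise ValueError("chance_agreement must be in [0, 1)")
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


def kappa_from_ratings(rater_a, rater_b):
    """Cohen's kappa for two raters who each rated the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty patient list")
    n = len(rater_a)
    # Observed agreement: share of patients given the same rating by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each category, the product of the two raters'
    # marginal rates of using it, summed over all categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return cohen_kappa(observed, chance)


# Hypothetical ratings: 1 = diagnosis present, 0 = diagnosis absent.
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]

# The raters agree on 7 of 10 patients; chance agreement works out to 50%.
print(round(cohen_kappa(0.7, 0.5), 2))
print(round(kappa_from_ratings(rater_a, rater_b), 2))
```

The same 70% raw agreement yields a different kappa if chance agreement differs, which is why a kappa value cannot be converted back to a percentage without knowing how often each rater makes the diagnosis.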
Results of the field trials showed good agreement for such disorders as major neurocognitive disorder, autism spectrum disorders, and post-traumatic stress disorder, with kappa values of 0.78, 0.69, and 0.67, respectively. However, poor kappa values, in the range of 0.20-0.40, were reported for commonly diagnosed conditions, such as generalized anxiety disorder and major depressive disorder. All of the observed kappa values in the DSM-5 field trials translate to agreement between clinicians of around 50%.
Is this good or bad? A recent editorial by DSM-5 leaders makes comparisons with other medical settings, and the claim is that most medical diagnoses involve diagnostic kappa values similar to those in the DSM-5 field trials. I spoke with prominent psychiatrists at this year's meeting who were involved in some of these DSM studies and discussions; they expressed unhappiness with the kappa values in DSM-5 field trials, and some pointed out that kappa values in the DSM-III were higher.
So, the reliability of DSM-5 criteria seems to have declined compared to DSM-III. Is this a problem? It might be, but it might not be.
Reliability only means that we agree. It doesn't mean that we agree on what is right. Validity is a separate issue. It could be that criteria are changed so that they are more valid -- that is, actually true -- but this could reduce reliability; raters might have to use, for instance, some criteria that are less objective and hence less replicable.
We will see. DSM-5 might be more valid but less reliable than DSM-IV and DSM-III. If so, that's progress, in a way.
It is also important to keep in mind that diagnoses elsewhere in medicine have low reliability too. We should be careful about criticizing certain diagnoses, such as bipolar disorder (as some have), without an awareness that low reliability is the case for almost all our diagnoses. The problem of reliability is a general one, not a problem about claimed "overdiagnosis" of some conditions.
In my view, it is definitely time for a new edition of DSM; we can't pretend that something written almost 2 decades ago is anywhere near up to date, with a generation of new research. Some of the proposed changes in DSM-5 -- for example, the inclusion of antidepressant-induced mania as part of bipolar disorder; the inclusion of dimensions for axis II personality conditions; and the removal of nosologically nonspecific axis II diagnoses, such as "histrionic" personality -- are consistent with an update based on convincing new research. But other changes, such as the wish to discourage the diagnosis of childhood bipolar disorder by making up a new category based on limited data (temper dysregulation disorder), merely repeat the mistakes of DSM-IV. Making up diagnoses because we don't like others is not a scientifically sound way to revise a profession's diagnostic system, and it won't serve us well for the next 20 years.