Agreement among raters is of great importance in many domains. For example, in medicine, diagnoses are often provided by more than one doctor to make sure the proposed treatment is optimal. In criminal trials, sentencing depends, among other things, on complete agreement among the jurors. In observational studies, researchers increase reliability by examining discrepant ratings. This book is intended to help researchers statistically examine rater agreement by reviewing four different approaches to its analysis.
The first approach introduces readers to calculating coefficients that summarize agreement in a single score. The second approach involves estimating log-linear models that allow one to test specific hypotheses about the structure of a cross-classification of two or more raters' judgments. The third approach explores cross-classifications of raters' judgments for indicators of agreement or disagreement, and for indicators of such characteristics as trends. The fourth approach compares the correlation or covariation structures of the variables that raters use to describe objects, behaviors, or individuals; these structures can be compared for two or more raters. All of these methods operate at the level of observed variables.
This book is intended as a reference for researchers and practitioners who describe and evaluate objects and behavior in a number of fields, including the social and behavioral sciences, statistics, medicine, business, and education. It also serves as a useful text for graduate-level methods or assessment classes found in departments of psychology, education, epidemiology, biostatistics, public health, communication, advertising and marketing, and sociology. Exposure to regression analysis and log-linear modeling is helpful.