CHAPTER 11

AN INTRODUCTION TO CONTENT ANALYSIS

Throughout the preceding chapters, techniques and strategies for collecting and organizing data have been discussed. With the partial exception of Chapters 4, 6, and perhaps 7, where limited analytic procedures are mentioned, analysis of data has not yet been extensively discussed. In this chapter the task of analysis is considered at length. Interviews, field notes, and various types of unobtrusive data are often not amenable to analysis until the information they convey has been condensed and made systematically comparable. An objective coding scheme must be applied to the notes or data. This process is commonly called content analysis. The instructions in this chapter are intended to assist novice researchers in their attempt to learn the methodological techniques of standard content analysis. First, analysis approaches in qualitative research are briefly outlined. Following this, some general concerns and debates regarding content analysis are presented. Then, a number of procedures for conducting content analysis are discussed. These include consideration of what to count and what to analyze, the nature of levels and units of analysis, and how to effectively employ coding frames. In the next section, the strengths and weaknesses of content analysis as a research technique are discussed, and analytic induction is examined in relation to content analysis procedures. Finally, this chapter will address word crunching, the use of computers in qualitative research.

ANALYSIS OF QUALITATIVE DATA

There are a number of procedures used by qualitative researchers to analyze their data. Miles and Huberman (1994) identify three major approaches to qualitative data analysis: interpretative approaches, social anthropological approaches, and collaborative social research approaches.

Interpretative Approaches

This orientation allows researchers to treat social action and human activity as text. In other words, human action can be seen as a collection of symbols expressing layers of meaning. Interviews and observational data, then, can be transcribed into written text for analysis. How one interprets such a text depends in part on the theoretical orientation taken by the researcher. Thus, a researcher with a phenomenological bent will resist condensing data or framing data by various sorting or coding operations. A phenomenologically oriented researcher might, instead, attempt to uncover or capture the telos (essence) of an account. This approach provides a means for discovering the practical understandings of meanings and actions. Researchers with a more general interpretative orientation (dramaturgists, symbolic interactionists, etc.) are likely to organize or reduce data in order to uncover patterns of human activity, action, and meaning.

Social Anthropological Approaches

Researchers following this orientation often have conducted various sorts of field or case study activities to gather data. In order to accomplish data collection, they have necessarily spent considerable time in a given community, or with a given assortment of individuals in the field. They have participated, directly or indirectly, with many of the individuals residing in or interacting with the study population. This provides the researcher with a special perspective on the material collected during the research, as well as a special understanding of the participants and how these individuals interpret their social worlds.
Analysis of this sort of data can be accomplished by setting information down in field notes and then applying the interpretative style of treating this information as text. Frequently, however, this analytic process requires the analysis of multiple sources of data such as diaries, observations, interviews, photographs, and artifacts. Determining what material to include or exclude, how to order the presentation of substantiating materials, and what to report first or last are analytic choices the researcher must make. Researchers employing the social anthropological approach usually are interested in the behavioral regularities of everyday life: language and language use, rituals and ceremonies, and relationships. The analytic task, then, is to identify and explain the ways people operate in a particular setting: how they come to understand things, account for and take action on them, and generally manage their day-to-day lives. Many researchers using this approach begin with a conceptual or theoretical frame and then move into the field in order to test or refine this conceptualization.

Collaborative Social Research Approaches

Researchers operating in this research mode work with their subjects in a given setting in order to accomplish some sort of change or action (see Chapter 7 on action research). The analysis of data gathered in such collaborative studies is accomplished with the participation of the subjects, who are seen by the researcher as stakeholders in the situation in need of change or action. Data are collected and then reflexively considered both as feedback to craft action and as information to understand a situation, resolve a problem, or satisfy some sort of field experiment. The actual analytic strategies applied in this effort may be similar to those of the interpretative and social anthropological approaches.

Given these diverse yet overlapping approaches, you can see certain facets of research that recur during any style of qualitative analysis. Below is a fairly standard set of analytic activities arranged in a general order of sequence:

• Data are collected and made into text (e.g., field notes, transcripts, etc.).
• Codes are analytically developed or inductively identified in the data and affixed to sets of notes or transcript pages.
• Codes are transformed into categorical labels or themes.
• Materials are sorted by these categories, identifying similar phrases, patterns, relationships, and commonalities or disparities.
• Sorted materials are examined to isolate meaningful patterns and processes.
• Identified patterns are considered in light of previous research and theories, and a small set of generalizations is established.

During the remainder of this chapter, these features will be discussed and considered in relation to content analysis. In the next section, I will consider the nature of content analysis as a technique.

CONTENT ANALYSIS AS A TECHNIQUE

In content analysis, researchers examine artifacts of social communication. Typically, these are written documents or transcriptions of recorded verbal communications. Broadly defined, however, content analysis is "any technique for making inferences by systematically and objectively identifying special characteristics of messages" (Holsti, 1968, p. 608). From this perspective, photographs, videotape, or any item that can be made into text is amenable to content analysis.
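To make the idea of systematically and objectively identifying message characteristics concrete, consider the following minimal sketch in Python. The category labels, keyword rules, and sample messages here are hypothetical illustrations invented for this example, not part of any published coding scheme; the point is simply that when identification rules are stated explicitly, different coders applying them to the same messages should obtain the same results.

    import re

    # Hypothetical selection rules: each category is defined by an explicit,
    # written-down rule (here, a keyword pattern) rather than by the coder's
    # private judgment, so the coding is reproducible by others.
    RULES = {
        "pro_change": re.compile(r"\b(support|favor|welcome)\b", re.IGNORECASE),
        "anti_change": re.compile(r"\b(oppose|reject|resist)\b", re.IGNORECASE),
    }

    def code_message(message: str) -> list[str]:
        """Return every category whose rule matches the message."""
        return [label for label, rule in RULES.items() if rule.search(message)]

    # Invented sample messages (e.g., sentences from newspaper articles).
    messages = [
        "Local officials support the new seat-belt law.",
        "Several drivers say they will resist the change.",
    ]

    for message in messages:
        print(code_message(message), "<-", message)

Real criteria would, of course, be far richer than two keyword patterns, and they would need to be exhaustive enough to account for every variation of message content, a requirement taken up in the next paragraph.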
In this chapter, objective analysis of messages conveyed in the data being analyzed is accomplished by means of explicit rules called criteria of selection, which must be formally established before the actual analysis of data. The criteria of selection used in any given content analysis must be sufficiently exhaustive to account for each variation of message content and must be rigidly and consistently applied so that other researchers or readers, looking at the same messages, would obtain the same or comparable results. This may be considered a kind of reliability of the measures, and a validation of eventual findings (Selltiz et al., 1967). The categories that emerge in the course of developing these criteria should reflect all relevant aspects of the messages and retain, as much as possible, the exact wording used in the statements. They should not be merely arbitrary or superficial applications of irrelevant categories. Holsti (1968, p. 598) explains this type of content analysis procedure: "The inclusion or exclusion of content is done according to consistently applied criteria of selection; this requirement eliminates analysis in which only material supporting the investigator's hypotheses are examined."

CONTENT ANALYSIS: QUANTITATIVE OR QUALITATIVE?

One of the leading debates among users of content analysis is whether analysis should be quantitative or qualitative. Berelson (1952), for example, suggests that content analysis is "objective, systematic, and quantitative." Similarly, Silverman (1993, p. 59) dismisses content analysis from his discussion of qualitative data analysis "because it is a quantitative method." Selltiz et al. (1959, p. 336), however, state that concerns over quantification in content analysis tend to emphasize "the procedures of analysis" rather than the "character of the data available." Selltiz et al. also suggest that heavily quantitative content analysis results in a somewhat arbitrary limitation of the field by excluding all accounts of communications that are not in the form of numbers, as well as those that may lose meaning if reduced to numeric form (definitions, symbols, detailed explanations, photographs, and so forth). Other proponents of content analysis, notably Smith (1975), suggest that some blend of both quantitative and qualitative analysis should be used. Smith (1975, p. 218) explains that he has taken this position "because qualitative analysis deals with the forms and antecedent-consequent patterns of form, while quantitative analysis deals with duration and frequency of form." Abrahamson (1983, p. 286) suggests that "content analysis can be fruitfully employed to examine virtually any type of communication." As a consequence, content analysis may focus on either quantitative or qualitative aspects of communication messages.

Some authors of methods books have written about the procedure of narrative analysis as distinguishable from the procedure of content analysis (see, for example, Silverman, 1993; Manning & Cullum-Swan, 1994). In narrative analysis, the investigator typically begins with a set of principles and seeks to exhaust the meaning of the text using specified rules and principles, but maintains a qualitative textual approach (Boje, 1991; Heise, 1992; Manning & Cullum-Swan, 1994; Silverman, 1993). In contrast to this allegedly more textual approach, content analysis is suggested to be limited to counts of textual elements.
Thus, the implication is that content analysis is more reductionistic and ostensibly a more positivistic approach. I argue here that content analysis can be effective in qualitative analysis—that "counts" of textual elements merely provide a means for identifying, organizing, indexing, and retrieving data. Analysis of the data, once they are organized according to certain content elements, should involve consideration of the literal words in the text being analyzed, including the manner in which these words have been offered. In this way, content analysis provides a method for obtaining good access to the words of the text or the transcribed accounts offered by subjects (Glassner & Loughlin, 1987). This offers, in turn, an opportunity for the investigator to learn about how subjects or the authors of textual materials view their social worlds. From this perspective, content analysis is not a reductionistic, positivistic approach. Rather, it is a passport to listening to the words of the text and understanding better the perspective(s) of the producer of these words.

This chapter strives for a blend of qualitative and quantitative analysis: the descriptions of quantitative analysis show how researchers can create a series of tally sheets to determine specific frequencies of relevant categories. The references to qualitative analysis show how researchers can examine ideological mind-sets, themes, topics, symbols, and similar phenomena, while grounding such examinations in the data.

Manifest versus Latent Content Analysis

Another controversy concerning the use of content analysis is whether the analysis should be limited to manifest content (those elements that are physically present and countable) or extended to more latent content. In the latter case, the analysis is extended to an interpretive reading of the symbolism underlying the physical data. For example, an entire speech may be assessed for how radical it was, or a novel could be considered in terms of how violent the entire text was. Stated in different words, manifest content is comparable to the surface structure present in the message, and latent content is the deep structural meaning conveyed by the message. Holsti (1969, p. 598) has tried to resolve this debate: "It is true that only the manifest attributes of text may be coded, but this limitation is already implied by the requirement of objectivity. Inferences about latent meanings of messages are therefore permitted but... they require corroboration by independent evidence." One reasonable interpretation of this passage, and of a similar statement made by Berelson (1952, p. 488ff), suggests that although there are some dangers in inferring directly from latent symbolism, it is nonetheless possible to use it (see also Merton, 1968, pp. 366-370, on the use of content analysis in examining propaganda). To accomplish this sort of "deciphering" (Heilman, 1976) of latent symbolic meaning, researchers must first incorporate independent corroborative techniques (for example, agreement between independent coders concerning latent content, or some noncontent analytic source). Finally, and especially when latent symbolism may be discussed, researchers should offer detailed excerpts from relevant statements (messages) that serve to document the researchers' interpretations. A safe rule of thumb to follow is the inclusion of at least three independent examples for each interpretation.
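Agreement between independent coders can itself be quantified. The short Python sketch below, using invented judgments for two hypothetical coders, computes simple percent agreement along with Cohen's kappa, a common chance-corrected index. Neither the data nor the choice of index comes from this chapter, so treat it as one illustrative way of documenting corroboration, not as a prescribed procedure.

    from collections import Counter

    # Invented latent-content judgments by two independent coders: one label
    # per document (say, whether each speech reads as "radical" or "moderate").
    coder_a = ["radical", "moderate", "radical", "moderate", "radical", "moderate"]
    coder_b = ["radical", "moderate", "moderate", "moderate", "radical", "moderate"]

    n = len(coder_a)

    # Simple percent agreement: the share of documents coded identically.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Cohen's kappa corrects for the agreement expected by chance alone,
    # based on how often each coder uses each label.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
    # With these invented data: observed agreement = 0.83, kappa = 0.67

The kappa value is deliberately reported alongside raw agreement because two coders who both use one label heavily will agree often by chance alone; the corrected index makes that inflation visible.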
Blending Manifest and Latent Content Analysis Strategies

Perhaps the best resolution of this dilemma about whether to use manifest or latent content is to use both whenever possible. In this case, a given unit of content would receive the same attention from both methods—to the extent that the coding procedures (discussed presently) for both the manifest and latent content are reasonably valid and reliable (Babbie, 1998). By reporting the frequency with which a given concept appears in text, researchers suggest the magnitude of this observation. Researchers' arguments are more convincing when they can demonstrate the appearance of a claimed observation in some large proportion of the material under study (e.g., 20 percent, 30 percent, 40 percent, and so on). Researchers must bear in mind, however, that these descriptive statistics—namely, proportions and frequency distributions—do not necessarily reflect the nature of the data or variables. If the theme "positive attitude toward shoplifting" appears 50 times in one subject's interview transcript and 25 times in another subject's, this would not be justification for the researchers to claim that the first subject is twice as likely to shoplift as the second subject. In short, researchers must be cautious not to take or claim magnitudes as findings in themselves. The magnitude of certain observations is presented to demonstrate more fully the overall analysis.

COMMUNICATION COMPONENTS

According to Holsti (1969) and Carney (1972), communications have three major components: the message, the sender, and the audience. The message may be analyzed in terms of explicit themes, relative emphasis on various topics, amount of space or time devoted to certain topics, and numerous other dimensions. Occasionally, messages are analyzed for information about the sender of the communication. According to Chadwick et al. (1984), the linkages between the message content and attributes of the sender are often slight. Nonetheless, some characteristics of the sender may be discernible, especially if numerous examples are available, audible (recorded) messages are examined, or verbatim transcriptions from recordings are used (including literal representations of pauses, mispronounced words, grammatical errors, slang, and other language styles).

Strauss (1987, p. 33) similarly differentiates between what he calls in vivo codes and sociological constructs. In vivo codes are the literal terms used by the individuals under investigation, the terms used by the various actors themselves. "In vivo codes tend to be the behaviors or processes which will explain to the analyst how the basic problem of the actors is resolved or processed" (Strauss, 1987, p. 33). In contrast, sociological constructs are formulated by the analyst. Terms and categories such as professional attitude, family oriented, obsessive workaholic, and educationally minded might represent examples of sociological constructs. These constructs, of course, need not derive exclusively from sociology and may come from fields such as education, nursing, and psychology. Strauss (1987, p. 34) explains that these constructs "are based on a combination of the researcher's scholarly knowledge and knowledge of the substantive field under study." The result of using constructs is the addition of certain social scientific meanings that might otherwise be missed in the analysis.
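The relationship between in vivo codes and sociological constructs can be pictured as a many-to-one mapping. In the hypothetical Python sketch below, the construct labels are borrowed from Strauss's examples above, but the respondent phrases grouped under them are invented purely for illustration.

    from typing import Optional

    # Hypothetical grouping of in vivo codes (respondents' literal phrases)
    # under analyst-formulated sociological constructs. The construct names
    # echo Strauss's examples; the phrases themselves are invented.
    CONSTRUCTS = {
        "family oriented": {
            "my kids come first",
            "we always eat dinner together",
        },
        "obsessive workaholic": {
            "I practically live at the office",
            "weekends are for catching up on work",
        },
    }

    def construct_for(in_vivo_code: str) -> Optional[str]:
        """Return the sociological construct an in vivo code is grouped under."""
        for construct, phrases in CONSTRUCTS.items():
            if in_vivo_code in phrases:
                return construct
        return None  # an unclassified code may signal a needed new construct

    print(construct_for("I practically live at the office"))  # obsessive workaholic
    print(construct_for("my kids come first"))                # family oriented

An in vivo code that fits no existing construct is itself informative: it may point to a category the analyst has not yet formulated.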
Thus, sociological constructs add breadth and depth to observations by reaching beyond local meanings to broader social scientific ones. Researchers may additionally use content analysis to assess a message's effects on the audience. The Pornography and Television Violence Commissions, for example, tried to assess the impact of sexual and violent material in television and film on those who watched such entertainment (Commission on Obscenity and Pornography, 1970; Comstock & Rubinstein, 1972). However, making accurate inferences about either the characteristics of the sender or the effects of the message on the audience is often tenuous at best.

WHAT TO COUNT: LEVELS AND UNITS OF ANALYSIS

When using a content analysis strategy to assess written documents, researchers must first decide at what level they plan to sample and what units of analysis will be counted. Sampling may occur at any or all of the following levels: words, phrases, sentences, paragraphs, sections, chapters, books, writers, ideological stance, subject topic, or similar elements relevant to the context. When examining other forms of messages, researchers may use any of the preceding levels or may sample at other conceptual levels more appropriate to the specific message. For example, when examining television programs for violent content, researchers might use segments between commercials as the level of analysis, or they might choose to use the entire television program (excluding commercials) as the level (see, for example, Fields, 1988).

CATEGORY DEVELOPMENT: BUILDING GROUNDED THEORY

Strauss (1987) describes the considerable misconception surrounding the development of grounded theory. The term misconception, as Strauss (1987, p. 55) points out, seems more appropriate than criticism. Misconception implies an inaccurate reading of material pertaining to building grounded theory, whereas criticism connotes more of a challenge to or detraction from the benefits of this process. Central to the misconception are the notions that grounded theory is an entirely inductive process, that it does not verify findings, and that it somehow molds the data to the theory rather than the reverse. Strauss (1987, p. 55), in a lengthy note, singles out Miles and Huberman (1983) as illustrating several instrumental misconceptions (brackets in the original text contain Strauss's responses):

In Miles and Huberman (1983, p. 57) there is also a misunderstanding about grounded theory technology. The material in my book, written before their publication appeared, runs directly counter to some of their remarks, that: the grounded theory approach has a lot going for it. Data get well molded to the codes that represent them, and we get more of a code-in-use flavor than the generic code-for-many-uses generated by prefabricated start lists.... The tradeoff here is that earlier segments may have different codes than later ones. [They may, in part, of course.] Or to avoid this everything may have to be recoded once a more empirically sculpted scheme emerges. [No.] This means more overall coding time, and longer uncertainty about the coherence of the coding frame. [Probably, but deliberate, in part].

In addition, Miles and Huberman (1983, pp. 63-64) promote the worrisome notion that coding is not an enjoyable task, which suggests that other aspects of the research enterprise are more fun. This text, as well as Strauss (1987), strongly disagrees.
Coding and other fundamental procedures associated with grounded theory development are certainly hard work and must be taken seriously, but just as many people enjoy finishing a complicated jigsaw puzzle, many researchers find great satisfaction in coding and analysis. As researchers move through the coding process and begin to see the puzzle pieces come together to form a more complete picture, the process can be downright thrilling. Time consuming, tiring, and even laborious as the process is, it is seldom boring!

The categories researchers use in a content analysis can be determined inductively, deductively, or by some combination of both (Strauss, 1987). Abrahamson (1983, p. 286) indicates that an inductive approach begins with the researchers "immersing" themselves in the documents (that is, the various messages) in order to identify the dimensions or themes that seem meaningful to the producers of each message. In a deductive approach, researchers use some categorical scheme suggested by a theoretical perspective, and the documents provide a means for assessing the hypothesis. In many circumstances, the relationship between a theoretical perspective and certain messages involves both inductive and deductive approaches. However, in order to present the perceptions of others (the producers of messages) in the most forthright manner, a greater reliance on induction is necessary. Nevertheless, as will be shown, induction should not be undertaken to the exclusion of deduction.

The development of inductive categories allows researchers to link, or ground, these categories to the data from which they derive. Certainly it is reasonable to suggest that insights and general questions about research derive from previous experience with the study phenomena. This may represent personal experience, scholarly experience (having read about it), or previous research undertaken to examine the matter. Researchers, similarly, draw on these experiences in order to propose tentative comparisons that assist in creating various deductions. Experience thus underpins both inductive and deductive reasoning. From this interplay of experience, induction, and deduction, Glaser and Strauss formulate their description of grounded theory. According to Glaser and Strauss (1967, pp. 2-3):

To generate theory... we suggest as the best approach an initial, systematic discovery of the theory from the data of social research. Then one can be relatively sure that the theory will fit and work. And since categories are discovered by examination of the data, laymen involved in the area to which the theory applies will usually be able to understand it, while sociologists who work in other areas will recognize an understandable theory linked with the data of a given area.

What to Count

Seven major elements in written messages can be counted in content analysis: words or terms, themes, characters, paragraphs, items, concepts, and semantics (Berelson, 1952; Berg, 1983; Merton, 1968; Selltiz et al., 1959).

Words. The word is the smallest element or unit used in content analysis. Its use generally results in a frequency distribution of specified words or terms.

Themes. The theme is a more useful unit to count. In its simplest form, a theme is a simple sentence, a string of words with a subject and a predicate. Because themes may be located in a variety of places in most written documents, it becomes necessary to specify (in advance) which places will be searched.
For example, researchers might use only the primary theme in a given paragraph location or, alternatively, might count every theme in a given text under analysis.

Characters. In some studies, characters (persons) are significant to the analysis. In such cases, you count the number of times a specific person or persons are mentioned rather than the number of words or themes.

Paragraphs. The paragraph is infrequently used as the basic unit in content analysis, chiefly because of the difficulties that result from attempting to code and classify the various, and often numerous, thoughts stated and implied in a single paragraph.

Items. An item represents the whole unit of the sender's message—that is, an item may be an entire book, a letter, speech, diary, newspaper, or even an in-depth interview.

Concepts. The use of concepts as units to count is a more sophisticated type of word counting than previously mentioned. Concepts involve words grouped together into conceptual clusters (ideas) that constitute, in some instances, variables in a typical research hypothesis (Sanders & Pinhey, 1959, p. 191). For instance, a conceptual cluster may form around the idea of deviance. Words such as crime, delinquency, kiting, and fraud might cluster around the conceptual idea of deviance (Babbie, 1998). To some extent, the use of a concept as the unit of analysis leads toward more latent than manifest content.

Semantics. In the type of content analysis known as semantics, researchers are interested not only in the number and type of words used but also in how affectively loaded the word(s) may be—in other words, how strong or weak a word (or words) may be in relation to the overall sentiment of the sentence (Sanders & Pinhey, 1959).

Combinations of Elements

In many instances, research requires the use of a combination of several content analytic elements. For example, in my study (Berg, 1983) to identify subjective definitions for Jewish affiliational categories (Orthodox, Conservative, Reform, and Nonpracticing), I used a combination of both item and paragraph elements as a content unit. In order to accomplish a content analysis of these definitions (as items), I lifted every respondent's definitions of each affiliational category verbatim from an interview transcript. Each set of definitions was additionally annotated with the number of the transcript from which it had been taken. Next, each definition (as an item) was separated into its component definitional paragraph for each affiliational category. An example of this definitional paragraphing is shown below (Berg, 1983, p. 76):

INTERVIEW #60: ORTHODOX

Well, I guess, Orthodox keep kosher in [the] home and away from home. Observe the Sabbath, and, you know . . . , actually if somebody did [those] and considered themselves an Orthodox Jew, to me that would be enough. I would say that they were Orthodox.

INTERVIEW #60: CONSERVATIVE

Conservative, I guess, is the fellow who doesn't want to say he's Reform because it's objectionable to him. But he's a long way from being Orthodox.

INTERVIEW #60: REFORM

Reform is just somebody that, they say they are Jewish because they don't want to lose their identity. But actually I want to be considered a Reform, 'cause I say I'm Jewish, but I wouldn't want to be associated as a Jew if I didn't actually observe any of the laws.
INTERVIEW #60: NONPRACTICING

Well, a Nonpracticing is the guy who would have no temple affiliation, no affiliation with being Jewish at all, except that he considers himself a Jew. I guess he practices in no way, except to himself.

Units and Categories

Content analysis involves the interaction of two processes: specification of the content characteristics (basic content elements) being examined and application of explicit rules for identifying and recording these characteristics. The categories into which you code content items vary according to the nature of the research and the particularities of the data (that is, whether they are detailed responses to open-ended questions, newspaper columns, letters, television transcripts, and so on). As with all research methods, conceptualization and operationalization necessarily involve an interaction between theoretical concerns and empirical observations. For instance, if researchers wanted to examine newspaper orientations toward changes in a state's seat-belt law (as a potential barometer of public opinion), they might read newspaper articles and/or editorials. As they read each article, the researchers could ask themselves which ones were in favor of and which ones were opposed to changes in the law. Were the articles' positions more clearly indicated by their manifest content or by some undertone? Was the decision to label one article pro or con based on the use of certain terms, on the presentation of specific study findings, or on statements offered by particular characters (for example, celebrities, political figures, and so on)? The answers to these questions allow the researchers to develop inductive categories in which to slot various units of content.

As previously mentioned, researchers need not limit their procedures to induction alone. Both inductive and deductive reasoning may provide fruitful findings. If, for example, investigators are attempting to test hypothetical propositions, their theoretical orientation should suggest empirical indicators of concepts (deductive reasoning). If they have begun with specific empirical observations, they should attempt to develop explanations grounded in the data (grounded theory) and apply these theories to other empirical observations (inductive reasoning).

There are no easy ways to describe specific tactics for developing categories or to suggest how to go about defining (operationalizing) these tactics. To paraphrase Schatzman and Strauss's (1973, p. 12) remark about methodological choices in general, the categorizing tactics worked out—some in advance, some developed later—should be consistent not only with the questions asked and the methodological requirements of science but also with the properties of the phenomena under investigation. Stated succinctly, categories must be grounded in the data from which they emerge (Denzin, 1978; Glaser & Strauss, 1967). The development of categories in any content analysis must derive from inductive inference (to be discussed in detail later) concerning patterns that emerge from the data. For example, in a study evaluating the effectiveness of a Florida-based delinquency diversion program, I (Berg, 1986) identified several thematic categories from information provided on intake sheets.
By setting up a tally sheet, I managed to use the criminal offenses declared by arresting officers in their general statements to identify two distinct classes of crime, in spite of the arresting officers' use of similar-sounding terms. In one class of crime, several similar terms were used to describe what amounted to the same type of crime. In a second class of crime, officers more consistently referred to the same type of crime by a consistent term. Specifically, I found that the words shoplifting, petty theft, and retail theft each referred to essentially the same category of crime involving the stealing of some type of store merchandise, usually not exceeding $3.50 in value. Somewhat surprisingly, the semantically similar term petty larceny was used to describe the taking of cash, whether it was from a retail establishment, a domicile, or an auto. Thus, the data indicated a subtle perceptual distinction made by the officers reporting juvenile crimes.

Recently, Dabney (1993) examined how practicing nurses perceived other nurses who worked while impaired by alcohol or drugs. He developed several thematic categories based on previous studies found in the literature. He was also able to inductively identify several classes of drug diversion described by subjects during the course of interviews. For instance, many subjects referred to stockpiled drugs that nurses commonly used for themselves. These drugs included an assortment of painkillers and mild sedatives stored in a box, a drawer, or some similar container on the unit or floor. These stockpiled drugs accumulated when patients died or were transferred to another hospital unit and this information did not immediately reach the hospital pharmacy.

Classes and Categories

Three major types of classes are used to identify and develop categories in a standard content analysis and to discuss the findings of research that uses content analysis: common classes, special classes, and theoretical classes.

Common Classes. The first are the common classes of a culture in general. These classes are used by virtually anyone in society to distinguish between and among persons, things, and events (for example, age, gender, mother, father, teacher, and so on). These common classes, as categories, provide lay people with a means of designation in the course of everyday thinking and communicating, and a way to engender meaning in their social interactions (see Duncan, 1962; Schatzman & Strauss, 1973; Strauss, 1959). These common classes are essential in assessing whether certain demographic characteristics are related to patterns that may arise during a given data analysis.

Special Classes. Special classes are those labels used by members of certain areas (communities) to distinguish among the things, persons, and events within their limited province (Schatzman & Strauss, 1973). These special classes can be likened to the jargonized terms used commonly in certain professions but not by lay people. Alternatively, these special classes may be described as out-group versus in-group classifications. In the case of the out-group, the reference is to labels conventionally used by the greater (host) community or society; in the case of the in-group, the reference is to conventional terms and labels used among some specified group, or terms that may emerge as theoretical classes.

Theoretical Classes. The theoretical classes are those that emerge in the course of analyzing the data (Schatzman & Strauss, 1973).
In most content analyses, these theoretical classes provide an overarching pattern (a key linkage) that occurs throughout the analysis. The nomenclature that identifies these theoretical classes generally borrows from that used in special classes and, together with analytically constructed labels, accounts for novelty and innovations. According to Schatzman and Strauss (1973), these theoretical classes are special sources of classification because their specific substance is grounded in the data. Because these theoretical classes are not immediately knowable or available to observers until they spend considerable time going over the ways respondents (or messages) in a sample identify themselves and others, it is necessary to retain the special classes throughout much of the analysis. The next problem to address is how to identify various classes and categories in the data set, which leads to a discussion of open coding.

OPEN CODING

Inexperienced researchers, although they may intellectually understand the process described so far, usually become lost at about this point in the actual process of coding. Some of the major obstacles that cause anguish include concern over the so-called true or intended meaning of a sentence and a desire to know the real motivation behind a subject's clearly identifiable lie. If the researchers can get beyond such concerns, the coding can continue. For the most part, these concerns are actually irrelevant to the coding process, particularly with regard to open coding, the central purpose of which is to open inquiry widely. Although interpretations, questions, and even possible answers may seem to emerge as researchers code, it is important to hold these as tentative at best. Contradictions to such early conclusions may emerge during the coding of the very next document. The most thorough analysis of the various concepts and categories is best accomplished after all the material has been coded. The solution to the novice investigators' anguish, then, as suggested by Strauss (1987, p. 28), is to "believe everything and believe nothing" while undertaking open coding.

Strauss (1987, p. 30) suggests four basic guidelines for conducting open coding: (1) ask the data a specific and consistent set of questions, (2) analyze the data minutely, (3) frequently interrupt the coding to write a theoretical note, and (4) never assume the analytic relevance of any traditional variable such as age, sex, social class, and so forth until the data show it to be relevant. A detailed discussion of each of these guidelines follows.

1. Ask the data a specific and consistent set of questions. The most general question researchers must keep in mind is, What study are these data pertinent to? In other words, what was the original objective of the research study? This is not to suggest that the data must be molded to that study. Rather, the original purpose of a study may not be accomplished, and an alternative or unanticipated goal may be identified in the data. For example, in Pearson's (1987) evaluation of a New Jersey intensive probation supervision program, the original aim was to demonstrate cost effectiveness. Although objective indicators failed to support the cost effectiveness of the experimental program, several indirect indicators suggested that the program nonetheless was fairly successful. These other measures involved repeated reports from relatives of probationers about changes in attitudes demonstrated by the program participants.
For instance, the wife of one participant reported that her husband had begun to send child-support payments in full and on time. Parents of another program participant reported that their child had begun to show personal responsibility by doing chores around the home—something the individual had never previously undertaken. Thus, Pearson (1987) points to an unanticipated benefit of the program. This illustration demonstrates the need both to keep the original study