The first step in the Delphi method involves defining the research question. In this example, the question is how patients with ERPRA should be identified and classified. What patient demographics (eg, age at diagnosis) should be considered, and what clinical characteristics (eg, C-reactive protein, sedimentation rate, duration of RA, and number of prior treatment failures) should be weighed when classifying a patient as having ERPRA? These are the questions that would form the basis of the survey sent to experts in the field.
During survey development, the research team must decide whether to pose broad, open-ended questions that encourage brainstorming and a wide range of responses or narrower questions that warrant specific, focused responses. The scope of the questions should be informed by the existing literature and current clinical practice; eg, if the clinical characteristics relevant to ERPRA assessment are already known, the Delphi questions could focus on establishing cutoffs for each characteristic. The first round of questions should be broad in order to elicit a wide range of responses. It is then the researchers' task to identify common themes, discard irrelevant information, and tailor the next round of questions to narrow the responses.
The survey would be sent to experts in the field, which, in this example, comprises practicing rheumatologists. Participants remain anonymous throughout the rounds of surveys to allow for honest responses that are not influenced by any one participant who might carry significant weight in the field. If anonymity is maintained, a participating expert is more willing to revise their responses or opinions in subsequent rounds after digesting the summary of prior rounds. It is in this way that responses converge toward consensus over successive rounds.
Experts can be identified in a number of ways. Professional relationships in a given field may be sufficient to identify a dozen experts. If a larger panel is needed, membership in professional societies may provide access to a greater number of potential expert panelists. The hundreds of clinicians who attend Cardinal Health Specialty Solutions Summits represent yet another potential pool of experts in the fields of urology, rheumatology, and oncology.
The number of experts needed in a Delphi exercise depends on several factors:
- Heterogeneity of the field(s) where results will be implemented (eg, a field composed of several subspecialties, resulting in guidelines that would apply to multiple specialties).
- Internal and external validity.
- The size of the field itself.
If the diagnostic criteria being developed by the Delphi method will be used across multiple specialties, a greater number of expert panelists may be needed to account for the anticipated variability in responses. Related are the considerations of internal and external validity: if the expert panel includes half of all experts in the field, its recommendations are likely to be more generalizable and, thus, accepted by experts who were not involved in the Delphi exercise. Sample sizes in Delphi exercises typically range from a dozen to over a hundred experts.

Once the survey has been fielded and responses are obtained, common themes are identified, and responses are grouped thematically. Irrelevant data are also discarded: a response from one expert that does not appear related to a response from any other expert should be considered for exclusion in subsequent rounds in order to guide the panel toward consensus. In addition to the research team, which should include at least one clinician, review by a key opinion leader may facilitate this phase of the Delphi process. Quantitative analyses of the responses provide summaries of the proportion of experts responding in a particular way.
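The thematic tally described above can be sketched in a few lines of Python. The expert identifiers, theme labels, and responses below are hypothetical and serve only to illustrate how the proportion of experts endorsing each theme would be summarized; a real Delphi analysis would work from the coded survey responses.

```python
from collections import Counter

# Hypothetical round-1 responses, already grouped into themes by the
# research team (all names and groupings here are illustrative).
responses = {
    "expert_01": ["CRP", "swollen joint count", "age at diagnosis"],
    "expert_02": ["CRP", "swollen joint count"],
    "expert_03": ["CRP", "prior treatment failures"],
    "expert_04": ["CRP", "swollen joint count", "sedimentation rate"],
}

n_experts = len(responses)
counts = Counter(theme for themes in responses.values() for theme in themes)

# Proportion of the panel endorsing each theme, sorted most- to least-endorsed
proportions = {theme: count / n_experts for theme, count in counts.items()}
for theme, p in sorted(proportions.items(), key=lambda kv: -kv[1]):
    print(f"{theme}: {p:.0%}")
```

Summaries of this form feed directly into the next round: highly endorsed themes are carried forward for refinement, while isolated responses become candidates for exclusion.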
The next round of the Delphi exercise should be developed with the goal of narrowing the focus of the questions based on results from the previous round. For example, if the first round identified three clinical characteristics to assess in the classification of ERPRA, two of which reached consensus, the next round should focus on soliciting feedback on the item that did not reach consensus (eg, asking whether it should be considered for assessment of ERPRA) as well as feedback on relevant cutoffs for the items that reached consensus (eg, positive/negative, present/absent, and values of 3.2 or greater). Where possible, items posed initially as open-ended questions should be transformed into dichotomous or polytomous variables in subsequent rounds of surveys.
The rounds of surveys continue until consensus is reached on all questions being posed. Several cutoffs have been used to identify when a question has reached consensus. Responses reported by > 80% of the experts can be considered to have reached consensus.3 Alternative cutoffs may use 66.7% of the panel to determine consensus.5 Using the former threshold, similar responses reported by > 10% but < 80% of respondents should be included in subsequent rounds of Delphi surveys. When values are solicited, the variation in values reported by the experts should be considered; for example, an interquartile range of ≤ 2 units may be defined as having reached consensus. Once all responses associated with the research question attain consensus, the Delphi process can cease, and a summary of quantitative analyses can form the basis of expert recommendations on the given topic.
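The stopping rules above (the > 80% agreement threshold, the > 10% carry-forward rule, and an interquartile range of ≤ 2 units for numeric items) can be sketched as follows. The panel sizes, endorsement counts, and cutoff values are taken from the text where stated and are otherwise illustrative.

```python
import statistics

CONSENSUS_PROP = 0.80  # > 80% agreement rule cited in the text
CARRY_FORWARD = 0.10   # responses > 10% but < 80% go to the next round
IQR_CONSENSUS = 2.0    # IQR of <= 2 units for numeric items

def categorical_status(endorsing: int, panel_size: int) -> str:
    """Classify a categorical item by the share of experts endorsing it."""
    p = endorsing / panel_size
    if p > CONSENSUS_PROP:
        return "consensus"
    if p > CARRY_FORWARD:
        return "carry forward"
    return "discard"

def numeric_consensus(values: list[float]) -> bool:
    """A numeric item reaches consensus when its interquartile range is small."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) <= IQR_CONSENSUS

# Hypothetical items: 17 of 20 experts endorse one item (85%, consensus);
# five experts propose similar cutoff values for another (tight IQR).
print(categorical_status(17, 20))
print(numeric_consensus([3.0, 3.2, 3.5, 4.0, 3.1]))
```

In practice, each surviving item would be run through checks like these after every round, and the survey would only close once no item remains in the carry-forward state.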
Although the recommendations for a particular clinical practice may have achieved expert consensus in the Delphi process, validation of the guidelines further strengthens the evidence supporting the recommendations and aids in their adoption. There are many aspects of validity, including construct validity, discriminative validity, sensitivity to change, and feasibility. Patient data containing the items of the consensus guidelines (eg, age at diagnosis, CRP levels, and swollen joint count) can be used to assess the validity of the expert guidelines. Analyses may include principal component analysis, estimates of association between the guidelines and other clinical measures, and an assessment of the feasibility of implementing the consensus guidelines. Finally, a validated patient outcome (or patient characteristic in the case of diagnostic criteria) can be used in observational research.