A protocol is not a bureaucratic requirement. It is the document that saves a project when the unexpected happens in the field.
Ravi Menon
May 08, 2026•4 min read
The supervisor gets a call on day three of fieldwork. One of the enumerators wants to know whether to interview a household head who was listed in the sampling frame but has been away from the home for two weeks and might not return before data collection ends. Do they wait? Interview a family member? Replace the household? Move on?
If the project has a data collection protocol, the answer is in it. If it does not, the supervisor makes a judgment call. That call might be right. Or it might introduce a systematic deviation from the sampling design that only becomes visible three months later when the data is being analyzed.
A data collection protocol exists to prevent that scenario. Here is what it should contain.
A data collection protocol is a written document that describes every operational aspect of how data will be collected on a project: who collects it, from whom, using what methods and tools, under what conditions data is acceptable, how quality is monitored, and what procedures apply when standard scenarios do not fit.
It is different from the survey instrument (which is the questionnaire or interview guide). It is different from the sampling framework (which describes how respondents are selected). It sits alongside both documents as the operational rulebook for fieldwork.

A brief paragraph describing what the study is measuring, why it matters, and who commissioned it. This is not for the supervisor. It is for the junior enumerator, who needs enough context about why the work is being done to handle unexpected situations with sound judgment.
Who qualifies for this study and who does not. Written with enough specificity to handle edge cases. For a household welfare survey targeting women aged 18 to 49: what happens if the eligible woman is away during the interview window? What if she is present but ill? What if the household has multiple eligible women? The protocol should answer all of these.
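The eligibility rule in the example above can be written down unambiguously as a small check. This is only a sketch under stated assumptions: the `Member` record, its field names, and the `"F"`/`"M"` sex coding are hypothetical, and only the 18 to 49 age range comes from the example. It deliberately returns every eligible woman in the household, because which one to interview is a decision the protocol itself has to make.

```python
from dataclasses import dataclass
from typing import List

# Age range taken from the example: women aged 18 to 49, inclusive.
MIN_AGE = 18
MAX_AGE = 49


@dataclass
class Member:
    """One household roster entry (hypothetical structure)."""
    name: str
    sex: str  # "F" or "M"; this coding is an assumption
    age: int


def eligible_members(household: List[Member]) -> List[Member]:
    """Return all roster members who meet the eligibility criteria.

    Choosing among several eligible women is left to the protocol.
    """
    return [
        m for m in household
        if m.sex == "F" and MIN_AGE <= m.age <= MAX_AGE
    ]
```

Writing the boundaries as inclusive comparisons forces the protocol to say explicitly whether a woman who is exactly 18 or exactly 49 qualifies, which is the kind of edge case enumerators otherwise resolve inconsistently.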
How enumerators find and approach respondents. In rural household surveys, this means specifying which household in a cluster is approached and by which method (random route, listing-based, GPS polygon). At what time of day should contact attempts be made? How many attempts before a household is replaced? These are not trivial decisions. Inconsistent contact procedures are one of the most common sources of undetected sampling bias.
The exact process for obtaining consent before an interview begins. What the enumerator says, what the respondent is shown or given, how consent is recorded (signature, verbal agreement captured on audio, digital checkbox), and what happens if consent is refused. This section is particularly important for health and social research but applies to all studies involving human participants.
What form or instrument is used, what platform it runs on (paper, KoboToolbox, SurveyCTO, ODK), and what the enumerator does if the device fails during an interview. Where data is submitted and how often. What happens to partially completed interviews.
How data quality is monitored. Who reviews incoming data and at what intervals. What quality indicators trigger a flag (interview duration below minimum, GPS location outside the study area, logical inconsistency between responses). How errors are corrected. What the back-check procedure is and how many interviews per enumerator will be checked.
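The three flag types named above can be sketched as an automated check on each submitted interview. Everything specific here is an assumption for illustration: the 20-minute minimum, the rectangular study area, the `Interview` fields, and the particular consistency rule (years married cannot exceed age) are hypothetical placeholders that a real protocol would replace with its own values.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds; a real protocol would define its own.
MIN_DURATION_MIN = 20.0
LAT_MIN, LAT_MAX = 0.0, 1.0  # assumed rectangular study area
LON_MIN, LON_MAX = 0.0, 1.0


@dataclass
class Interview:
    """A submitted interview record (hypothetical fields)."""
    interview_id: str
    duration_min: float
    lat: float
    lon: float
    age: int
    years_married: int


def quality_flags(iv: Interview) -> List[str]:
    """Return the list of quality flags raised by one interview."""
    flags = []
    # Interview duration below the protocol minimum.
    if iv.duration_min < MIN_DURATION_MIN:
        flags.append("short_duration")
    # GPS location outside the study area.
    in_area = LAT_MIN <= iv.lat <= LAT_MAX and LON_MIN <= iv.lon <= LON_MAX
    if not in_area:
        flags.append("gps_outside_area")
    # Logical inconsistency between two responses.
    if iv.years_married > iv.age:
        flags.append("logical_inconsistency")
    return flags
```

A flag here only marks an interview for review. What the reviewer does next, and how the correction is recorded, still has to be spelled out in the protocol.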
A protocol that does not cover what to do when things go wrong is a protocol for the ideal scenario, which almost never exists in fieldwork.
Under what circumstances a respondent can be replaced, and by whom. This section prevents enumerators from making ad hoc substitution decisions that introduce systematic bias. Rules should be specific. For example: if the primary respondent is not available after three contact attempts, go to the designated replacement household in the sampling frame. Do not substitute a nearby convenience household.
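A rule of this kind is specific enough to express as a decision function, which is a useful test of whether it is specific enough for the field. The sketch below assumes the three-attempt rule from the example. The function name, the return strings, and the choice to escalate when no replacement household has been designated are hypothetical additions, not part of the rule as stated.

```python
from typing import Optional

# From the example rule: replace only after three failed contact attempts.
MAX_ATTEMPTS = 3


def next_action(failed_attempts: int, replacement_id: Optional[str]) -> str:
    """Decide what the enumerator does after a failed contact attempt.

    replacement_id is the household pre-designated in the sampling
    frame, or None if the frame lists no replacement.
    """
    if failed_attempts < MAX_ATTEMPTS:
        return "revisit"
    if replacement_id is not None:
        # Only the designated replacement is allowed, never a
        # convenient nearby household.
        return f"replace_with:{replacement_id}"
    # Assumption: with no designated replacement, the enumerator
    # does not improvise and hands the decision upward.
    return "escalate_to_supervisor"
```

For example, `next_action(2, "HH-102")` returns `"revisit"`, while `next_action(3, "HH-102")` returns `"replace_with:HH-102"`.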
What situations the enumerator cannot resolve independently and must escalate to their supervisor, and what the supervisor in turn cannot resolve and must escalate to the research team. Having this written down prevents important decisions from being made at the wrong level of authority in the field.
Most pilot tests focus on the questionnaire. Fewer test the protocol. During a pilot, run through several of the substitution and edge-case scenarios described in the protocol with your field team and observe whether the procedures are clear enough to apply consistently. The ambiguities that surface in a protocol pilot are far cheaper to fix than the biases they would otherwise introduce into the main dataset.