| Accountability | The responsibility to provide evidence to stakeholders and funding agencies that a service is effective and meets its stated goals and its legal, fiscal and other requirements. |
| Accuracy | The extent to which an evaluation is truthful or valid. |
| Activities | The actual events or actions that take place as a part of the service, programme or project. |
| Attribution | The estimation of the extent to which observed results were caused by the service, meaning that A has produced actual effects on B. |
| Breadth | The scope of the measurement's coverage. |
| Case study | A data collection method that involves in-depth studies of specific cases or projects. The method itself is made up of one or more data collection methods (such as interviews and document review). |
| Causal inference | The logical process used to draw conclusions from evidence concerning what has been produced or 'caused'. To say that something produced or caused a certain result means that, if that something had not been there (or if it had been there in a different form or degree), then the observed result (or level of result) would not have occurred. |
| Comparison group | A group not exposed to a service or project. Also referred to as a control group. |
| Conclusion validity | The ability to generalise the conclusions about a service to other places, times, or situations. Both internal and external validity issues must be addressed if such conclusions are to be reached. |
| Confidence level | A statement that the true value of a parameter for a population lies within a specified range of values with a certain level of probability. |
| Control group | In quasi-experimental designs, a group of subjects who receive all influences except the service in exactly the same fashion as the participant group. Also referred to as a non-service group. |
| Cost-benefit analysis | An analysis that combines the benefits of a service with the costs of the service. The benefits and costs are transformed into monetary terms. |
| Cost-effectiveness analysis | An analysis that combines service costs and effects (impacts). However, the impacts do not have to be transformed into monetary benefits or costs. |
| Data collection method | The way that information about an initiative and its outcomes is gathered. |
| Depth | A measurement's degree of accuracy and level of detail. |
| Evaluation design | The model or conceptual framework used to arrive at conclusions about outcomes. |
| Evaluation plan | A written document describing the overall approach or design that will be used to guide an evaluation. It includes what will be done, how it will be done, who will do it, when it will be done, why the evaluation is being conducted and how the findings will likely be used. |
| Evaluation strategy | The method used to gather evidence about one or more outcomes. An evaluation strategy is made up of an evaluation design, a data collection method, and an analysis technique. |
| Executive summary | A summary statement designed to provide a brief overview of the full-length report on which it is based. |
| Expert opinion | A data collection method that involves using the perceptions and knowledge of experts in specific areas as indicators of service outcome. |
| External validity | The ability to generalise conclusions to future or different conditions. |
| File review | A data collection method involving a review of service files or documentation. There are usually two types of service files: general service files and files on individual projects, clients, or participants. |
| Focus group | A group of people selected for their relevance to an evaluation that is engaged by a trained facilitator in a series of discussions designed for sharing insights, ideas, and observations on a topic of concern. |
| Goal | A statement of the overall mission or purpose(s) of the service. |
| Indicator | A specific, observable, and measurable characteristic or change that shows progress toward achieving a specified outcome. |
| Inputs | Resources that go into a service so that its activities can be implemented successfully. |
| Internal validity | The ability to assert that something has caused measured results (to a certain degree), in the face of plausible potential alternative explanations. |
| Interview guide | A list of issues or questions to be raised in the course of an interview. |
| Interviewer bias | The influence of the interviewer on the interviewee. This may result from several factors, including the physical and psychological characteristics of the interviewer, which may affect the interviewees and cause differential responses among them. |
| Kaupapa Māori | Approaches based in traditional Māori values and ways of being, relating and doing. |
| Literature search | A data collection method that involves an identification and examination of written material such as research reports, published papers and books. |
| Logic model | A systematic and visual way to present the perceived relationships among the resources you have to operate the service, the activities you plan to do, and the changes or results you hope to achieve. |
| Longitudinal data | Data collected over a period of time, sometimes involving a stream of data for particular persons or entities over time. |
| Natural observation | A data collection method that involves on-site visits to locations where a service is operating. It directly assesses the setting of a service, its activities and individuals who participate in the activities. |
| Objective data | Observations that do not involve personal feelings and are based on observable facts. Objective data can be measured quantitatively or qualitatively. |
| Objectivity | Evidence and conclusions that can be verified by someone other than the original authors. |
| Order bias | A skewing of results caused by the order in which questions are placed in a survey. |
| Outcome evaluation | The systematic collection of information to assess the impact of a service, present conclusions about the merit or worth of a service, and make recommendations about future service direction or improvement. |
| Outcomes | The results of service operations or activities; the effects triggered by the service, such as increased knowledge, changed attitudes or beliefs. |
| Outputs | The direct products of service activities; immediate measures of what the service did. |
| Population | The set of units to which the results of a survey apply. |
| Primary data | Data collected by an evaluation team specifically for the evaluation study. |
| Process evaluation | The systematic collection of information to document and assess how a service was implemented and operates. |
| Programme evaluation | The systematic collection of information about the activities, characteristics, and outcomes of programmes to make judgments about the service, improve service effectiveness, and/or inform decisions about future development. |
| Qualitative data | Observations that are categorical rather than numerical, and often involve knowledge, attitudes and perceptions. |
| Quantitative data | Observations that are numerical or based on objective fact. |
| Reliability | The extent to which a measurement, when repeatedly applied to a given situation, consistently produces the same results if the situation does not change between the applications. Reliability can refer to the stability of the measurement over time or to the consistency of the measurement from place to place. |
| Resources | Assets available and anticipated for operations. They include people, equipment, facilities, and other things used to plan, implement, and evaluate programmes. |
| Sample size | The number of people to be sampled. |
| Secondary data | Data collected and recorded by another (usually earlier) person or organisation, usually for different purposes than the current evaluation. |
| Stakeholders | People or organisations that are invested in the service or that are interested in the results of the evaluation or what will be done with results of the evaluation. |
| Statistical analysis | The manipulation of numerical or categorical data to predict phenomena, draw conclusions about relationships among variables or generalise results. |
| Statistically significant effects | Effects that are observed and are unlikely to result solely from chance variation. These can be assessed through the use of statistical tests. |
| Stratified sampling | A probability sampling technique that divides a population into layers called strata, and selects appropriate samples independently in each of those layers. |
| Subjective data | Observations that involve personal feelings, attitudes, and perceptions. Subjective data can be measured quantitatively or qualitatively. |
| Surveys | A data collection method that involves a planned effort to collect data from a sample (or a complete census) of the relevant population. The relevant population consists of people or entities affected by the service (or of similar people or entities). |
| Utility | The extent to which an evaluation produces and disseminates reports that inform relevant audiences and have a beneficial impact on their work. |
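The sampling terms above (population, sample size, stratified sampling) can be illustrated with a minimal sketch. The strata names and population figures below are hypothetical, chosen only to show proportional allocation across strata:

```python
import random

# Hypothetical population divided into two strata (e.g. regions).
# The names and sizes are illustrative only.
population = {
    "north": list(range(0, 600)),     # 600 units
    "south": list(range(600, 1000)),  # 400 units
}

def stratified_sample(strata, total_n, seed=42):
    """Select a random sample independently within each stratum,
    allocating the total sample size in proportion to stratum size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n = round(total_n * len(units) / total)  # proportional allocation
        sample[name] = rng.sample(units, n)
    return sample

sample = stratified_sample(population, total_n=100)
# With a sample size of 100: 60 units drawn from "north", 40 from "south".
```

Sampling independently within each stratum guarantees that every layer of the population is represented in proportion to its size, which a simple random sample of 100 units would only achieve on average.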