BACKGROUND LEVEL, RATE The concentration, often low, at which some substance, agent, or event is present or occurs at a particular time and place in the absence of a specific hazard or set of hazards under investigation. An example is the background level of the naturally occurring forms of ionizing radiation to which we are all exposed.

BACTERIA (singular: bacterium) Single-celled organisms found throughout nature, which can be beneficial or cause disease.

BAR CHART (Syn: bar diagram) A graphic technique for presenting discrete data organized in such a way that each observation can fall into one and only one category of the variable. Frequencies are listed along one axis and categories of the variable along the other axis. The frequencies of each group of observations are represented by the lengths of the corresponding bars. See also histogram.

BAR DIAGRAM See bar chart.
BARKER HYPOTHESIS See developmental origins hypothesis.
BARRIER METHOD Contraceptive method that interposes a physical barrier between sperm and ovum (e.g., condom, cervical cap, diaphragm).
BARRIER NURSING (Syn: bedside isolation) Nursing care of hospital patients that minimizes the risks of cross-infection by use of antisepsis, gowns, gloves, masks for nursing staff, and isolation of the patient, preferably alone in a single room. See also universal precautions.

BASELINE DATA A set of data collected at the beginning of a study.
BASE POPULATION See population, source.
BASE, STUDY See study base.
BASIC REPRODUCTIVE RATE (R0) A measure of the number of infections produced, on average, by an infected individual in the early stages of an epidemic, when virtually all contacts are susceptible. (Some authors use the symbol Z0 for basic reproductive rate.)

BAYESIAN STATISTICS A method of statistical inference that begins with formulation of probabilities of hypotheses (called prior probabilities) before the data under analysis are taken into account. It then uses the data and a model for the data probability (usually the same model used by other methods, such as a logistic model) to update the probabilities of the hypotheses. The resulting updated probabilities are called posterior probabilities. Central to this updating is Bayes’ theorem,35 although not all Bayesian methods require explicit use of the theorem and not all uses of the theorem are Bayesian methods. Bayesian statistics can be used alongside or in place of other methods for many purposes (e.g., evaluation of diagnostic tests, studies of disease progression, and analyses of geographic studies, clinical trials, cohort studies, and case-control studies).

BAYES’ THEOREM A theorem of probability named for Thomas Bayes (1702–1761), an English clergyman and mathematician; his Essay Towards Solving a Problem in the Doctrine of Chances (1763, published posthumously) contained this theorem. In epidemiology, it is often used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease (the prior probability of disease) and of the likelihoods of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis, where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. A simplified version of the theorem is

$$P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S \mid D)\,P(D) + P(S \mid \bar{D})\,P(\bar{D})}$$

where D = disease, S = symptom, and $\bar{D}$ = no disease. The formula emphasizes what clinical intuition often overlooks—namely, that the probability of disease given this symptom depends not only on how characteristic of the disease that symptom is but also on how frequent the disease is among the population being served.

The theorem can also be used for estimating exposure-specific rates from case-control studies if there is added information about the overall rate of disease in that population.

Some of the terms in the theorem are named. The probability of disease given the symptom is the posterior probability. It is an estimate of the probability of disease posterior to knowing whether or not the symptom was present. The overall probability of disease among the population or our guess of the probability of disease before knowing of the presence or absence of the symptom is the prior probability. The theorem is sometimes presented in terms of the odds of disease before knowing the symptom (prior odds) and after knowing the symptom (posterior odds).
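
As a minimal sketch of this calculation, the Python functions below compute the posterior probability of disease from the prior probability and the two likelihoods, and also give the odds form (posterior odds = prior odds × likelihood ratio). The function names and all numerical values are hypothetical, chosen only to show how the same symptom yields very different posterior probabilities in populations where the disease is rare or common.

```python
def posterior_probability(prior, p_s_given_d, p_s_given_not_d):
    """Return P(D|S) from the prior P(D) and the two likelihoods.

    prior            -- P(D), overall probability of disease
    p_s_given_d      -- P(S|D), probability of the symptom in diseased people
    p_s_given_not_d  -- P(S|no D), probability of the symptom in healthy people
    """
    numerator = p_s_given_d * prior
    denominator = numerator + p_s_given_not_d * (1.0 - prior)
    return numerator / denominator


def posterior_odds(prior, p_s_given_d, p_s_given_not_d):
    """Odds form of the theorem: prior odds multiplied by the likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    likelihood_ratio = p_s_given_d / p_s_given_not_d
    return prior_odds * likelihood_ratio


# Hypothetical values: the same symptom in two populations that differ
# only in how frequent the disease is.
rare = posterior_probability(prior=0.01, p_s_given_d=0.90, p_s_given_not_d=0.05)
common = posterior_probability(prior=0.20, p_s_given_d=0.90, p_s_given_not_d=0.05)
print(round(rare, 3), round(common, 3))
```

With these assumed inputs the posterior probability is far lower where the disease is rare, even though the symptom is equally characteristic of the disease in both settings.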

BEDSIDE ISOLATION See barrier nursing.
BEHAVIORAL EPIDEMIC An epidemic attributable to the power of suggestion or to culturally determined behavioral patterns (as opposed to invading microorganisms or physical agents). Examples include the dancing manias of the Middle Ages, episodes of mass fainting or convulsions (“hysterical epidemics”), crowd panic, and waves of fashion or enthusiasm. The communicable nature of the behavior is dependent not only on person-to-person transmission of the behavioral pattern but also on group reinforcement (as with smoking, alcohol, and drug use). Behavioral epidemics may be difficult to differentiate from outbreaks of organic disease (e.g., due to contamination of the environment by a toxic substance) or may complicate them.

BEHAVIORAL RISK FACTOR A characteristic or behavior that is associated with increased probability of a specified outcome; the term does not imply a causal relationship.

BEHAVIOR SETTING The place where a pattern or sequence of behavior regularly occurs; it includes the ordinary events of daily life.18 A forerunner of the concept of activity setting.

BENCHMARK A slang or jargon term, usually meaning a measurement taken at the outset of a series of measurements of the same variable, sometimes meaning the best or most desirable value of the variable.

BENEFICENCE Literally, doing good. In bioethics, a principle underlying utilitarian approaches. It implies a certain obligation to promote benefits of things judged to be good, typically balancing potential or produced goods against risks. In public health, it implies acting in the best interest of the population at stake.36,37

BENEFIT Advantage or improvement resulting from an intervention.
BENEFIT-COST RATIO See cost-benefit analysis.
BERKSONIAN BIAS (Syn: Berkson’s bias, Berkson fallacy) A form of selection bias arising when both the exposure and the disease under study affect selection. In its classical form, it causes hospital cases and controls in a case-control study to be systematically different from one another.38 This occurs when the combination of exposure and disease under study increases the probability of admission to hospital, leading to a systematically higher exposure rate among hospital cases than among hospital controls; the process hence biases the odds ratio.
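
The mechanism can be illustrated with expected counts rather than real data. In the sketch below, exposure and disease are independent in the source population (true odds ratio of 1), and admission to hospital is most likely for people who are both exposed and diseased. All names, prevalences, and admission probabilities are hypothetical assumptions made for illustration.

```python
def odds_ratio(a, b, c, d):
    """Exposure odds ratio for a 2x2 table.

    a, b -- exposed and unexposed cases
    c, d -- exposed and unexposed controls
    """
    return (a * d) / (b * c)


def expected_counts(n, p_exposed, p_disease, p_admit):
    """Expected number of people selected in each (exposed, diseased) cell.

    Exposure and disease are independent in the source population, so the
    true odds ratio is 1. p_admit maps (exposed, diseased) to a hypothetical
    probability of hospital admission.
    """
    counts = {}
    for exposed in (True, False):
        for diseased in (True, False):
            p_e = p_exposed if exposed else 1.0 - p_exposed
            p_d = p_disease if diseased else 1.0 - p_disease
            counts[(exposed, diseased)] = n * p_e * p_d * p_admit[(exposed, diseased)]
    return counts


def table_odds_ratio(counts):
    """Odds ratio comparing cases (diseased) with controls (not diseased)."""
    return odds_ratio(
        counts[(True, True)], counts[(False, True)],
        counts[(True, False)], counts[(False, False)],
    )


# Selection probabilities: everyone (whole population) versus hospital admission.
everyone = {(e, d): 1.0 for e in (True, False) for d in (True, False)}
hospital = {
    (True, True): 0.50,    # exposed and diseased: most likely to be admitted
    (False, True): 0.30,   # diseased only
    (True, False): 0.10,   # exposed only (admitted for other conditions)
    (False, False): 0.10,  # neither
}

population_or = table_odds_ratio(expected_counts(100_000, 0.3, 0.05, everyone))
hospital_or = table_odds_ratio(expected_counts(100_000, 0.3, 0.05, hospital))
print(round(population_or, 2), round(hospital_or, 2))
```

Under these assumptions the odds ratio among hospitalized subjects exceeds 1 although there is no association in the source population, which is the distortion the entry describes.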

BERNOULLI DISTRIBUTION The probability distribution associated with two mutually exclusive and exhaustive outcomes—e.g., death or survival; a Bernoulli variable is one that has only two possible values—e.g., death or survival. See also binomial distribution.

BERTILLON CLASSIFICATION The first numerically based nosology in which disease entities were arranged in chapters, developed by Jacques Bertillon (1851–1922).39 It descended from a nosology proposed in 1853 by Marc d’Espine and William Farr. Bertillon’s classification was adopted at the International Statistical Institute (conference) in Chicago in 1893 and was the progenitor of the International Classification of Diseases (ICD).

BETA ERROR See error, type ii.
BIAS Systematic deviation of results or inferences from truth. Processes leading to such deviation. An error in the conception and design of a study—or in the collection, analysis, interpretation, reporting, publication, or review of data—leading to results or conclusions that are systematically (as opposed to randomly) different from truth.5–12,14,31–34

Ways in which deviation from the truth can occur include:

  1. Systematic variation of measurements from the true values (syn: systematic measurement error).
  2. Variation of statistical summary measures (means, rates, measures of association, etc.) from their true values as a result of systematic variation of measurements, other flaws in study conduct and data collection, flaws in study design, or analysis.
  3. Deviation of inferences from truth as a result of conceptual or methodological flaws in study conception or design, data collection, or the analysis or interpretation of results.
  4. A tendency of procedures (in study design, data collection, analysis, interpretation, review, or publication) to yield results or conclusions that depart from the truth.
  5. Prejudice leading to the conscious or unconscious selection of research hypotheses or procedures that depart from the truth in a particular direction or to one-sidedness in the interpretation of results.

The term bias does not necessarily carry an imputation of prejudice or any other subjective factor, such as the experimenter’s desire for a particular outcome. This differs from conventional usage, in which bias refers to a partisan point of view—to prejudice or unfairness.

BIAS, ASCERTAINMENT See ascertainment bias.
BIAS, BERKSON’S See Berkson’s bias.
BIAS, CONFOUNDING See confounding bias.
BIAS, WORKUP See workup bias.

BIAS DUE TO DIGIT PREFERENCE See digit preference.
BIAS DUE TO INSTRUMENT ERROR Systematic error due to faulty calibration, inaccurate measuring instruments, contaminated reagents, incorrect dilution or mixing of reagents, etc. See also contamination, data.
BIAS DUE TO WITHDRAWALS A difference between the true effect and the association observed in a study due to characteristics of subjects who choose to withdraw. See also attrition; censoring; dropout.
BIAS IN ASSUMPTIONS (Syn: conceptual bias) Error arising from faulty logic or premises or mistaken beliefs on the part of the investigator. False conclusions about the explanation for associations between variables. Example: Having correctly deduced the mode of transmission of cholera, John Snow concluded that yellow fever was transmitted by similar means. In fact, the “miasma” theory would have been a better fit for the facts of yellow fever transmission. See also biological plausibility; coherence.

BIAS IN AUTOPSY SERIES Systematic errors resulting from the fact that autopsies represent a nonrandom sample of all deaths.

BIAS IN HANDLING OUTLIERS Error arising from biased discarding of unusual values or due to exclusion of unusual values that should be included.

BIAS IN THE PRESENTATION OF DATA Error due to irregularities produced by digit preference, incomplete data, poor techniques of measurement, technically poor laboratory procedures, or intentional attempts to mislead.

BIAS IN PUBLICATION See publication bias.
BIAS OF AN ESTIMATOR The difference between the expected value of an estimator of a parameter and the true value of this parameter. See also unbiased estimator.
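
In symbols, the definition can be written as follows. The notation is a conventional choice assumed here for illustration, with $\hat{\theta}$ denoting the estimator and $\theta$ the true value of the parameter.

```latex
\[
\mathrm{Bias}(\hat{\theta}) = \mathrm{E}[\hat{\theta}] - \theta
\]
```

An estimator is unbiased when this difference is zero.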
BIAS OF INTERPRETATION See interpretive bias.
BIBLIOGRAPHIC IMPACT FACTOR (BIF) In scientometrics, a useful measure of the “average” frequency with which articles in a scientific periodical are cited by articles in journals that are chosen by the Thomson Corporation to be indexed in the Science Citation Index (SCI) and related databases.40,41

Given the limited properties of the BIF, even when properly applied to journals, and the well-known fact that scientific articles may have a wide spectrum of impacts (or none),41 it is clear that the cultural impact of the “impact factor” in the academic community has less to do with scientific rationality than with the sociology of scientific knowledge (and human nature). Attributing bibliometric indicators for journals to articles or to individual authors is a form of the ecological fallacy.

The BIF has virtues and limitations. Main reasons why, even for a given journal, BIF is often not the scientometric indicator of choice include the following: BIF is extremely influenced by the number of “source items” or “citeable articles” chosen as the denominator of the BIF (i.e., by the number of articles chosen by Thomson among articles published in the journal in the previous 2 years); such articles are not disclosed; criteria used by Thomson to decide which articles are included and excluded in the denominator of the BIF are unknown, and so is the consistency of their application across journals; citations to articles excluded from the denominator of the BIF are nevertheless counted in the numerator.

If the journal is the unit of analysis, the total number of citations received by all articles published by such journal may be a better indicator. If the bibliographic “impact” of an article is of interest, the total number of citations received by the article may be the best place to start.41

BILLS OF MORTALITY Weekly and annual abstracts of christenings and burials compiled from parish registers in England, especially London, that date from 1538. Beginning in 1629, the annual bills were published and included a tabulation of deaths from plague and other causes. These were the basis for the earliest English vital statistics, compiled, analyzed, and discussed by John Graunt (1620–1674) in Natural and Political Observations … on the Bills of Mortality (London, 1662).

BIMODAL DISTRIBUTION A distribution with two regions of high frequency separated by a region of low frequency of observations. A two-peak distribution.

BINARY VARIABLE A variable having only two possible values (e.g., on or off, 0 or 1). See also measurement scale.

BINOMIAL DISTRIBUTION A probability distribution associated with two mutually exclusive outcomes (e.g., presence or absence of a clinical or laboratory sign, death or survival). The probability distribution of the number of occurrences of a binary event in a sample of n independent observations. The binomial distribution may be used to model cumulative incidence rates and prevalence. The Bernoulli distribution is a special case of the binomial distribution with n = 1.
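
A short sketch of the binomial probability function may help. It assumes the standard formula for the probability of exactly k occurrences in n independent observations, each with probability p; the function name and the example values (10 people, prevalence 0.2) are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences of a binary event in n
    independent observations, each with probability p of the event."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: number of prevalent cases among 10 people
# sampled from a population with prevalence 0.2.
probabilities = [binomial_pmf(k, 10, 0.2) for k in range(11)]
total = sum(probabilities)  # the probabilities over k = 0..n sum to 1

# With n = 1 the binomial reduces to the Bernoulli distribution.
bernoulli_event = binomial_pmf(1, 1, 0.2)
bernoulli_no_event = binomial_pmf(0, 1, 0.2)
print(total, bernoulli_event, bernoulli_no_event)
```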

BIOACCUMULATION Progressive increase in the concentration of a chemical compound in an organism, organ, or tissue when the rate of uptake exceeds the rate of excretion or metabolism. In humans, exposure to and bioaccumulation of persistent chemical agents occurs largely through the fatty components of animal foods, including recycled animal fats from slaughterhouses, which are used as components of food products and animal feed ingredients. Bioaccumulation occurs within a trophic (food chain) level. See also biomagnification.

BIOASSAY The quantitative evaluation of the potency of a substance by assessing its effects on tissues, cells, live experimental animals, or humans. Bioassay may be a direct method of estimating relative potency: groups of subjects are assigned to each of two (or more) preparations, the dose that is just sufficient to produce a specified response is measured, and the estimate is the ratio of the mean doses for the two (or more) groups. In this method, the death of the subject may be used as the “response.” The indirect method (more commonly used) requires study of the relationship between the magnitude of a dose and the magnitude of a quantitative response produced by it. See also interaction.

BIODIVERSITY (Syn: biological diversity) The variety of species of plants, animals, and microorganisms in a natural community, of communities within a particular environment, and of genetic variation within a species (genetic diversity). Biodiversity is important for the stability of ecosystems. To many individuals worldwide it is also a cultural value.


BIOLOGICAL AGE
  1. An attribute of body tissue that is relevant in pathogenesis; e.g., “age” of breast tissue, which develops after puberty, in relation to breast cancer risk.42 See also Armitage-Doll model.
  2. People age with different “speed” at equal “calendar age.” Some people are physically older than others, and this is expressed in external appearance, body characteristics, or physical and social functioning. The concept is applied in the form of the calculation of the biological age of the subject by multiple regression.

BIOLOGICAL MONITORING (Syn: biomonitoring) Performance, analysis, and interpretation of biological measurements aimed at detecting changes (often adverse) in the health status of populations, in an environmental compartment (including water, air or soils), or in other health determinants (e.g., food samples, animal feed). Monitoring of concentrations of suspected or known toxic or hazardous substances using biological means in well-defined populations (e.g., analyses of concentrations of environmental chemical agents in samples of urine, blood, or adipose tissue). Examples include the U.S. National Reports on Human Exposure to Environmental Chemicals (www.cdc.gov/exposurereport) and the German Environmental Surveys (www.umweltbunde…). See also monitoring; surveillance.

BIOLOGICAL PLAUSIBILITY The causal criterion or consideration that an observed, presumably causal association is plausible on the basis of existing biomedical knowledge.

On a schematic continuum including possible, plausible, compatible, and coherent, the term plausible is not a demanding or stringent requirement, given the many biological mechanisms that often may underlie clinical and epidemiological observations; hence, in assessing causality, it may be logically more appropriate to require coherence (biological as well as clinical and epidemiological). The criterion of biological plausibility should hence be used cautiously, since it could impede development of new knowledge that does not fit existing biological evidence or pathophysiological reasoning. Innovative, valid, and relevant clinical and epidemiological discoveries may precede the acquisition of knowledge on their biological mechanisms; i.e., biologically relevant epidemiological evidence may precede biological evidence. In evaluating associations between genetic variants and common complex diseases, we should fully expect biologically meaningful associations with small clinical or epidemiological effects.43 See also coherence; Hill’s criteria of causation.

BIOLOGICAL TRANSMISSION See vector-borne infection.
BIOMAGNIFICATION (Syn: biological magnification, bioamplification) Sequence of processes in an ecosystem by which higher concentrations (e.g., of a persistent toxic substance) are attained in organisms at higher levels in the food chain. The increase in concentration of an element or compound, such as a pesticide, that occurs in a food chain. Biomagnification occurs across trophic (food chain) levels. See also bioaccumulation.

BIOMARKER, BIOLOGICAL MARKER A cellular, biochemical, or molecular indicator of exposure; of biological, subclinical, or clinical effects; or of possible susceptibility (e.g., biomarkers of internal dose, biologically effective dose, early biological response, altered structure, altered function). It is occasionally an ambiguous term that suggests insufficient understanding of the pathophysiological or mechanistic role of the “marker.” See also molecular epidemiology.

BIOMETRY Literally, measurement of life. The application of statistical methods to the study of numerical data based on observation of biological phenomena. The term was made popular by Karl Pearson (1857–1936), who founded the journal Biometrika. The British biologist Francis Galton (1822–1911) has been described as the founder of biometry, but others—e.g., the Frenchman Pierre-Charles-Alexandre Louis (1787–1872)—preceded him.

BIOMONITORING See biological monitoring.

BIOSTATISTICS Application of statistics to biological problems. The term should not be restricted to mean the application of statistics to medical problems (medical statistics), since its real meaning is broader, subsuming agricultural statistics, forestry, and ecology, among other applications.

BIRTH CERTIFICATE Official, legal document recording details of a live birth, usually comprising name, date, place, identity of parents, and sometimes additional information such as birth weight. It provides the basis for vital statistics of birth and birthrates in a political or administrative jurisdiction and for the denominator for infant mortality and certain other vital rates.

BIRTH COHORT The location of a person in historical time as indexed by his or her year of birth. Birth cohorts are often differentially affected by social events. Numerous cohort variations in factors that have long-term effects on health (e.g., childbearing, smoking, physical activity) have been documented. Cohort effects are easiest to distinguish when disease trends have accelerated, decelerated, or changed direction; where they are steady and linear, they can hardly be distinguished reliably from period effects.16,31 See also developmental and life course epidemiology; life course.

BIRTH COHORT ANALYSIS See cohort analysis.

BIRTH INTERVAL Time interval between termination of one completed pregnancy and the termination of the next. Time interval between the birth of one offspring and the birth of the next offspring of the same mother.

BIRTH INTERVAL, CLOSED This applies to the population of women who gave birth to two or more living children: it counts only birth intervals between two completed pregnancies (i.e., the interval is closed by next pregnancy).

BIRTH INTERVAL, OPEN This applies also to the population of women who gave birth to two or more living children, but it counts only birth intervals after completed pregnancies.

BIRTH ORDER The ordinal number of a given live birth in relation to all previous live births of the same woman. Thus, 4 is the birth order of the fourth live birth occurring to the same woman. This strict demographic definition may be loosened to include all births, i.e., stillbirths as well as live births. More loosely, the ranking of siblings according to age, starting with the eldest in a family.

BIRTHRATE A summary rate based on the number of live births in a population over a given period, usually 1 year.

Demographers refer to this as the crude birthrate.
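
The crude birthrate is conventionally computed as shown below. This is the standard demographic form, supplied here as an illustration; the midyear population as denominator and the multiplier of 1000 are the usual conventions and are assumptions as far as this entry is concerned.

```latex
\[
\text{Crude birthrate} =
\frac{\text{number of live births in the area during the year}}
     {\text{midyear population of the area in that year}} \times 1000
\]
```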

BIRTH WEIGHT Infant’s weight recorded at the time of birth and, in some countries, entered on the birth certificate. Certain variants of birth weight are precisely defined. Low birth weight (LBW) is below 2500 g. Very low birth weight (VLBW) is below 1500 g. Ultra-low birth weight (ULBW) is below 1000 g. Large for gestational age (LGA) is birth weight above the 90th percentile. Average weight for gestational age (AGA) (syn: appropriate or adequate) is birth weight between the 10th and 90th percentiles. Small for gestational age (SGA) (syn: small for dates) is birth weight below the 10th percentile.
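
The thresholds above can be expressed as a small classifier. This is a sketch only: the function names and the label for weights of 2500 g or more are hypothetical, the categories are nested (an ULBW infant is also VLBW and LBW) so the function returns the most specific one, and treating exactly the 10th and 90th percentiles as AGA is an assumption.

```python
def birth_weight_category(weight_g):
    """Return the most specific low-birth-weight category for a weight in grams."""
    if weight_g < 1000:
        return "ULBW"  # ultra-low birth weight, below 1000 g
    if weight_g < 1500:
        return "VLBW"  # very low birth weight, below 1500 g
    if weight_g < 2500:
        return "LBW"   # low birth weight, below 2500 g
    return "not low birth weight"  # hypothetical label for 2500 g or more


def size_for_gestational_age(percentile):
    """Classify birth weight by its percentile for gestational age."""
    if percentile < 10:
        return "SGA"  # small for gestational age
    if percentile > 90:
        return "LGA"  # large for gestational age
    return "AGA"      # assumed to include exactly the 10th and 90th percentiles


print(birth_weight_category(1420), size_for_gestational_age(7))
```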


BLACK BOX
  1. A method of reasoning or studying a problem in which the methods and procedures are not described, explained, or perhaps even understood. Nothing is stated or inferred about the method; discussion and conclusions relate solely to the empirical relationships observed.
  2. A method of formally relating an input (e.g., quantity of a drug administered, exposure to a putative causal factor) to an output or an observed effect (e.g., amount of the drug eliminated, disease), without making detailed assumptions about the mechanisms that have contributed to the transformation of input to output within the organism (the “black box”).

“BLACK-BOX EPIDEMIOLOGY” A common epidemiological approach, used both in research and in public health practice, in which the focus is on assessing putative causes and clinical effects (beneficial or adverse) rather than the underlying biological mechanisms. It is not a formal branch or specialty of epidemiology, nor is it an epidemiological method or philosophy. Loosely speaking, it is an opposite of mechanistic epidemiology.

BLIND(ED) STUDY (Syn: masked study) A study in which observer(s) and/or subjects are kept ignorant of the group to which the subjects are assigned, as in an experiment, or of the population from which the subjects come, as in a nonexperimental study. When both observer and subjects are kept ignorant, we refer to a double-blind trial or study. If the statistical analysis is also done in ignorance of the group to which subjects belong, the study is sometimes described as triple-blind. The intent of keeping subjects and/or investigators blinded (i.e., unaware of knowledge that might introduce a bias) is to eliminate the effects of such biases. To avoid confusion about the meaning of the word blind, some authors prefer to describe such studies as masked. See also allocation concealment; performance bias.

BLOCKED RANDOMIZATION (Syn: restricted randomization) A procedure used in a randomized controlled trial that helps achieve a similar number of subjects allocated to each group, often within defined baseline categories. For example, for allocation in two groups (A and B) in blocks of four, there are six variants: (1) A A B B; (2) A B A B; (3) A B B A; (4) B B A A; (5) B A B A; (6) B A A B. To create the allocation sequence, such blocks are used at random. As a result of this procedure, the number of subjects in the two groups at any time differs by no more than half the block length. Block size is usually a multiple of the number of groups in the trial. Small blocks are used at the beginning of the trial to balance subjects in small participating clinics or centers. Large blocks control balance less well but mask the allocation sequence better. It may be seen as an analogue in a randomized controlled trial of individual matching in an observational study. See also random allocation; stratified randomization.
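
The procedure can be sketched in a few lines of Python. The function names, the seed, and the number of blocks are hypothetical choices for illustration, and the sketch assumes the block size is a multiple of the number of groups. It enumerates the balanced blocks, draws blocks at random to build an allocation sequence, and checks how far the two groups ever drift apart.

```python
import random
from itertools import permutations


def balanced_blocks(groups=("A", "B"), block_size=4):
    """All distinct orderings of a block containing equal numbers of each group."""
    per_group = block_size // len(groups)
    members = [g for g in groups for _ in range(per_group)]
    return sorted(set(permutations(members)))


def allocation_sequence(n_blocks, seed, groups=("A", "B"), block_size=4):
    """Concatenate n_blocks blocks, each drawn at random from the balanced blocks."""
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    blocks = balanced_blocks(groups, block_size)
    sequence = []
    for _ in range(n_blocks):
        sequence.extend(rng.choice(blocks))
    return sequence


def max_imbalance(sequence, first="A"):
    """Largest absolute difference in group sizes at any point (two groups only)."""
    difference, worst = 0, 0
    for allocation in sequence:
        difference += 1 if allocation == first else -1
        worst = max(worst, abs(difference))
    return worst


blocks = balanced_blocks()
sequence = allocation_sequence(n_blocks=5, seed=2024)
print(len(blocks), "".join(sequence), max_imbalance(sequence))
```

For blocks of four, the enumeration yields the six variants listed in the entry, and the running difference between groups never exceeds two, that is, half the block length.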

BLOT, WESTERN, NORTHERN, SOUTHERN Varieties of tests using electrophoresis, nucleic acid base pairing, and/or protein antibody interaction to detect and identify DNA or RNA samples. The Southern blot, named for its discoverer, E. Southern, is used to identify a specific segment of DNA in a sample. Molecular biologists named variations of the test for the points of the compass. The Northern blot detects and identifies samples of RNA. The Western blot is widely used in a test for HIV infection.

BODY BURDEN Total amount of a substance present in the body.

BODY MASS INDEX (BMI) (Syn: Quetelet’s index) Anthropometric measure, defined as weight in kilograms divided by the square of height in meters. This measure, suggested by the Belgian scientist Lambert Adolphe Jacques Quetelet (1796–1857) and then known as Quetelet’s index II, correlates closely with body density and skinfold thickness; in this respect it is superior to the ponderal index. It is a standard measure for the purpose of detecting overweight and obesity.
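
As a minimal sketch of the definition, the function below divides weight in kilograms by the square of height in meters. The function name and the example values are hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Weight in kilograms divided by the square of height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m**2


# Hypothetical adult: 70 kg and 1.75 m.
bmi = body_mass_index(70.0, 1.75)
print(round(bmi, 1))
```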

BONFERRONI CORRECTION See multiple comparison techniques.

BOOKMARKING In genetics and epigenetics, a biological phenomenon believed to function as an epigenetic mechanism for transmitting cellular memory of the pattern of gene expression in a cell, throughout mitosis, to its daughter cells. It is vital for maintaining the phenotype in a lineage of cells. See also epigenetic inheritance.

BOOTSTRAP (Syn: resampling) A technique for estimating the variance and the bias of an estimator by repeatedly drawing random samples with replacement from the observations at hand. One applies the estimator to each sample drawn, thus obtaining a set of estimates. The observed variance of this set is the bootstrap estimate of variance. The difference between the average of the set of estimates and the original estimate is the bootstrap estimate of bias. Many more sophisticated uses of the repeated samples have been developed.12
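
The steps just described can be sketched as follows. The data, the seed, the number of resamples, and the function name are hypothetical; the estimator used in the example is the variance with divisor n (`statistics.pvariance`), chosen because it is known to be biased downward in small samples, and the spread of the bootstrap estimates is summarized with the sample variance, which is one reasonable reading of "observed variance".

```python
import random
from statistics import mean, pvariance, variance


def bootstrap(data, estimator, n_resamples=2000, seed=0):
    """Bootstrap estimates of the variance and bias of an estimator."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observations at hand.
        resample = [rng.choice(data) for _ in range(n)]
        estimates.append(estimator(resample))
    original = estimator(data)
    return {
        "estimate": original,
        "variance": variance(estimates),       # spread of the set of estimates
        "bias": mean(estimates) - original,    # average estimate minus original
    }


# Hypothetical observations.
observations = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]
result = bootstrap(observations, pvariance)
print(result)
```

The bootstrap estimate of bias here is negative, consistent with the tendency of the divisor-n variance to underestimate the population variance.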

BOX-AND-WHISKERS PLOT (Syn: box plot, cat-and-whiskers plot) A graphical method of presenting the distribution of a variable measured on a numerical scale. The midpoint (or sometimes, the median) of the distribution is often represented by a horizontal line; the values above and below this line divided into quartiles by horizontal lines (the “hinges” of the box) are the two quartiles nearer the midpoint; values beyond the hinges are represented by lines (the “whiskers”) extending to the extreme value in each direction. Both the box-and-whiskers plot and the stem-and-leaf display were developed by the statistician John Tukey.44

“BRAIN SPARING” A human baby receiving an inadequate supply of nutrients or oxygen may protect its brain. One way in which it does this is by diverting more blood to the brain at the expense of the blood supply to the trunk. The growth of organs such as the liver is therefore “traded off” to protect growth of the brain. Brain sparing may also be effected through metabolic processes such as insulin resistance.45 See also developmental origins hypothesis; plasticity; thrifty phenotype.

BREAKPOINT In helminth epidemiology, the critical mean wormload in a community below which the helminth mating frequency is too low to maintain reproduction. A value exceeding the breakpoint of a wormload means that the wormload will increase until equilibrium is reached; a value less than or equal to the breakpoint means that the wormload will decrease progressively.

BUBBLE PLOT Display of three sets of variables, two of which form a scatter diagram while the third is represented by circles of varying diameter.

BURDEN OF DISEASE The impact of disease in a population. An approach to the analysis of health problems, including loss of healthy years of life. It is an important concept for public health and for other professions interested in the societal impact of ill-health, including injuries and disabilities. It may be expressed as lost Healthy Life Years (HeaLYs), Disability-Adjusted Life Years (DALYs), or Quality-Adjusted Life Years (QALYs). Use of indicators that integrate the societal burden caused by both death and morbidity allows for the comparison of the burden due to various risk factors or diseases. Sophisticated methodologies used in global burden of disease studies enable the combined measurement of mortality and non-fatal health outcomes, and provide comparable and comprehensive measures of population health across countries. They are also relevant to investigate the costs, efficacy, effectiveness, and other impacts of major health interventions applied in diverse settings.30,46,47 The World Health Organization offers practical guidance for the estimation of disease burden at national or local levels for selected environmental and occupational risk factors.48 These methodologies are somewhat controversial, however, because few diseases can be completely eradicated with most known interventions; it is therefore argued that disease burden provides a limited basis for policy, and that planning should be based on other considerations (e.g., the likely impact of interventions instead) and values. See also attributable fraction.