Worked Example

Smoking and Lung Cancer (Doll & Hill, 1950)

Reviewed by the crosstabs.com methods team · Last updated

In this table, patient group is significantly associated with smoking status — a small association: χ²(1, N = 1298) = 22.04, p < .001, V = .13.

The data

Patient group \ Smoking status    Smoker    Non-smoker    Total
Lung cancer cases                    647             2      649
Controls                             622            27      649
Total                              1,269            29    1,298

Background

In 1950, Richard Doll and Austin Bradford Hill published one of the founding studies of modern epidemiology. Puzzled by the steep rise in lung-cancer deaths in Britain, they interviewed patients in twenty London hospitals: people admitted with lung cancer (the cases) and patients of the same age and sex admitted for other reasons (the controls).

This table shows their male patients: 649 lung-cancer cases and 649 controls, classified as smokers or non-smokers. Only 2 of the 649 cases were non-smokers, against 27 of the controls. The question the crosstab answers is whether smoking status differs between cases and controls — the signature of an association between smoking and lung cancer.

The study design matters as much as the numbers. Because Doll and Hill sampled by disease status rather than by exposure, the row totals are fixed by design and disease prevalence cannot be estimated from the table — which is why the odds ratio, not the relative risk, is the correct effect measure here.
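The point about sampling can be checked numerically. The sketch below is illustrative only: the function names and the factor `k` (a hypothetical study that recruited k times as many controls with the same smoking mix) are assumptions, not part of the original study. It shows that the odds ratio does not depend on how many controls were sampled, while a risk ratio computed naively from the table does.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Cross-product ratio for the 2x2 table [[a, b], [c, d]]."""
    return Fraction(a * d, b * c)


def naive_risk_ratio(a, b, c, d):
    """Share of cases among smokers divided by share of cases among non-smokers.

    Only meaningful when the table is a sample of the whole population,
    which a case-control table is not.
    """
    return Fraction(a, a + c) / Fraction(b, b + d)


cases = (647, 2)      # smokers, non-smokers among lung-cancer cases
controls = (622, 27)  # smokers, non-smokers among controls

for k in (1, 10):  # hypothetical: k times as many controls, same smoking mix
    c, d = controls[0] * k, controls[1] * k
    print(
        f"controls x{k}: OR = {float(odds_ratio(*cases, c, d)):.2f}, "
        f"naive RR = {float(naive_risk_ratio(*cases, c, d)):.2f}"
    )
```

The odds ratio stays at 14.04 in both runs, whereas the naive risk ratio moves from about 7.4 to about 12.8 simply because more controls were recruited. A quantity that changes with the sampling ratio cannot be an estimate of disease risk.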

Results

Chi-square test

χ² = 22.04

df = 1, p < .001
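For readers who want to reproduce the statistic by hand, here is a minimal sketch of the Pearson chi-square computation, assuming no continuity correction (which is consistent with the reported 22.04). The function name is illustrative. The p-value line uses the closed form for the chi-square upper tail that holds only when df = 1.

```python
import math

# Observed counts: rows = cases, controls; columns = smoker, non-smoker.
observed = [[647, 2], [622, 27]]


def pearson_chi_square(table):
    """Return (chi2, df, expected) for an r x c table, no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = pearson_chi_square(observed)
# For df = 1 only, the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.2e}")
print("expected counts:", expected)
```

The expected counts come out as 634.5 smokers and 14.5 non-smokers in each row, and the statistic rounds to 22.04.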

Effect size

Cramér's V = 0.130

a small association
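Cramér's V rescales the chi-square statistic by the sample size and the smaller table dimension. A short sketch follows; the function name and argument order are illustrative choices, not taken from the calculator.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (N * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Values reported above: chi2 = 22.04, N = 1298, 2 x 2 table.
v = cramers_v(22.04, 1298, 2, 2)
print(f"V = {v:.3f}")
```

For a 2×2 table the divisor is simply N, so V here is the square root of 22.04 / 1298.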

Fisher's exact test

p < .001

two-sided, exact for this 2×2 table
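The exact test can be sketched with the standard library alone. This version assumes the common two-sided definition, which sums the hypergeometric probabilities of all tables (with the same margins) that are no more likely than the observed one. The source does not say which two-sided rule the calculator uses, so treat that choice and the function name as assumptions.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return Fraction(comb(col1, x) * comb(n - col1, row1 - x), denom)

    lo = max(0, row1 + col1 - n)  # smallest feasible top-left count
    hi = min(row1, col1)          # largest feasible top-left count
    p_obs = prob(a)
    # Exact fractions avoid floating-point ties when comparing probabilities.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs)
    return float(total)


p = fisher_exact_two_sided(647, 2, 622, 27)
print(f"Fisher exact two-sided p = {p:.2e}")
```

Using `Fraction` keeps the comparison `p <= p_obs` exact, which matters here because the row totals are equal and the distribution is symmetric, so the mirror-image table has exactly the same probability as the observed one.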

Odds ratio

OR = 14.04

95% CI [3.33, 59.30]
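The odds ratio is the cross-product of the table, and the interval above is consistent with a Wald interval computed on the log scale. The sketch below assumes that method and z = 1.96. The source does not name the interval type, so this is a plausible reconstruction and not a description of the calculator's internals.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI for the 2x2 table [[a, b], [c, d]].

    The standard error of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


or_hat, ci_low, ci_high = odds_ratio_wald_ci(647, 2, 622, 27)
print(f"OR = {or_hat:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

The interval is wide because the standard error is dominated by the 1/2 term from the cell with only 2 non-smoking cases.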

APA-style report: χ²(1, N = 1298) = 22.04, p < .001, V = .13.

Interpretation

The chi-square test rejects independence at the conventional 0.05 level (p < .001): a pattern this strong is unlikely if patient group and smoking status were unrelated. Cramér's V of 0.130 falls in the small range, so on this scale the association is real but modest: knowing one variable tells you only a little about the other.

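Because this is a 2×2 table, Fisher's exact test (p < .001) provides an exact significance check, and the odds ratio of 14.04 (95% CI [3.33, 59.30]) summarizes the strength of the relationship in odds terms: the odds of being a smoker were about 14 times higher among cases than among controls. The two effect measures answer different questions. V is small largely because almost everyone in both groups smoked (1,269 of 1,298), which limits how much group membership can tell you about smoking status even when the odds ratio is large.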

Caveats

  • This is a retrospective case-control study: participants were sampled by disease status, so the table cannot estimate disease risk in smokers versus non-smokers. Report the odds ratio, not the relative risk. (When the disease is rare, the odds ratio approximates the relative risk.)
  • One cell holds only 2 observations. Every expected count is 14.5 or more, so the usual rule of thumb for the chi-square approximation is met, but with a cell this sparse Fisher's exact test is the safer significance test for this table.
