Heterogeneity of Colorectal Cancer Risk Factors by Anatomical Subsite in 10 European Countries: A Multinational Cohort Study

Colorectal cancer located at different anatomical subsites may have distinct etiologies and risk factors. Previous studies that have examined this hypothesis have yielded inconsistent results, possibly because most studies have been of insufficient size to identify heterogeneous associations with precision. In the European Prospective Investigation into Cancer and Nutrition study, we used multivariable joint Cox proportional hazards models, which accounted for tumors at different anatomical sites (proximal colon, distal colon, and rectum) as competing risks, to examine the relationships of 14 established/suspected lifestyle, anthropometric, and reproductive/menstrual risk factors with colorectal cancer risk. Heterogeneity across sites was tested using Wald tests. In men, heterogeneity by site was found for body mass index (P heterogeneity = .008) and waist circumference (P heterogeneity = .03), with weaker positive associations found for rectal cancer, compared with proximal and distal colon cancer. Current smoking was associated with a greater risk of rectal and proximal colon cancer, but not distal colon cancer (P heterogeneity = .05). No heterogeneity by anatomical site was found for alcohol consumption, diabetes, nonsteroidal anti-inflammatory drug use, and reproductive/menstrual factors. CONCLUSIONS: The relationships between physical activity, anthropometry, and smoking with colorectal cancer risk differed by subsite, supporting the hypothesis that tumors in different anatomical regions may have distinct etiologies.

Colorectal cancer (CRC) is one of the most frequently occurring malignancies worldwide. In 2018, an estimated 1.8 million colorectal cancer diagnoses and 881,000 deaths occurred. 1 Colorectal tumors at different anatomic sites have variable clinical characteristics. 2 In the proximal colon, tumors typically present at a later stage with a poorer prognosis than those in the distal colon and rectum. 3,4 Women are more likely to develop cancers in the proximal colon, whereas in men cancers are more common in the distal colon region. 5 In addition, with advancing age, a greater proportion of colorectal tumors are located in the proximal colon, with a reduced proportion of rectal tumors. 6 Molecular heterogeneity also has been found for CRC tumors across anatomic sites. CpG island methylator phenotype-high, microsatellite instability-high, and PIK3CA and BRAF mutations are found most commonly in the proximal colon region, with a linear decrease in frequency across the distal colon and rectum regions. 7 KRAS mutations have been found to be most common in the cecum region of the proximal colon, compared with other bowel regions. 7 TP53 mutations are more frequent in tumors in the distal colon and rectum, compared with the proximal colon. 8,9 CRC tumors at different anatomic locations also may have differential etiologies and risk factors. 6,8,10,11 Previous studies that have examined this hypothesis have yielded inconsistent results, possibly because most have been of insufficient size to identify heterogeneous associations with precision. We therefore performed a comprehensive investigation of how 14 established or suspected lifestyle, anthropometric, and reproductive and menstrual risk factors are associated with tumors located at the 3 main anatomic sites (proximal colon, distal colon, and rectum) in the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort, with more than 520,000 participants. The large number of incident CRC cases (>6200) affords high statistical power to compare risk factor associations across tumor anatomic sites.

Study Population
EPIC is a multicenter prospective cohort of 521,448 participants, most aged 35 years or older, who were recruited between 1992 and 2000, predominantly from the general population of 10 European countries (Denmark, France, Germany, Greece, Italy, The Netherlands, Norway, Spain, Sweden, and the United Kingdom). 12 Written informed consent was provided by all study participants, and ethical approval for EPIC was provided by the International Agency for Research on Cancer and local participating centers. Participants with cancer diagnoses before recruitment (n = 29,456); those in the highest and lowest 1% of the distribution for the ratio of energy intake to estimated energy requirement (n = 9573); and those with missing information on alcohol consumption and follow-up evaluation (n = 6259) were excluded from analyses. Additional exposure-specific exclusions were applied when there was missing information for the risk factor of interest.

Follow-Up Evaluation for Cancer Incidence and Vital Status
Cancer incidence was determined through record linkage with regional cancer registries or via a combination of methods, including the use of health insurance records, contacts with cancer and pathology registries, and active follow-up evaluation. CRC cases were defined using the 10th Revision of the International Classification of Diseases and the 2nd Revision of the International Classification of Diseases for Oncology. Proximal colon cancer included those tumors within the cecum, appendix, ascending colon, hepatic flexure, transverse colon, and splenic flexure (C18.0-18.5). Distal colon cancer included those within the descending (C18.6) and sigmoid (C18.7) colon. Cancer of the rectum included cancer occurring at the rectosigmoid junction (C19) and rectum (C20).
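The subsite definitions above amount to a simple lookup on ICD-10 topography codes. The following Python sketch illustrates that grouping; the function name `classify_subsite` and the "unclassified" fallback (for codes such as C18.8 or C18.9, whose handling is not described in the text) are assumptions made for illustration, not part of the study's documented procedures.

```python
def classify_subsite(icd10_code: str) -> str:
    """Map an ICD-10 colorectal topography code to the subsite grouping in the text.

    Hypothetical helper: the study does not publish code for this step.
    """
    code = icd10_code.strip().upper()
    proximal = {"C18.0", "C18.1", "C18.2", "C18.3", "C18.4", "C18.5"}
    distal = {"C18.6", "C18.7"}
    if code in proximal:
        return "proximal colon"
    if code in distal:
        return "distal colon"
    # C19 is the rectosigmoid junction and C20 the rectum.
    if code.startswith("C19") or code.startswith("C20"):
        return "rectum"
    # Assumption: anything else (e.g. overlapping or unspecified colon sites)
    # is left out of the subsite analyses.
    return "unclassified"


for example in ("C18.2", "C18.7", "C20"):
    print(example, "->", classify_subsite(example))
```

Keeping the groupings in explicit sets makes it easy to see which codes fall into each subsite and to change the grouping for a sensitivity analysis.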

What You Need to Know

Background
Previous research has indicated that colorectal tumors located at different anatomic sites have distinct clinical and molecular characteristics. It also has been hypothesized that colorectal cancer at different anatomic locations may have differential etiologies and risk factors. Previous epidemiologic studies may have been underpowered to detect heterogeneous relationships by anatomic site.

Findings
This was a large study that was performed to comprehensively investigate the relationships between risk factors and colorectal cancer by anatomic site in both men and women, with more than 520,000 participants from 10 European countries included, and more than 6200 incident colorectal cancer cases. We found heterogeneous relationships across tumors located in the proximal colon, distal colon, and rectum for physical activity levels, anthropometric measurements, and smoking.

Implications for patient care
These results highlight the importance of separating the colorectum into distinct entities with separate etiologies. Variability in the carcinogenic processes at different sites of the large bowel may explain the complex risk factor-colorectal cancer relationships.

Statistical Analysis
Hazard ratios (HRs) and the corresponding 95% CIs for the 14 risk factors and CRC were estimated using Cox proportional hazards models. Age was used as the time-scale in all models. Time at entry was age at recruitment. Exit time was age at whichever of the following came first: CRC diagnosis, death, or the last date at which follow-up evaluation was considered complete in each center. For the analyses by anatomic site, HRs and 95% CIs were estimated using a multivariable joint Cox proportional hazards model, which accounted for tumors located at different anatomic sites as competing risks. 13 Heterogeneity across sites was tested using Wald tests. Full details on the statistical methods are shown in the Supplementary Methods section and are detailed by Xue et al. 13 Separate models were run for body size measurements and CRC for men and women because of a priori knowledge that the relationship differs by sex. 14 To determine whether the lifestyle risk factors and CRC relationships differed by sex, we included an interaction term for sex (multiplicative scale) in the model. The statistical significance of the cross-product term was evaluated using the likelihood ratio test. Because no heterogeneity was found by sex for smoking status (P interaction = .36), physical activity (P interaction = .71), alcohol consumption (P interaction = .45), diabetes (P interaction = .83), or NSAID use (P interaction = .34), men and women were analyzed together. Multivariable models were, where appropriate, mutually adjusted. We also conducted sensitivity analyses separating tumors located in the cecum (C18.0) into an additional anatomic site and examining heterogeneity in the relationships to each risk factor across 4 anatomic sites (cecum colon vs proximal colon vs distal colon vs rectum). Statistical tests used in the analysis all were 2-sided and a P value less than .05 was considered statistically significant.
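As a rough illustration of what a test of heterogeneity across sites measures, the sketch below recovers log HRs and standard errors from published 95% CIs and computes a Cochran Q-type statistic, which coincides with a Wald test of equality when the site-specific estimates are treated as independent. That independence is a simplifying assumption: the study's joint model uses a robust covariance estimate, so this is not the authors' procedure. The function names are hypothetical, and the code handles exactly three sites (2 degrees of freedom), for which the chi-square tail probability has a closed form.

```python
import math


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Recover log(HR) and its standard error from an HR and its 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * z)


def heterogeneity_three_sites(estimates):
    """Approximate test that three site-specific log HRs are equal.

    Assumes independent estimates (a simplification of the joint model
    described in the text). Returns (statistic, p_value).
    """
    if len(estimates) != 3:
        raise ValueError("this sketch handles exactly three sites (2 df)")
    betas, ses = zip(*(log_hr_and_se(*e) for e in estimates))
    weights = [1.0 / se ** 2 for se in ses]
    # Inverse-variance weighted mean of the log HRs.
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    stat = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    # Chi-square survival function with 2 df is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# (HR, lower CI, upper CI) per 5 kg/m2 of BMI in men, taken from the Results:
# proximal colon, distal colon, rectum.
bmi_men = [(1.31, 1.18, 1.47), (1.32, 1.20, 1.45), (1.10, 1.01, 1.20)]
stat, p = heterogeneity_three_sites(bmi_men)
print(f"statistic = {stat:.2f}, P heterogeneity = {p:.3f}")
```

With the BMI estimates for men reported in the Results, this approximation gives a P value well below .05. It should not be expected to match the published value exactly, because the published test is based on the joint model rather than on rounded, independently treated estimates.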

Results
During a median follow-up period of 14.9 years, 6291 CRC cases occurred (2718 in men and 3573 in women). Of these, 1877 were located in the proximal colon, 1743 in the distal colon, and 2094 in the rectum. Table 1 shows the characteristics of participants included in the analysis.
Alcohol consumption, prevalent diabetes, and smoking were associated with a greater risk of CRC, and ever NSAID use and physical activity were associated with a lower risk (Figure 1). For physical activity, compared with being inactive, the physically active group had a lower risk of developing CRC (HR, 0.90; 95% CI, 0.82-0.98; P trend = .01). This inverse association was most evident for proximal colon cancers (HR, 0.74; 95% CI, 0.63-0.87; P trend = .0004), although the estimates were not statistically significant for distal colon or rectal cancers (P heterogeneity for proximal-distal-rectal = .03). Smoking was associated with the development of CRC (current smokers vs never smokers: HR, 1.19; 95% CI, 1.11-1.28; P trend < .0001). By anatomic site, current smoking was associated with a greater risk of rectal and proximal colon cancer, but not distal colon cancer (P heterogeneity = .05).
For men and women, higher BMI, height, waist circumference, and waist-to-hip ratio all were associated with a greater risk of CRC (Figure 2). For men, the positive relationship for BMI was weaker for rectal cancer (per 5 kg/m²: HR, 1.10; 95% CI, 1.01-1.20), compared with proximal colon cancer (per 5 kg/m²: HR, 1.31; 95% CI, 1.18-1.47) and distal colon cancer (per 5 kg/m²: HR, 1.32; 95% CI, 1.20-1.45) (P heterogeneity = .008), but no heterogeneity was found between tumors in the proximal and distal colon (P heterogeneity = .94). In addition, in men, the positive waist circumference association was weaker for tumors located in the rectum (per 5 cm: HR, 1.06; 95% CI, 1.03-1.09), than for tumors in the proximal colon (per 5 cm: HR, 1.11; 95% CI, 1.07-1.16) and distal colon (per 5 cm: HR, 1.12; 95% CI, 1.08-1.16) (P heterogeneity = .03), but no heterogeneity was found across the colon (proximal vs distal P heterogeneity = .78). The positive association between the waist-to-hip ratio and CRC for men and women was consistent across all anatomic sites (all P heterogeneity > .60).
For men and women, height was not associated with rectal cancer (per 10 cm in men: HR, 0.97; 95% CI, 0.88-1.06; per 10 cm in women: HR, 0.92; 95% CI, 0.83-1.03), but was related positively to both proximal colon and distal colon cancers (P heterogeneity = .0001 for men and P heterogeneity < .0001 for women). The association of height with colon cancer did not differ between the proximal and distal colon in men (P heterogeneity = .24), but there was some suggestion of heterogeneity for women (P heterogeneity = .05), with a stronger positive association observed for proximal colon cancer (per 10 cm: HR, 1.30; 95% CI, 1.17-1.43) than for distal colon cancer (per 10 cm: HR, 1.11; 95% CI, 0.99-1.25). For women, no heterogeneity by subsite was observed for the other anthropometric measurements, with similar strength associations found for BMI, waist circumference, and waist-to-hip ratio across tumors at the 3 anatomic sites (all P heterogeneities > .05).

Figure 1. Multivariable-adjusted HRs and 95% CIs for colorectal cancer incidence for both sexes combined in relation to lifestyle factors, by anatomic site. For alcohol consumption, physical activity, and smoking status: multivariable models: Cox regression using age as the underlying time variable and stratified by sex, center, and age at recruitment. Models mutually adjusted, and additionally adjusted for body mass index, height, education level, ever use of menopausal hormone therapy, and intakes of alcohol, red and processed meats, calcium, and fiber. For ever NSAID use and prevalent diabetes: multivariable models: Cox regression using age as the underlying time variable and stratified by sex, center, and age at recruitment adjusted for body mass index, height, physical activity; smoking status and intensity; education level; ever use of menopausal hormone therapy; and intakes of alcohol, red and processed meats, calcium, and fiber. Information on NSAID use was available from only 6 centers: Cambridge, Utrecht, Heidelberg, Potsdam, Aarhus, and Copenhagen. Prox-dist-rect, proximal, distal, rectal.

Ever MHT use vs never use was associated with a lower risk of CRC (HR, 0.90; 95% CI, 0.83-0.97), with no evidence of heterogeneity across subsites (P heterogeneity > .16) (Figure 3). The duration of MHT use was associated inversely with CRC risk (P trend = .01), with no heterogeneity found by anatomic site (P heterogeneity > .05). Age at menarche and ever OC use were not associated with CRC, and no heterogeneity was observed across anatomic sites (P heterogeneity > .05). Older age (55 y) vs younger age at menopause (50 y) was associated with increased CRC risk (HR, 1.20; 95% CI, 1.03-1.38), with similar relationships observed by anatomic site (P heterogeneity > .40).
When tumors located in the cecum were considered as an additional subsite end point, a similar pattern of heterogeneous relationships was observed across the 4 subsites (cecum colon, proximal colon, distal colon, and rectum) (Supplementary Tables 2-4).

Discussion
In this multicountry prospective study, we found heterogeneous relationships by tumor site for physical activity, smoking, and anthropometric measurements. Low levels of physical activity and greater height and BMI were associated primarily with an increased risk of distal or proximal colon cancer, with weaker or null relationships found for rectal cancer. Current smoking was associated with an increased risk of proximal colon and rectal cancer, whereas no heterogeneity by anatomic site was found for alcohol consumption, prevalent diabetes, NSAID use, and, in women, reproductive and menstrual factors.
For overall CRC, we observed the expected pattern of risk factor associations. Greater adiposity and height were associated with increased CRC risk, as were higher alcohol consumption, smoking, prevalent diabetes, and later age at menopause. Conversely, being physically active and use of NSAIDs and MHT were associated with a lower risk of developing CRC. Our analysis benefited from the large number of incident CRC cases that accrued during the longer follow-up period, which allowed well-powered analyses for the 14 risk factors by tumor anatomic site. Recently, a similar analysis of CRC risk factors by anatomic site was performed in a large UK cohort, with no heterogeneity found for the considered risk factors by tumor anatomic site 15 ; however, that study included only women, so it is uncertain whether the findings are generalizable to men. 15 Previous studies that have investigated heterogeneity in the association between major risk factors and colorectal anatomic subsites in men and women had smaller numbers of cases compared with our analysis, and may have been constrained by insufficient statistical power to identify weak-to-moderate strength heterogeneous associations. 16,17 In the current study, which included men and women, we observed heterogeneous relationships between several risk factors and tumors across different anatomic sites.

Figure 2. Multivariable-adjusted HRs and 95% CIs for colorectal cancer incidence for both sexes combined in relation to anthropometric measures, by anatomic site. Multivariable models only: Cox regression using age as the underlying time variable and stratified by center and age at recruitment, and adjusted for physical activity, smoking status and intensity, education level, ever use of menopausal hormone therapy, and intakes of alcohol, red and processed meats, calcium, and fiber. Multivariable model for height was adjusted further for body mass index. Multivariable models for body mass index, waist circumference, and waist-to-hip ratio were adjusted further for height. Prox-dist-rect, proximal, distal, rectal.

We found that greater physical activity was related similarly to lower risks of developing tumors in the proximal and distal colon regions, findings consistent with other large prospective studies, 15,17 and a meta-analysis of 21 studies. 18 Physical activity, however, was not related to rectal cancer risk, a result inconsistent with a recent participant-level pooled analysis that reported an inverse relationship between physical activity and rectal cancer incidence, 19 but in accordance with a joint Nurses' Health Study and Health Professionals Follow-up Study analysis. 10 The biological mechanisms through which physical activity potentially decreases colon cancer risk, but not rectal cancer risk, are uncertain. Being physically active is associated with less weight gain and body fatness, 20 and therefore has a beneficial effect on CRC risk. 21 However, in our study, we found that greater BMI and waist circumference were risk factors for colon and, albeit more weakly, for rectal cancer. Greater physical activity also has been associated with lower insulin levels and beneficial effects on inflammatory pathways and dyslipidemia, including decreasing levels of circulating triglycerides. 22-24 Previous meta-analyses have suggested that C-peptide (a marker of insulin secretion), C-reactive protein (a nonspecific marker of systemic inflammation), and triglycerides are associated positively with colon, but not rectal, cancer. 25-28 This suggests that any beneficial effects of physical exercise on insulin (or correlated metabolic markers), inflammatory, and lipid pathways would be more likely to influence tumors in the colon, and not in the rectum, potentially explaining the null result we observed for physical activity with rectal cancer.
Our finding that higher BMI was related more strongly to greater CRC risk among men than among women is in accordance with a large body of epidemiologic evidence. 21,29,30 We observed heterogeneous relationships for anthropometric measurements by anatomic site, particularly for men. For BMI, the positive relationship found among men was weaker for rectal cancer compared with tumors in the colon. A meta-analysis of prospective studies also observed that, for men, a greater BMI was associated more weakly with rectal cancer (relative risk per 5-kg/m² unit increase in BMI, 1.12; 95% CI, 1.09-1.16) than with colon cancer (relative risk per 5-kg/m² unit increase in BMI, 1.30; 95% CI, 1.25-1.35). 21 A moderately weaker positive relationship was found for waist circumference and rectal cancer in men compared with colonic subsites; however, for waist-to-hip ratio, no heterogeneity by anatomic site was observed. For men and women, height was associated with colon cancer, but not with rectal cancer. This null result for rectal cancer is inconsistent with other large prospective cohort studies and a meta-analysis that found a positive association for height and rectal cancer. 31,32 In addition, positive relationships of similar magnitude were found for both colon and rectal cancer in a Mendelian randomization analysis. 33

Figure 3. Multivariable-adjusted HRs and 95% CIs for colorectal cancer incidence in relation to reproductive and menstrual factors among women, by anatomic site. Multivariable models only: Cox regression using age as the underlying time variable and stratified by center and age at recruitment, and adjusted for body mass index, height, physical activity, smoking status and intensity, education level, ever use of menopausal hormone therapy, and intakes of alcohol, red and processed meats, calcium, and fiber. Prox-dist-rect, proximal, distal, rectal.

Current smoking was related to an increased risk of proximal colon and rectal cancers, but not distal colon cancer. A similar pattern of results for smoking history was found in the Nurses' Health Study, with 40 pack-years of smoking (vs none) being associated positively only with proximal colon (HR, 1.31; 95% CI, 1.16-1.48) and rectal cancer (HR, 1.27; 95% CI, 1.05-1.53), but not distal colon cancer (HR, 1.04; 95% CI, 0.88-1.23). 17 Microsatellite instability-high, BRAF mutation-positive, and CpG island methylator phenotype-positive tumors are more common in the proximal colon region compared with the distal colon, 7 and have been associated positively with cigarette smoking. 11 However, these molecular characteristics are even less common for malignant tumors in the rectum, the subsite for which we observed the strongest positive relationship with smoking. In addition, a positive relationship was observed for former smokers and distal colon cancer, which is inconsistent with these molecular characteristics explaining these findings.
The current investigation was a large study that comprehensively investigated the relationships between risk factors and CRC by anatomic site in both men and women. Limitations of our analysis were that all of the considered risk factors were measured only once, at baseline, and that, because multiple known or suspected CRC risk factors were investigated simultaneously, some of our results could have been chance findings. Finally, our study would have been enhanced by information on tumor molecular features.
In conclusion, heterogeneous relationships across tumors located in the proximal colon, distal colon, and rectum were observed for physical activity, anthropometric measurements, and smoking. These results, taken together with the varying biological and molecular features of tumors located across the colorectum, indicate that tumors in different anatomic regions may have distinct etiologies.

Supplementary Methods

With participants not wearing shoes, weight was measured to the nearest 0.1 kg and height was measured, depending on the study center, to the nearest 0.1, 0.5, or 1.0 cm. BMI was calculated as weight in kilograms divided by height in meters squared. Waist circumference was measured either at the narrowest torso circumference or at the midpoint between the lower ribs and iliac crest. Hip circumference was measured at the widest circumference (France; Italy; Spain; Bilthoven, The Netherlands; Greece; Malmö, Sweden) or over the buttocks (the United Kingdom; Utrecht, The Netherlands; Germany; Denmark). The waist-to-hip ratio was calculated by dividing the waist circumference by the hip circumference. Standardized lifestyle and personal history questionnaires were collected at recruitment, 1,2 before disease onset or diagnosis. Information on cigarette smoking habits included baseline smoking status (never, former, or current smoker). Overall physical activity (the sum of occupational physical activity and leisure time physical activity) was assessed from 3 questions referring to the past year, and an index was derived by allocating individuals to 4 categories of overall activity (inactive, moderately inactive, moderately active, and active). 3 Information was collected on education, diabetes prevalence, oral contraceptive use, MHT use, age at menarche, age at menopause, and, in 6 centers (Cambridge, UK; Utrecht, The Netherlands; Heidelberg and Potsdam, Germany; Aarhus and Copenhagen, Denmark), NSAID use (including aspirin). Diet over the previous 12 months was assessed at recruitment using validated country-/center-specific dietary questionnaires. 1,2 Alcohol consumption at recruitment was calculated from the number of standard glasses of beer, wine, cider, sweet liquor, distilled spirits, or fortified wines consumed per day/week reported during the 12 months before recruitment.
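The two derived anthropometric measures described above are simple ratios. A minimal Python sketch follows; the function names and the example participant are hypothetical and are included only to make the unit handling (centimeters converted to meters for BMI) explicit.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI: weight in kilograms divided by height in meters squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """Waist circumference divided by hip circumference (same units)."""
    return waist_cm / hip_cm


# Hypothetical participant, not taken from the study data.
bmi = body_mass_index(weight_kg=80.0, height_cm=175.0)
whr = waist_to_hip_ratio(waist_cm=90.0, hip_cm=100.0)
print(f"BMI = {bmi:.1f} kg/m2, waist-to-hip ratio = {whr:.2f}")
```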

Follow-Up Evaluation for Cancer Incidence and Vital Status
Cancer incidence was determined through record linkage with regional cancer registries (Denmark, Italy, The Netherlands, Norway, Spain, Sweden, and the United Kingdom) or via a combination of methods, including the use of health insurance records, contacts with cancer and pathology registries, and active follow-up evaluation through participants and their next of kin (France, Germany, and Greece). Colorectal cancer cases were defined using the 10th Revision of the International Classification of Diseases and the 2nd Revision of the International Classification of Diseases for Oncology. Proximal colon cancer included those within the cecum, appendix, ascending colon, hepatic flexure, transverse colon, and splenic flexure (C18.0-18.5). Distal colon cancer included those within the descending (C18.6) and sigmoid (C18.7) colon. Cancer of the rectum included cancer occurring at the rectosigmoid junction (C19) and rectum (C20).

Statistical Analysis
HRs and the corresponding 95% CIs for the 14 risk factors and CRC were estimated using Cox proportional hazards models. Age was used as the time-scale in all models. Time at entry was age at recruitment. Exit time was age at whichever of the following came first: colorectal cancer diagnosis, death, or the last date at which follow-up evaluation was considered complete in each center. Possible nonproportionality was assessed using an analysis of Schoenfeld 4 residuals, with no evidence of nonproportionality being detected. For the analyses by anatomic site, HRs and 95% CIs were estimated using a multivariable joint Cox proportional hazards model, which accounted for tumors located at different anatomic sites as competing risks. 5 The heterogeneity in baseline risk of colorectal cancer subsites was addressed by stratified Cox models, in which each subsite was allowed to have its own baseline hazard function; the heterogeneity in association with risk factors across subsites was assessed by including an interaction term between each risk factor and the indicators of colorectal cancer subsites and testing the statistical significance of the interaction terms. Because a robust variance was used to address the competing risk between colorectal cancer subsites, a log-likelihood ratio test was no longer valid. We therefore used a global Wald test based on the robust variance estimates obtained from a sandwich type of estimator.

NOTE. Multivariable models only: Cox regression used age as the underlying time variable and was stratified by center and age at recruitment, and adjusted for physical activity index, smoking status and intensity, education level attained, ever use of menopausal hormone therapy, and intakes of alcohol, red and processed meats, dietary calcium, and fiber. Multivariable model for height was adjusted further for body mass index. Multivariable models for body mass index, waist circumference, and waist-to-hip ratio were adjusted further for height.
BMI, body mass index; HRs, hazard ratios.