MR design

This study was conducted according to the STROBE-MR guidelines [11,12]. The MR design rested on three assumptions: (1) relevance: the genetic variants are significantly associated with the metabolism-related lifestyle and clinical risk factors; (2) independence: the genetic variants are independent of confounding factors for DSCs; and (3) exclusion restriction: the genetic variants affect DSCs only through the metabolism-related lifestyle and clinical risk factors [13] (Fig. 1). Because these risk factors are numerous and varied, we selected 19 predominant metabolism-related lifestyle and clinical risk factors and divided them into four groups for clarity: lifestyle factors, physical factors, serum parameters, and metabolic comorbidities. The associations between these risk factors and the development of DSCs (including EC, GC, CRC, HCC, BTC, and PC) were explored preliminarily.

Figure 1. Flowchart of the data collection, processing, and analysis procedures of this study.

Selection of genetic variants

In the MR analysis, we used instrumental variables (IVs) to investigate the potential associations between metabolism-related lifestyle and clinical risk factors and the development of DSCs.
The risk factors/traits were classified into four categories: (1) lifestyle factors, namely, ever/never drinking alcohol, sweet taste, and coffee consumption; (2) physical factors, namely, body mass index (BMI), waist circumference (adjusted by BMI), waist-to-hip ratio (adjusted by BMI), and educational level; (3) serum parameters, namely, high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), total cholesterol (TC), triglyceride (TG), glycine, uric acid, and creatinine levels; and (4) metabolic comorbidities, namely, type 2 diabetes (T2D), T2D (adjusted by BMI), MetS, gout, and Graves’ disease.

To obtain GWAS summary data for these traits, we collected information on the IVs from various sources on populations of East Asian ancestry (Supplementary Table 1). The IVs for BMI, uric acid, creatinine, gout, and Graves’ disease were retrieved from the Biobank Japan (BBJ) release of disease traits through the Integrative Epidemiology Unit (IEU) Open GWAS project [14], and the IVs for HDL-C, LDL-C, TC, TG, and T2D [15] were retrieved from the Asian Genetic Epidemiology Network (https://blog.nus.edu.sg/agen/). Additionally, the IVs for ever/never drinking alcohol and coffee consumption were extracted from the study of Matoba et al. [16], the IVs for sweet taste from the study of Kawafune et al. [17], and the IVs for waist circumference and waist-to-hip ratio from the study of Wen et al. [18]. The IVs for educational level were extracted from the pan-ancestry genetic analysis of UK Biobank performed at the Broad Institute (https://gwas.mrcieu.ac.uk/datasets/ukb-e-845_EAS/); this dataset (ukb-e-845_EAS) was collected from participants of East Asian ancestry.
The IV for glycine was extracted from the study of Chang et al. [19], the IVs for MetS were extracted from the study of Zhu et al. [20], and the IVs for gout were extracted from the study of Nakayama et al. [21].

To ensure the quality of the instrumental single nucleotide polymorphisms (SNPs) for the risk factors, we applied a series of quality control measures. First, we identified SNPs that were significantly associated with the risk factors at the conventional genome-wide threshold ($P < 5 \times 10^{-8}$). Second, we removed SNPs with a minor allele frequency (MAF) of < 0.01. Third, we retained only SNPs that were physically distant from one another (window size = 10,000 kb) and in low linkage disequilibrium ($r^2 < 0.001$), as estimated from East Asian samples in the 1000 Genomes Project. Fourth, the proportion of variance explained by each SNP (PVE) was calculated according to the following formula: $\mathrm{PVE} = 2 \times \mathrm{MAF} \times (1 - \mathrm{MAF}) \times \beta^2$ [22]. The strength of each SNP was measured by calculating the F-statistic using the following formula: $F = \mathrm{PVE} \times (N - 2)/(1 - \mathrm{PVE})$ [22], where $N$ is the sample size. When MAFs were not available in the original studies, an alternative formula was used: $F = \beta^2/\mathrm{se}^2$ [23]. An F-statistic > 10 was considered to indicate a strong instrument [13].

Finally, we separately analyzed the 19 eligible risk factors. An overview of these genetic instruments is provided in Supplementary Table 1. We also excluded any candidate genetic instruments that failed the MR pleiotropy residual sum and outlier (MR-PRESSO) test (P < 0.05) [24].
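
The instrument-strength formulas above can be illustrated with a short Python sketch. This is not the authors' code; the function names and the example values (MAF, beta, sample size) are hypothetical, and beta is assumed to be the per-allele effect estimate on the scale reported by the exposure GWAS.

```python
def pve(maf: float, beta: float) -> float:
    """Proportion of variance explained by one SNP: 2 * MAF * (1 - MAF) * beta^2."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def f_from_pve(pve_value: float, n: int) -> float:
    """F-statistic from PVE and sample size N: PVE * (N - 2) / (1 - PVE)."""
    return pve_value * (n - 2) / (1.0 - pve_value)


def f_from_beta_se(beta: float, se: float) -> float:
    """Fallback F-statistic when the MAF is unavailable: beta^2 / se^2."""
    return (beta / se) ** 2


def is_strong_instrument(f_stat: float, threshold: float = 10.0) -> bool:
    """Apply the F > 10 rule stated in the text."""
    return f_stat > threshold


if __name__ == "__main__":
    # Hypothetical SNP: MAF = 0.30, beta = 0.05, exposure GWAS of N = 100,000.
    snp_pve = pve(maf=0.30, beta=0.05)
    snp_f = f_from_pve(snp_pve, n=100_000)
    print(f"PVE = {snp_pve:.5f}, F = {snp_f:.1f}, strong = {is_strong_instrument(snp_f)}")
```

Under these assumed inputs the SNP clears the F > 10 rule comfortably; the same check can be run per SNP before an instrument set is finalized.
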
Ethical approval was obtained in the original studies.

GWAS summary statistics of DSCs

To ensure comparability in ancestry, GWAS summary statistics of the associations between genetic variants and the development of EC (1300 cases and 195,745 controls), GC (6563 cases and 195,745 controls), CRC (7062 cases and 195,745 controls), HCC (1866 cases and 195,745 controls), BTC (339 cases and 195,745 controls), and PC (442 cases and 195,745 controls) were retrieved from BBJ, a biobank containing the data of approximately 200,000 participants recruited mainly from 12 medical institutions in Japan during 2003–2008 [14,25]. On the IEU Open GWAS platform, the GWAS IDs for EC, GC, CRC, HCC, BTC, and PC were bbj-a-117, bbj-a-119, bbj-a-107, bbj-a-158, bbj-a-92, and bbj-a-140, respectively. When extracting the exposure-associated SNPs from the outcome datasets, we excluded any SNPs that lacked the required information in the outcome data [26].

Participant overlap in MR analysis

As participant overlap in MR analysis can lead to inflated type I errors [27], we attempted to select IVs from sources other than BBJ to avoid introducing bias due to sample overlap. However, for certain risk factors, such as BMI, uric acid, creatinine, and Graves’ disease, genetic variants could be identified only from BBJ. The potential bias introduced by participant overlap was calculated using the following formula: $\mathrm{bias} = \beta r / F$, where $\beta$ is the MR estimate, $r$ is the sample overlap rate between the exposure and the outcome, and $F$ is the mean F-statistic averaged across IVs [27].

Statistical analysis

We followed a flowchart to perform the MR analysis step by step. First, we harmonized the GWAS data of the risk factors and DSCs, using the selected IVs as the matching index. Second, we used the MR-PRESSO approach to detect pleiotropic outliers among the selected IVs and removed them before the MR analysis.
Third, we performed MR-Egger regression to test for horizontal pleiotropy, where P > 0.05 indicated no evidence of horizontal pleiotropy [28]. In addition, we checked whether the IVs included in the analysis were associated with major risk factors for cancer, such as smoking and a family history of malignant neoplasms, using the online tool PhenoScanner V2 [29]. If any IVs were found to be associated with either smoking or a family history of malignant neoplasms, we removed them and repeated the MR analysis.

After excluding pleiotropy, we used Cochran’s Q test to detect any heterogeneity between SNPs, and several MR methods were used to check that the effect directions were consistent: MR-Egger regression, weighted median, inverse variance weighted (IVW; random-effects model), and weighted mode (WM). Scatter plots were constructed to visualize the results. For sweet taste, waist-to-hip ratio, glycine, and MetS, only the Wald ratio (WR) method was used because two or fewer SNPs were available. We calculated the odds ratios (ORs) and the corresponding 95% confidence intervals (CIs) of DSCs per one-standard deviation (SD) increment of a quantitative exposure or per unit change on the log odds scale of a binary exposure [24]. Finally, the leave-one-out sensitivity test was used to evaluate the robustness of the IVW estimates and to detect potentially influential SNPs. The statistical power of MR was estimated using mRnd [30].

We applied a Bonferroni-corrected significance level of $P < 2.63 \times 10^{-3}$ (0.05/19), which indicated a strong association; a P value between $2.63 \times 10^{-3}$ and 0.05 indicated a suggestive association. All analyses were conducted using the TwoSampleMR package in R (version 4.0.3).

Ethics approval and consent to participate

Our analyses were based on publicly available data that have been approved by the relevant review boards; no additional ethical approval or consent to participate was required.
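
As a supplementary illustration of the quantities defined under "Participant overlap in MR analysis" and "Statistical analysis", the following Python sketch computes the sample-overlap bias as the text states it (bias = βr/F), converts a log-odds MR estimate into an OR with a 95% CI, and labels a P value using the Bonferroni rule for 19 tests. It is not the authors' code (the study used TwoSampleMR in R); all function names and example values are hypothetical, and the CI assumes a normal approximation with z = 1.96.

```python
import math


def overlap_bias(beta_mr: float, overlap_rate: float, mean_f: float) -> float:
    """Sample-overlap bias as stated in the text: beta * r / F."""
    return beta_mr * overlap_rate / mean_f


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Convert a log-odds estimate and its SE to (OR, lower, upper) of a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def classify_p(p_value: float, n_tests: int = 19, alpha: float = 0.05) -> str:
    """Label a P value using the Bonferroni threshold alpha / n_tests."""
    threshold = alpha / n_tests  # 0.05 / 19, about 2.63e-3
    if p_value < threshold:
        return "strong"
    if p_value < alpha:
        return "suggestive"
    return "not significant"


if __name__ == "__main__":
    # Hypothetical MR estimate with full overlap (r = 1) and mean F = 50.
    print(overlap_bias(beta_mr=0.20, overlap_rate=1.0, mean_f=50.0))
    print(odds_ratio_ci(beta=0.20, se=0.05))
    print(classify_p(0.001), classify_p(0.01), classify_p(0.20))
```

Because the bias shrinks as the mean F-statistic grows, strong instruments limit the impact of overlap even when the exposure and outcome GWAS share all participants.
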


