Journal Pre-proof Development and Validation of a Score for Fibrotic Non-Alcoholic Steatohepatitis

Abstract
Background and aims: Non-invasive assessment of histological features of non-alcoholic fatty liver disease (NAFLD) has been an intensive research area over the last decade. Herein, we aimed to develop a simple non-invasive score using routine laboratory tests to identify, among individuals at high risk for NAFLD, those with fibrotic non-alcoholic steatohepatitis (NASH), defined as NASH, NAFLD activity score (NAS) ≥4, and fibrosis stage ≥2.

Methods:
The derivation cohort included 264 morbidly obese individuals undergoing intraoperative liver biopsy in Rome, Italy. The best predictive model was developed and internally validated using a bootstrapping stepwise logistic regression analysis (2000 bootstrap samples). Performance was estimated by the area under the receiver operating characteristic curve (AUROC). External validation was assessed in three independent European cohorts (Finland, n=370; Italy, n=947; England, n=5,368) of individuals at high risk for NAFLD.

Results:
The final predictive model, designated as the Fibrotic NASH Index (FNI), combined aspartate aminotransferase (AST), HDL cholesterol, and hemoglobin A1c (HbA1c).

Conclusion:
FNI is an accurate, simple, and affordable non-invasive score that can be used in primary healthcare to screen individuals with dysmetabolism for fibrotic NASH.

Introduction
Following the global burden of obesity and type 2 diabetes, non-alcoholic fatty liver disease (NAFLD) is now the major cause of chronic liver disease worldwide.1 NAFLD encompasses a broad spectrum of conditions, from isolated hepatic fat accumulation to hepatocellular damage and inflammation (non-alcoholic steatohepatitis, NASH), leading to fibrosis and end-stage liver disease, namely cirrhosis and hepatocellular carcinoma.2,3 Obesity and type 2 diabetes are the strongest environmental factors increasing the risk of NAFLD.4 However, despite the very large number of individuals with NAFLD, only a minority progress to cirrhosis and hepatocellular carcinoma.1 A body of evidence shows that individuals with fibrotic NASH, the inflammatory form of NAFLD associated with significant activity and fibrosis, are at risk of developing advanced liver disease.5 The gold standard for diagnosing NASH and liver fibrosis is still a histological assessment by liver biopsy, an invasive and costly procedure which is not devoid of complications.6,7
The identification of individuals with fibrotic NASH in primary healthcare is crucial because these individuals will benefit the most from referral to a liver clinic for further investigation and follow-up. Moreover, these individuals are the ideal candidates for inclusion in NASH clinical trials.8,9 Therefore, due to the large number of individuals with NAFLD and the invasiveness of liver biopsy, non-invasive screening scores for fibrotic NASH are urgently needed. Indeed, existing scores are mainly focused on the assessment of liver fibrosis, the most relevant prognostic factor in NAFLD.10,11 To date, three non-invasive scores have been specifically generated to assess fibrotic NASH, namely MACK-3 (hoMa, Ast, CK18),12 NIS4,13 and the FibroScan-AST (FAST) score.14 However, these scores are based on blood tests available only in highly specialized liver clinics or require instrumental evaluation by vibration-controlled transient elastography.
In this study, we aimed to develop a simple non-invasive score based on routine laboratory tests to screen for and identify fibrotic NASH in individuals at high risk for NAFLD in primary healthcare.

Derivation cohort
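MAFALDA cohort. A total of 264 participants from the "Molecular Architecture of FAtty Liver Disease in individuals with obesity undergoing bAriatric surgery (MAFALDA)" study were included in the analyses.15 Briefly, consecutive individuals with morbid obesity eligible for bariatric surgery, without history of alcohol abuse (≥30/20 g/day in men/women), chronic viral hepatitis, or other causes of liver disease, were recruited from May 2020 to June 2021 at Campus Bio-Medico University Hospital, Rome, Italy. Preoperative clinical and laboratory data were collected using standardized procedures. Intraoperative liver biopsy was obtained and scored according to the NAS classification.16 NASH was diagnosed when steatosis, ballooning, and lobular inflammation were each at least grade one.17 Fibrotic NASH was defined as NASH, NAS ≥4, and fibrosis stage ≥2. The MAFALDA study was approved by the Local Research Ethics Committee (no. 16/20) and was conducted in accordance with the principles of the Declaration of Helsinki. All participants gave written informed consent to the study.

External validation cohorts
Helsinki cohort. A total of 328 consecutive individuals with morbid obesity eligible for bariatric surgery and 42 consecutive individuals with body mass index (BMI) ≥25 kg/m² undergoing liver biopsy for suspected NASH were recruited between 2006 and 2018 at Helsinki University Hospital, Helsinki, Finland. All participants were 18-75 years old, without history of alcohol abuse (≥30/20 g/day in men/women), chronic viral hepatitis, or other causes of liver disease. A week before liver biopsy, participants underwent clinical examination and blood sampling as previously described.18 Liver biopsies were scored according to the NAS classification.16 NASH was diagnosed when steatosis, lobular inflammation, and ballooning were each at least grade one.19 Fibrotic NASH was defined as NASH, NAS ≥4, and fibrosis stage ≥2. The study was approved by the Local Research Ethics Committee at Helsinki University Hospital. All participants gave written informed consent to the study.
Liver Bible cohort. A total of 947 consecutive individuals with dysmetabolism (at least three criteria among overweight [BMI >25 kg/m²], hypertension [>130/85 mmHg or use of medication], hyperglycemia [>100 mg/dL], low high-density lipoprotein [HDL] cholesterol [<45/55 mg/dL in men/women], and increased triglycerides [>150 mg/dL]) were recruited from July 2019 to July 2021 at the Transfusion Center, Fondazione Ca' Granda Hospital, Milan, Italy.20,21 All participants were 18-65 years old, without history of alcohol abuse (≥30/20 g/day in men/women), chronic viral hepatitis, or other causes of liver disease, and were enrolled as part of a preventive medicine program among blood donors. Liver steatosis and fibrosis were non-invasively assessed by vibration-controlled transient elastography and controlled attenuation parameter (CAP) with FibroScan® (Echosens, Paris, France), which was performed at the time of biochemical tests. Individuals at risk of fibrotic NASH were defined as those with FAST score >0.35.14 The study was approved by the Local Research Ethics Committee at the Fondazione IRCCS Ca' Granda. All participants gave written informed consent to the study.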
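The histological definition used in the biopsy cohorts (NASH, NAS ≥4, and fibrosis stage ≥2) can be sketched as a small check. This is an illustrative sketch only: the class and function names are hypothetical, and the grade ranges in the comments follow the usual NAS convention (NAS as the sum of the three component grades) rather than anything stated in this text.

```python
from dataclasses import dataclass


@dataclass
class Biopsy:
    steatosis: int             # grade 0-3 (assumed conventional range)
    lobular_inflammation: int  # grade 0-3 (assumed conventional range)
    ballooning: int            # grade 0-2 (assumed conventional range)
    fibrosis_stage: int        # stage 0-4 (assumed conventional range)


def nas(b: Biopsy) -> int:
    """NAFLD activity score: unweighted sum of the three component grades."""
    return b.steatosis + b.lobular_inflammation + b.ballooning


def has_nash(b: Biopsy) -> bool:
    """NASH requires at least grade 1 in every component."""
    return min(b.steatosis, b.lobular_inflammation, b.ballooning) >= 1


def has_fibrotic_nash(b: Biopsy) -> bool:
    """Fibrotic NASH: NASH, NAS >= 4, and fibrosis stage >= 2."""
    return has_nash(b) and nas(b) >= 4 and b.fibrosis_stage >= 2
```

Note that a biopsy can reach NAS ≥4 without ballooning (for example steatosis 3 and lobular inflammation 1), which is why the NASH condition is checked separately from the NAS threshold.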

UK Biobank cohort.
The UK Biobank is a large prospective cohort study that recruited approximately 500,000 participants (age 40-69 years) between 2006 and 2010 throughout the UK.22 The UK Biobank study has been approved by the North West Multicenter Research Ethics Committee (no. 11/NW/0274). All participants gave written informed consent to the study.
First, we selected unrelated UK Biobank participants of European ancestry based on our quality control pipeline, which has been described in detail previously.20,23 Next, we included in our analyses only individuals with BMI ≥25 kg/m² and/or with type 2 diabetes as defined elsewhere.24 Then, to assess the performance of our score for fibrotic NASH, we selected 5,368 individuals without chronic viral hepatitis and with liver magnetic resonance imaging (MRI) proton density fat fraction (PDFF) and iron-corrected T1 (cT1) measurements available.25,26 Fibrotic NASH was defined as steatosis by PDFF >5.5%,25 NASH by cT1 >800 msec,27 and significant fibrosis by Fibrosis-4 (FIB-4) index ≥1.3.28 Finally, to assess the performance of our score for incident severe liver disease (SLD),24 after excluding participants with MRI data available, we selected 305,745 individuals without liver disease at baseline and identified those who developed SLD prospectively. Detailed information about the UK Biobank methods is provided in supplementary material.
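The imaging-based surrogate for fibrotic NASH in the UK Biobank combines three thresholds. A minimal sketch follows, assuming the standard FIB-4 formula (age × AST / (platelets × √ALT)), which this text cites but does not spell out. Function and parameter names are hypothetical.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_1e9_l: float) -> float:
    """Fibrosis-4 index: age x AST / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def mri_fibrotic_nash(pdff_pct: float, ct1_ms: float, fib4_value: float) -> bool:
    """Surrogate definition: PDFF > 5.5 %, cT1 > 800 ms, and FIB-4 >= 1.3."""
    return pdff_pct > 5.5 and ct1_ms > 800 and fib4_value >= 1.3
```

For instance, a 60-year-old with AST 40 U/L, ALT 25 U/L, and platelets 200 × 10⁹/L has a FIB-4 of 2.4, so the fibrosis criterion is met and the classification then depends on PDFF and cT1.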

Statistical analyses
The score was developed based on 264 morbidly obese individuals in the derivation cohort and internally validated using a bootstrapping stepwise logistic regression model (2000 bootstrap samples). A total of 15 predictors were included in the model: age, gender, BMI, waist circumference, glucose, hemoglobin A1c (HbA1c), total cholesterol, HDL cholesterol, triglycerides, aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma glutamyltransferase (GGT), platelet count, albumin, and total bilirubin. Logarithmic transformation was considered for continuous variables to improve the normality of distribution. Two (0.8%) individuals were removed from the analysis due to missing values.
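The text does not say how results from the 2000 bootstrap samples were combined. One common reading, shown here purely as an assumption, is to rerun the selection procedure on each resampled data set and record how often each predictor is retained. In this sketch `select` is a hypothetical placeholder for the stepwise logistic regression step, not the authors' implementation.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def bootstrap_selection_frequency(
    rows: Sequence[dict],
    select: Callable[[Sequence[dict]], Sequence[str]],
    n_boot: int = 2000,
    seed: int = 0,
) -> dict:
    """Fraction of bootstrap samples in which each predictor is selected.

    `select` stands in for the stepwise regression: it receives one
    resampled data set and returns the names of the retained predictors.
    """
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_boot):
        # Resample n rows with replacement from the original data.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        # Count each predictor at most once per bootstrap sample.
        counts.update(set(select(sample)))
    return {name: counts[name] / n_boot for name in counts}
```

Predictors retained in a large share of resamples would then be candidates for the final model; the threshold for "large" is a modelling choice that is not specified here.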

The score was derived based on the final predictors and the corresponding regression coefficients. Performance for fibrotic NASH was assessed by the area under the receiver operating characteristic curve (AUROC) in the derivation and validation cohorts. Rule-out and rule-in cut-offs were derived in the derivation cohort based on sensitivity ≥0.89 and specificity ≥0.90, respectively. The cut-off based on the maximal sum of sensitivity and specificity (Youden index) was also determined. At each cut-off, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were computed together with the 95% confidence interval (CI). AUROCs were compared using the DeLong test. Calibration was assessed in the derivation cohort using the Hosmer-Lemeshow goodness-of-fit test and a calibration plot.
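The cut-off metrics described above can be illustrated with a short sketch. It assumes that a score at or above the cut-off counts as a positive test and that both classes are present in the data. Function names are hypothetical and confidence intervals are omitted.

```python
from typing import Sequence


def confusion(scores: Sequence[float], labels: Sequence[int], cutoff: float):
    """Count TP, FP, TN, FN when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    return tp, fp, tn, fn


def metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV, and NPV at one cut-off."""
    tp, fp, tn, fn = confusion(scores, labels, cutoff)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


def youden_cutoff(scores, labels):
    """Observed score that maximizes sensitivity + specificity - 1."""
    def j(c):
        m = metrics(scores, labels, c)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(scores)), key=j)
```

A rule-out cut-off would be chosen the same way, by scanning candidate cut-offs for one that keeps sensitivity at or above the target, and a rule-in cut-off by keeping specificity at or above its target.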
Performance for incident SLD in the UK Biobank was estimated by AUROC of Cox proportional hazards models. Statistical analyses were performed using the software R, version 4.0.4 (R Foundation for Statistical Computing, Vienna, Austria).
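For a binary outcome such as fibrotic NASH, the AUROC equals the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as one half. The sketch below uses that pairwise definition. It is not the R routine used in the study, and it ignores censoring, so it does not reproduce the Cox-model analysis of incident SLD.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random non-case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # case ranked above non-case
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in cohort size, which is fine for illustration; a rank-based implementation would be preferable for cohorts with thousands of participants.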

Clinical characteristics of derivation and external validation cohorts
Clinical characteristics of derivation and external validation cohorts are shown in Table 1. The two histological cohorts (MAFALDA and Helsinki cohorts) were well matched for age and gender, while the Liver Bible and UK Biobank cohorts had higher mean age and a higher proportion of men. Biochemical parameters were similar across the cohorts. The Liver Bible cohort had the highest rate of hypertension (74% vs 41-63%), whereas the Helsinki cohort had the highest rate of type 2 diabetes (38% vs 4-16%). Biopsy-proven NASH was diagnosed in 42% of individuals in the derivation cohort and in 12% of individuals in the Helsinki cohort. Fibrotic NASH was reported in 20% of individuals in the derivation cohort and in 2-5% of individuals in the external validation cohorts.

Development of a prediction model for fibrotic NASH
Bootstrapping stepwise logistic regression analysis identified three final independent predictors of fibrotic NASH: AST, HDL cholesterol, and HbA1c. Based on the corresponding regression coefficients, the following index, the Fibrotic NASH Index (FNI), was derived:
The FNI is a predicted probability score and ranges from 0 to 1. As an example, an individual with an FNI of 0.10 would have a 10% predicted probability of fibrotic NASH (NASH + NAS ≥4 + F ≥2). The FNI can be easily calculated on the following website: https://fniscore.github.io/.
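The fitted coefficients are not given in the text here. As a sketch of the general shape such a score takes, assuming a standard logistic model on the three retained predictors, the FNI can be written as below. The intercept and slopes β are placeholders rather than published values, and g is an unspecified transform (for example the natural logarithm, which the statistical methods say was considered for continuous variables).

```latex
\mathrm{FNI} = \frac{1}{1 + e^{-\eta}}, \qquad
\eta = \beta_0 + \beta_1\, g(\mathrm{AST}) + \beta_2\, g(\mathrm{HDL}) + \beta_3\, g(\mathrm{HbA1c})
```

Because the logistic function maps any real value of η into the interval (0, 1), a score of this form can be read directly as a predicted probability, consistent with the interpretation given above.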
In the derivation cohort, the performance of FNI for fibrotic NASH estimated by AUROC was 0.78 (95% CI 0.71-0.85), with satisfactory calibration of predicted probabilities (Figure 1). In the external validation cohorts, AUROCs ranged from 0.80 to 0.95 (Table 2). In the derivation cohort, the cut-off for sensitivity ≥0.89 (rule-out zone) was 0.10, with an NPV of 0.93. The cut-off for specificity ≥0.90 (rule-in zone) was 0.33, with a PPV of 0.57 (Table 2). When applying these cut-offs to the external validation cohorts, at the rule-out cut-off of 0.10, sensitivity ranged from 0.87 to 1, with an NPV between 0.99 and 1; at the rule-in cut-off of 0.33, specificity ranged from 0.73 to 0.98, with a PPV between 0.12 and 0.49 (Table 2).
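The two cut-offs split the score into three zones. A minimal sketch of that triage follows, with hypothetical names. Treating a value exactly equal to 0.33 as rule-in is an assumption, since the text only gives the cut-off itself.

```python
RULE_OUT = 0.10  # derivation-cohort cut-off for sensitivity >= 0.89
RULE_IN = 0.33   # derivation-cohort cut-off for specificity >= 0.90


def fni_zone(fni: float) -> str:
    """Map an FNI probability to a triage zone.

    Values exactly at the rule-in cut-off are treated as rule-in,
    which is an assumption of this sketch.
    """
    if not 0.0 <= fni <= 1.0:
        raise ValueError("FNI is a probability and must lie in [0, 1]")
    if fni <= RULE_OUT:
        return "rule-out"
    if fni >= RULE_IN:
        return "rule-in"
    return "indeterminate"
```

Under this reading, a score of 0.20 would fall in the indeterminate zone between the two cut-offs.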
The performance of FNI and FIB-4 for fibrotic NASH was compared in the derivation cohort and two external validation cohorts (Figure 2, Table 2). Corresponding AUROCs were higher for FNI in the derivation and Liver Bible cohorts (p=0.001 and p=3.08×10⁻⁸, respectively), whereas no difference was found between the two scores in the Helsinki cohort (p=0.85).

During a median (interquartile range) follow-up of 9.0 (8.3-9.7) years, there were 1,054 individuals who developed SLD, including 928 with cirrhosis and/or decompensated liver disease, 126 with hepatocellular carcinoma, and 18 who underwent liver transplantation. Death from SLD occurred in 542 individuals.

Discussion
In this study, we developed and validated the FNI, a novel and simple non-invasive score for detecting fibrotic NASH among individuals at high risk for NAFLD, namely those with overweight/obesity, type 2 diabetes, and metabolic syndrome. Notably, this is the first score tailored for fibrotic NASH based on routine laboratory tests, namely AST, HDL cholesterol, and HbA1c.
We started by examining MAFALDA, a cross-sectional cohort of morbidly obese individuals in whom the diagnosis of fibrotic NASH was assessed by histology. In MAFALDA, we generated and internally validated a prediction model for fibrotic NASH by using a bootstrapping stepwise regression analysis. We found that AST, HDL cholesterol, and HbA1c were the best independent predictors of this condition. Consistently, elevated AST is a well-known biomarker of liver fibrosis,29 whereas HbA1c and HDL cholesterol both flag the presence of dysmetabolism, given their correlation with insulin resistance and impaired glucose tolerance.30,31 In the derivation cohort, this model showed good performance in predicting fibrotic NASH, with an AUROC of 0.78 (0.71-0.85).
Next, we validated our prediction model in three independent external cohorts comprising individuals with overweight/obesity, type 2 diabetes, and metabolic syndrome. In these cohorts, irrespective of the methodology used to assess fibrotic NASH (liver biopsy, vibration-controlled transient elastography including CAP, or liver MRI), the performance of our score was very good, with an AUROC range of 0.80-0.95. Notably, one of the external validation cohorts included more than 5,000 high-risk individuals from the UK Biobank.
Existing non-invasive clinical scores are focused on detecting advanced fibrosis, the most relevant predictor of mortality in NAFLD.10 However, the degree of liver inflammation is a crucial driver of liver damage.32 In this scenario, the presence of NASH with significant activity (NAS ≥4) has been identified as an essential condition for enrollment in NAFLD clinical trials.9 This is mainly due to two reasons: 1) the histological response to drug therapy is higher in individuals with an active disease,33 and 2) the inclusion of individuals with fibrotic NASH is more likely to ensure that the estimated number of clinical events will occur during the study observation period. Along this line, the presence of an active liver disease is expected to be included among the prescribing criteria of new emerging pharmacotherapies once they become available. Within this context, FNI may also be used as a longitudinal biomarker to non-invasively monitor the effectiveness of interventional strategies for NASH.
Very recently, three non-invasive scores have been generated to detect fibrotic NASH: two blood-based, MACK-3 12 (AST, glucose, insulin, cytokeratin 18) and NIS4 13 (miR-34a-5p, alpha-2 macroglobulin, YKL-40, HbA1c), and the transient elastography-based FAST score (AST, CAP, liver stiffness measurement).14 The accuracy of these scores for fibrotic NASH was good and comparable to that of FNI, with AUROCs ranging from 0.80 to 0.85. However, these scores are based on blood or instrumental tests that are relatively expensive and/or not widely available in primary care. Consequently, although FibroScan® is increasingly used worldwide, screening for fibrotic NASH in large at-risk populations in primary care using these scores appears to be impractical and costly.
Would the FNI score be a viable option to screen for fibrotic NASH in large at-risk populations? Within this context, the risk stratification pathway recently proposed by the European Association for the Study of the Liver (EASL) recommended a FIB-4 cut-off <1.3 to rule out those not needing a referral to the liver specialist.34 In individuals with metabolic risk factors from the general population, an FNI value ≤0.10 (rule-out zone) would exclude the presence of fibrotic NASH with high sensitivity and high NPV. Importantly, in both derivation and external validation cohorts, at least one out of five individuals belonged to the rule-out zone, thus avoiding further referral to the liver specialist. Notably, the FNI cut-off of 0.10 had a higher sensitivity for fibrotic NASH as compared to the FIB-4 cut-off of 1.3. Consequently, in the general population with metabolic risk factors, risk stratification using FNI as opposed to FIB-4 would miss fewer individuals with fibrotic NASH. Importantly, these individuals may require and benefit the most from a prompt intervention in liver clinics due to the presence of an active disease at higher risk of liver-related outcomes. Consistently, we found that, during a median follow-up of 9 years, FNI was more accurate than FIB-4 for predicting incident SLD. However, it is fair to say that FIB-4 has been generated to assess liver fibrosis and the 1.3 cut-off is used to rule out advanced fibrosis rather than progressive NASH.34
Conversely, PPV for fibrotic NASH was rather low in the FNI rule-in zone. This is mainly due to the low prevalence of fibrotic NASH in the cohorts used in our study. Indeed, the performance of any disease predictive model is highly dependent on the prevalence of the disease in the referral population.34 Although FNI was generated and validated in individuals at high risk for NAFLD, the prevalence of fibrotic NASH in these individuals was relatively low. However, the performance of the FNI rule-in cut-off is expected to be higher in individuals from secondary/tertiary care centers where the prevalence of advanced fibrosis is higher. Further studies are warranted to assess the performance of FNI in these settings.
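The dependence of PPV on prevalence follows from Bayes' rule and can be made concrete with a short calculation. The sensitivity and specificity below are hypothetical values chosen for illustration, not figures reported for FNI.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, held fixed while prevalence varies.
for prev in (0.02, 0.05, 0.20, 0.40):
    print(f"prevalence={prev:.2f}  PPV={ppv(0.50, 0.90, prev):.2f}")
```

With sensitivity and specificity held constant, PPV rises steadily as prevalence increases, which is the mechanism behind the expectation that the rule-in cut-off would perform better in secondary or tertiary care.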
Collectively, our data support that FNI may be useful for ruling out rather than diagnosing fibrotic NASH in at-risk individuals in primary healthcare and diabetology/endocrinology clinics. Individuals with indeterminate and positive results would deserve referral to a liver clinic for further investigations and follow-up.
The present study has several strengths. First, we used a large and well-characterized derivation cohort with liver biopsy data available. Second, we developed for the first time a predictive model for fibrotic NASH based on routine and widely available laboratory tests which are commonly evaluated in individuals with metabolic risk factors. Third, we validated our findings in three independent and large external validation cohorts. Among them, one included more than 5,000 individuals from the UK Biobank.
Our study also has some limitations. First, FNI has been specifically designed and validated in individuals with dysmetabolism and not in those referred for NAFLD in secondary/tertiary liver care settings. Therefore, its performance should be further verified before being used in this context. Second, we could not compare FNI with other non-invasive blood-based scores for fibrotic NASH, such as MACK-3, because they were not available in most cohorts.
In conclusion, we developed and validated the FNI, an accurate, simple, and affordable non-invasive score for fibrotic NASH based on routine laboratory tests, namely AST, HDL cholesterol, and HbA1c. This score may help clinicians identify at-risk individuals in primary healthcare and diabetology/endocrinology clinics who require a referral to the liver specialist.

Abbreviations: ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; GGT, gamma glutamyltransferase; HbA1c, hemoglobin A1c; HDL, high-density lipoprotein; LDL, low-density lipoprotein; MRI, magnetic resonance imaging; NA, not available; NAS, NAFLD Activity Score; NASH, non-alcoholic steatohepatitis.
Fibrotic NASH was defined as NASH, NAS ≥4, and fibrosis stage ≥2. The MAFALDA study has been approved by the Local Research Ethics Committee (no. 16/20) and it was conducted in accordance with the principles of the Declaration of Helsinki. All participants gave written informed consent to the study.

External validation cohorts
Helsinki cohort. A total of 328 consecutive individuals with morbid obesity eligible for bariatric surgery and 42 consecutive individuals with body mass index (BMI) ≥25 kg/m² undergoing liver biopsy for suspected NASH were recruited between 2006 and 2018 at
Fibrotic NASH was defined as NASH, NAS ≥4, and fibrosis stage ≥2. The study was approved by the Local Research Ethics Committee at Helsinki University Hospital. All participants gave written informed consent to the study.
Liver Bible cohort. A total of 947 consecutive individuals with dysmetabolism (at least three criteria among overweight [BMI >25 kg/m²], hypertension [>130/85 mmHg or use of medication], hyperglycemia [>100 mg/dL], low high-density lipoprotein [HDL] cholesterol [<45/55 mg/dL in men/women], and increased triglycerides [>150 mg/dL]) were recruited from July 2019 to July 2021 at the Transfusion Center, Fondazione Ca' Granda Hospital, Milan, Italy.20,21 All participants were 18-65 years old, without history of alcohol abuse (≥30/20 g/day in men/women), chronic viral hepatitis, and other causes of liver disease, and were enrolled as part of a preventive medicine program among blood donors. Liver steatosis and fibrosis were non-invasively assessed by vibration-controlled transient elastography and controlled attenuation parameter (CAP) with FibroScan® (Echosens, Paris, France), which was performed at the time of biochemical tests. Individuals at-risk of fibrotic NASH were defined as those

Figure 1. Diagnostic performance of FNI for fibrotic NASH in the MAFALDA cohort

Figure 2. ROC curves for fibrotic NASH by FNI and FIB-4 in the (A) MAFALDA cohort

Figure 3. ROC curves for incident severe liver disease by FNI and FIB-4 in the UK

Table 1. Clinical characteristics of derivation and external validation cohorts.
Continuous variables are shown as mean (SD) or median (IQR) as appropriate. Categorical variables are shown as number (percentage).

Table 3. Diagnostic performance of FNI and FIB-4 for incident severe liver disease in the UK Biobank (n=305,745).
HRs with 95% CIs were calculated by Cox proportional hazards models. Age, gender, and alcohol intake (g/day) were included in the multivariable models.