Hepatocellular carcinoma (HCC) is associated with poor prognosis and fluctuations in immune status. Although studies have found that secreted phosphoprotein 1 (SPP1) is involved in HCC progression, its independent prognostic value and immune-mediated role remain unclear. Using data from The Cancer Genome Atlas and Gene Expression Omnibus, we found that low SPP1 expression is significantly associated with improved survival of HCC patients and that SPP1 expression correlates with clinical characteristics. Univariate and multivariate Cox regression confirmed that SPP1 is an independent prognostic factor in HCC. Subsequently, we found that CD4 memory-activated T cells, monocytes, M0 macrophages, and resting mast cells differed significantly in infiltration between the high and low SPP1 expression groups. Next, we used Weighted Gene Co-Expression Network Analysis and the Least Absolute Shrinkage and Selection Operator algorithm to construct a risk score based on a 9-immune-related-gene signature. The risk score showed a good ability to distinguish high-risk from low-risk patients and improved survival prediction. Multivariate Cox regression further validated that the risk score was significantly correlated with SPP1 and overall survival. Lastly, a Back-Propagation Neural Network confirmed the reliability of the results from the multiple algorithms. In conclusion, the findings suggest that SPP1 is an independent marker of HCC survival and immunotherapy.

Hepatocellular carcinoma (HCC) ranks among the top malignant tumors in terms of both morbidity and mortality (

Recent research on various tumor immune escape mechanisms has enabled immunotherapy to suppress the development of malignant tumors (

The gene that encodes secreted phosphoprotein 1 (SPP1), also known as osteopontin (OPN), is located on human chromosome 4q22.1 (

In our previous studies, we identified various biomarkers and potential therapeutic targets that modulate inflammation using experimental and bioinformatics methods. This approach relies on literature searches (PubMed) and online software (Gene Expression Profiling Interactive Analysis, GEPIA2, and Encyclopedia of RNA Interactomes) to determine the

HCCseq-Counts and clinical information were downloaded from the Cancer Genome Atlas (TCGA) database (

The ESTIMATE algorithm evaluates immune abundance and tumor purity in the tumor microenvironment. CIBERSORT performs statistical analysis on the transcriptome expression profiles of complex tissues (such as bulk solid tumors). It uses a deconvolution method to remove unknown mixture content and estimate the relative proportions of 22 immune cell subpopulations. In this study, we used the CIBERSORT analysis tool (

For differentially expressed gene (DEG) screening and WGCNA, we first used the limma package to screen DEGs (SPP1 high expression group

A soft-threshold power β was chosen so that the network conformed to a scale-free topology (scale-free fit R^2 = 0.85), and the adjacency matrix A was constructed.
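The adjacency step can be illustrated with a minimal sketch. It assumes the common unsigned WGCNA definition a_ij = |cor(x_i, x_j)|^β; the source does not state whether a signed or unsigned network was used, and the toy expression profiles and the value β = 6 are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def soft_threshold_adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)|^beta (assumed definition).

    expr is a list of gene profiles, one list of sample values per gene.
    """
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i, g):
            a = 1.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


# Hypothetical toy data: 3 genes measured in 4 samples.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 1.0, 3.0, 2.0],
]
adjacency = soft_threshold_adjacency(expr, beta=6)
```

Raising the correlation to the power β suppresses weak correlations much more strongly than strong ones, which is what pushes the network toward a scale-free degree distribution.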

Subsequently, the adjacency matrix was converted into a topological overlap matrix (TOM).
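As an illustration of this conversion, the sketch below uses the standard unsigned topological overlap measure, TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums a_iu * a_uj over all other genes u and k_i is the connectivity of gene i. The source does not give its formula, so this definition is an assumption, and the small adjacency matrix is hypothetical.

```python
def topological_overlap(adj):
    """Unsigned topological overlap matrix from a symmetric adjacency matrix."""
    n = len(adj)
    # Connectivity k_i: sum of adjacencies to all other genes (self excluded).
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]  # diagonal is set to 1
    for i in range(n):
        for j in range(i + 1, n):
            # Shared-neighbor term l_ij over genes other than i and j.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            w = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = w
    return tom


# Hypothetical adjacency matrix for 3 genes.
adj = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
tom = topological_overlap(adj)
# 1 - TOM is the dissimilarity usually passed to hierarchical clustering.
dissimilarity = [[1.0 - w for w in row] for row in tom]
```

Two genes receive a high overlap when they are directly connected and also share strongly connected neighbors, which makes the measure less sensitive to noise in a single pairwise correlation.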

Finally, hierarchical clustering was performed on the module eigengenes (ME), the representative expression profile of each module, to construct module membership (MM).
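A rough sketch of how a module eigengene and module membership can be computed is given below. It assumes the usual WGCNA conventions: the eigengene is the first principal component of the standardized module expression matrix, and MM is the Pearson correlation between each gene and that eigengene. Power iteration stands in here for the singular value decomposition that a real implementation would use, and the toy module is hypothetical.

```python
import math


def standardize(profile):
    """Center a gene profile and scale it to unit sample variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / (n - 1))
    return [(v - mean) / sd for v in profile]


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def module_eigengene(module_expr, iterations=200):
    """First principal component across samples, found by power iteration."""
    z = [standardize(g) for g in module_expr]
    n_samples = len(z[0])
    # Start from the mean profile so the sign follows average expression.
    v = [sum(g[s] for g in z) / len(z) for s in range(n_samples)]
    if all(abs(x) < 1e-12 for x in v):
        v = list(z[0])  # fallback if the mean profile cancels out
    for _ in range(iterations):
        # Multiply by Z (gene scores), then by Z^T (back to sample space).
        scores = [sum(g[s] * v[s] for s in range(n_samples)) for g in z]
        v = [sum(z[i][s] * scores[i] for i in range(len(z)))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Hypothetical module: 3 genes measured in 5 samples.
module_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 2.0, 1.0],
]
eigengene = module_eigengene(module_expr)
membership = [pearson(gene, eigengene) for gene in module_expr]
```

Genes with an absolute MM close to 1 track the module eigengene closely. The sketch does not handle genes with constant expression, which would need to be filtered out beforehand.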

Univariate Cox regression was used to initially screen candidate genes closely related to patient survival (

Subsequently, the genes of the model signature were combined to establish a risk score (RS) that characterizes each patient's risk: RS = \sum_{i=1}^{n} COEF_i \times EXP_i. Here, n represents the number of genes, COEF_i represents the multivariable Cox regression coefficient of gene i, and EXP_i represents the expression value of gene i.
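The risk score can be sketched as a weighted sum of expression values, followed by the median cut-off that the study later uses to separate high-risk from low-risk patients. The gene names, coefficients, and expression values below are hypothetical placeholders, not the fitted model.

```python
from statistics import median


def risk_score(coefs, expression):
    """RS = sum over signature genes of COEF_i * EXP_i."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


# Hypothetical Cox coefficients for a three-gene signature.
coefs = {"GENE_A": 0.30, "GENE_B": -0.15, "GENE_C": 0.05}

# Hypothetical expression values per patient.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 4.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 2.0},
    "P3": {"GENE_A": 3.0, "GENE_B": 0.5, "GENE_C": 1.0},
    "P4": {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 6.0},
}

scores = {pid: risk_score(coefs, expr) for pid, expr in patients.items()}
cutoff = median(scores.values())
# Patients above the median score are labeled high-risk.
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
```

A positive coefficient means higher expression raises the score, while a negative coefficient lowers it, so the sign of each COEF_i indicates whether the gene acts as a risk or a protective factor in the model.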

We incorporated the immune features and RS obtained by the different algorithms into a multivariate Cox analysis to screen for significant candidate features. Subsequently, the candidate features were further verified using the artificial neural network function of MATLAB. A back-propagation neural network (BPNN) is a layered neural network comprising input, hidden, and output layers. In this study, the BPNN adjusted its weights to reduce the error between the predicted results and the target output: outputs are computed by forward propagation, and errors are propagated backward. The process was repeated iteratively until the error fell below the set threshold. The algorithm proceeded as follows:

(1) Normalization of eigenvalues.

(2) The input function of the neuron is as follows:

Here, w is the weight connecting neurons in two adjacent layers, and b is the bias of the hidden layer.

(3) The output function of the neuron is as follows:

(4) To find the parameters that minimize the error, the BPNN updates them along the direction of the negative gradient. The error formula was expressed as follows:

where s refers to the actual value (target value), t refers to the output value of the neural network, and o_i refers to the i-th output-layer neuron.
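The four steps above can be sketched as a small network with one hidden layer. This is an illustrative Python sketch, not the MATLAB implementation used in the study. The sigmoid activation, the squared-error form E = 0.5 * (s - t)^2, the learning rate, the number of hidden neurons, and the toy data are all assumptions, since the source formulas are not reproduced in the text.

```python
import math
import random


def min_max_normalize(column):
    """Step 1: scale a feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def sigmoid(z):
    """Assumed neuron output function."""
    return 1.0 / (1.0 + math.exp(-z))


def train_bpnn(inputs, targets, n_hidden=3, lr=0.5,
               threshold=1e-3, max_epochs=5000, seed=0):
    """Train by per-sample gradient descent; return the MSE of each epoch."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
          for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(max_epochs):
        sq_err = 0.0
        for x, s in zip(inputs, targets):
            # Steps 2 and 3: forward pass, net input w.x + b then activation.
            h = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            t = sigmoid(sum(w2[j] * h[j] for j in range(n_hidden)) + b2)
            sq_err += (s - t) ** 2
            # Step 4: backward pass along the negative gradient of E.
            delta_out = (t - s) * t * (1.0 - t)
            for j in range(n_hidden):
                delta_h = delta_out * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * delta_out * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * delta_h * x[i]
                b1[j] -= lr * delta_h
            b2 -= lr * delta_out
        mse = sq_err / len(inputs)
        history.append(mse)
        if mse < threshold:  # stop once the error is below the threshold
            break
    return history


# Hypothetical toy data: two raw features and a target already in [0, 1].
feature_1 = min_max_normalize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
feature_2 = min_max_normalize([10.0, 8.0, 6.0, 4.0, 2.0, 0.0])
inputs = [list(pair) for pair in zip(feature_1, feature_2)]
targets = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
history = train_bpnn(inputs, targets)
```

The returned list holds the mean squared error per epoch, so plotting it gives a training curve comparable in spirit to the performance plot that a neural network toolbox reports.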

We performed all bioinformatics analyses using the R v3.6.1 environment.

The expression of SPP1 in HCC tissues was significantly higher than in normal tissues (

To infer the independent prognostic value of SPP1, we included clinical information (age, gender, and stage) and SPP1 in the Cox regression analysis. Univariate Cox regression using the TCGA dataset showed that SPP1 (hazard ratio (HR) = 1.127,

To understand immune fluctuations in HCC, we evaluated the composition of immune cells in different expression groups of SPP1. The TCGA results showed that CD4 naïve T cells (

Using the level of SPP1 expression as grouping information, we identified 819 up-regulated and 128 down-regulated genes (

We identified 263 IRGs significantly related to overall survival (OS) through univariate Cox regression analysis. Subsequently, LASSO Cox in-depth analysis obtained 10 prognostic signatures of IRGs (

We used the median risk score (RS) value to divide TCGA (N_high-risk = 183, N_low-risk = 182) and GEO (N_high-risk = 121, N_low-risk = 121) patients into high-risk and low-risk groups. In the TCGA cohort, low-risk patients had significantly longer survival (

Next, we incorporated the immune features and risk features obtained by the different algorithms into the multivariate Cox regression analysis. The results showed that ImmuneScore (HR = 0.99918,

We used the BPNN model to further validate the risk score of multiple algorithms (ESTIMATE, CIBERSORT, WGCNA, and LASSO Cox). We first used the candidate features as input and the corresponding expression of SPP1 as output for BPNN prediction analysis. Subsequently, we divided the 365 samples into a training set (N = 255), validation set (N = 55), and test set (N = 55) to implement the neural network. The mean square error (MSE) was 0.00019679, and the best performance of the BPNN model was obtained at 5 epochs (

Immune regulation is vital in predicting tumor progression and prognosis (

To understand the relationship between SPP1 and tumor immunity, we used an expression matrix to evaluate the level of immune infiltration in the HCC microenvironment. Based on the analysis of immune data from different cohorts, we found significant differences in CD4 memory-activated T cells, monocytes, M0 macrophages, and resting mast cells. Notably, these immune cells are significantly related to the expression of SPP1. CD4 is the primary surface marker of helper T (Th) cells, which enhance the anti-infective effect mediated by phagocytes and the humoral immune response mediated by B lymphocytes. They also play an essential role in assisting CD8+ T cells and B cells in tumor immunity (

Based on the expression data of 947 DEGs, we identified 453 IRGs through WGCNA. We then constructed a 9-IRG risk signature through the stepwise Cox and LASSO algorithms to assess improvement in patient survival. Models built on high-dimensional sequencing data may over-fit, and the LASSO penalty mechanism can mitigate this shortcoming (

The 9-IRG signature (C5orf30, DNAJC6, MMP1, RGS20, S100A9, SLC1A5, SLC2A1, SOX11, STC2) we constructed showed good survival prediction performance. Among these genes, MMP1 had the most prominent predictive ability and is an independent risk factor for HCC. Matrix metalloproteinase-1 (MMP1) is a tumor-promoting factor associated with the progression of HCC, and its high expression corresponds to a poor prognosis (

As methods of generating information are updated and diversified, single algorithms (such as WGCNA and LASSO) often yield unsatisfactory predictions and can no longer meet demands (

Our results support SPP1 as an independent marker of overall survival in HCC, with low expression significantly associated with improved survival. Using various algorithms, we confirmed that SPP1 is significantly related to the immune microenvironment. Subsequently, based on the SPP1 expression grouping and immune characteristics, a risk score based on the 9-IRG signature was established, with excellent survival prediction performance. The BPNN further validated the reliability of the scores from the multiple algorithms. Therefore, SPP1 is a potential marker for HCC survival prediction. This was an exploratory study; therefore, the application value of SPP1 should be further verified by prospective multi-center clinical trials.

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

The authors confirm contribution to the paper as follows: draft manuscript preparation: Wenli Zeng; data collection: Feng Ling, Kainuo Dang; study conception and design: Qingjia Chi. All authors reviewed the results and approved the final version of the manuscript.

This work was supported by

All the authors declared no potential conflicts of interest.
