-
Research Article
A Comparative Evaluation of Kaplan-Meier, Cox Proportional Hazards, and Random Survival Forests for Neonatal Mortality Prediction
Issue:
Volume 13, Issue 2, December 2025
Pages:
42-59
Received:
2 September 2025
Accepted:
12 September 2025
Published:
27 October 2025
Abstract: Neonatal mortality remains a critical public health challenge, particularly in low- and middle-income countries (LMICs), where limited healthcare resources and fragmented follow-up systems hinder timely interventions. Accurate prediction of neonatal death is essential for risk stratification, resource allocation, and improving survival outcomes. Traditional survival analysis methods such as the Kaplan-Meier estimator and the Cox proportional hazards (Cox PH) model are widely used, but they are limited in handling non-linear relationships, high-dimensional data, and violations of the proportional hazards assumption. Random Survival Forests (RSF), a machine learning approach, offer potential advantages but lack sufficient comparative evaluation in neonatal mortality prediction, especially within LMIC contexts. This study aimed to comparatively evaluate the performance of Kaplan-Meier, Cox PH, and RSF models in predicting neonatal mortality using a synthetic dataset reflecting perinatal epidemiology in Kenya. The research addresses a significant gap: the lack of direct methodological comparisons across these models in neonatal populations, particularly under real-world conditions involving censoring, missing data, and non-proportional hazards. We assessed discrimination (C-index, time-dependent AUC), calibration (Integrated Brier Score, CRPS), and clinical interpretability. The dataset included 2,000 neonates with 17 covariates, including gestational age, birth weight, maternal health, and socioeconomic status. Results showed that RSF outperformed both Kaplan-Meier and Cox PH in discrimination (C-index: 0.875 vs. 0.868) and maintained strong calibration, particularly at 28 days. Variable importance measures identified gestational age, birth weight, and maternal health score as top predictors. SHAP values enhanced the interpretability of RSF outputs. The Cox model provided clinically intuitive hazard ratios but was less flexible in capturing interactions.
The study concluded that RSF offers superior predictive accuracy for neonatal mortality and should be integrated into risk prediction tools, especially in data-rich settings. Policy makers should support adoption of advanced analytics in perinatal care systems, while maintaining traditional models for inferential clarity. Combining both paradigms can optimize neonatal survival strategies.
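For readers unfamiliar with the C-index reported in this abstract, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is illustrative only and is not the authors' implementation; the function name `concordance_index` and the toy inputs are assumptions, and pairs with tied event times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time for each subject
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when subject i died before subject j's
            # observed time; tied times are ignored in this sketch.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study)
c = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of risk against survival time.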
-
Research Article
Genetic Diversity and Population Structure of South Ethiopian Arabica Coffee [Coffea arabica L.] Genotypes Using ISSR Markers
Habtamu Gebreselassie*, Bizuayehu Tesfaye, Andargachewu Gedebo, Yayis Rezene
Issue:
Volume 13, Issue 2, December 2025
Pages:
60-71
Received:
15 September 2025
Accepted:
29 September 2025
Published:
30 October 2025
Abstract: Arabica coffee originated and diversified in Ethiopia, yet its considerable genetic diversity remains underutilized. This study assessed the genetic diversity and population structure of 50 Arabica coffee genotypes representing five populations (Sidama, Amaro, Jinka, Guji, and improved varieties) using inter-simple sequence repeat (ISSR) markers. The populations produced 74 distinct bands, with improved varieties showing the highest number of private bands (8) and the lowest number of common bands (≤50%), at 6. Band frequency ranged from 8.62% (Guji) to 25.86% (improved varieties), averaging 17.93%. The ranges of the genetic diversity parameters, namely number of alleles per population, effective alleles, Shannon’s information index, observed diversity, and unbiased diversity, were 0.276-0.672, 1.063-1.149, 0.052-0.12, 0.036-0.082, and 0.039-0.092, respectively. AMOVA revealed significant genetic variability, with 67% of the variation among populations and 33% within populations. Principal coordinate analysis explained 42.96% of total variation across three axes. UPGMA cluster analysis grouped the genotypes into four clusters (I-IV) containing 20%, 28%, 12%, and 40% of the genotypes, respectively, with genotypes from the same populations clustering together. Overall, the study demonstrated substantial genetic variation and population structure among South Ethiopian Arabica coffee genotypes, highlighting the potential for conservation and breeding efforts. Future studies should incorporate high-resolution markers and broader accession sets to better capture the genetic landscape of Ethiopian Arabica coffee.
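As an illustration of one of the diversity parameters named in this abstract, the sketch below computes Shannon's information index for binary (band present/absent) marker data using one common formulation, I = -(p ln p + q ln q) per locus, averaged over loci. The abstract does not state which software or formula the authors used, so this is an assumption; the function name `shannon_index` and the toy matrix are hypothetical.

```python
import math


def shannon_index(band_matrix):
    """Mean Shannon information index over loci for 0/1 band data.

    band_matrix -- list of rows, one per genotype; each row is a list of
                   0/1 values, one per ISSR band (locus).
    """
    n_individuals = len(band_matrix)
    n_loci = len(band_matrix[0])
    total = 0.0
    for locus in range(n_loci):
        # p = band (presence) frequency, q = 1 - p
        p = sum(row[locus] for row in band_matrix) / n_individuals
        for freq in (p, 1.0 - p):
            if freq > 0:  # treat 0 * ln(0) as 0
                total -= freq * math.log(freq)
    return total / n_loci


# Toy data (hypothetical): 2 genotypes scored at 2 bands
toy_index = shannon_index([[1, 0], [0, 0]])
```

Monomorphic bands (present in all or in no genotypes) contribute zero, so low index values indicate that few bands vary within a population.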
-
Research Article
Structure-guided Genome-wide Association Analysis of ALK Variants with GWAS Data Using R
Issue:
Volume 13, Issue 2, December 2025
Pages:
72-85
Received:
5 November 2025
Accepted:
18 November 2025
Published:
11 December 2025
DOI:
10.11648/j.cbb.20251302.13
Abstract: Anaplastic lymphoma kinase (ALK) has been linked to several hematological malignancies; however, its comprehensive genetic variability and potential disease associations are not fully understood. In this study, a structure-guided genome-wide association analysis (GWAS) of ALK variants was performed using publicly available summary statistics and R-based analytical pipelines. The GWAS datasets were acquired, filtered, and ranked based on sample size to ensure sufficient statistical power. A focused analysis was then carried out on two distinct datasets, selected for sample size and phenotypic diversity: one representing lymphoma-related genetic traits from the UK Biobank, and another capturing ALK-associated proteomic variation. Rigorous quality control and comprehensive data visualization were performed using a set of diagnostic and analytical plots, including volcano plots, QQ plots, histograms, effect size distributions, and a correlation matrix heatmap of numerical variables. Regional Manhattan plots highlighted distinct, highly significant associations at the ALK locus in both datasets, enabling the identification of independent lead variants. Interpretation of the QQ plots and histograms confirmed adequate control for population stratification and minimal inflation of test statistics. Integration of insights from the effect size distribution and SE versus Beta plots provided a clear assessment of the precision and reliability of estimated genetic effects. By mapping genetic variants onto the ALK protein structure, single-nucleotide polymorphisms (SNPs) with potential functional relevance were prioritized, and their associations with disease phenotypes across populations were evaluated. This strategy facilitates the identification of variants likely to influence protein structure and function, thereby enhancing the interpretability of GWAS findings in a protein-centric context.
This approach demonstrates the power of integrating structural bioinformatics with statistical genetics to reveal novel genotype-phenotype relationships, offering valuable insights for precision medicine and targeted ALK-directed therapies. Overall, this integrative methodology establishes a reproducible framework for detailed regional GWAS analyses, pinpointing strong associations at the ALK locus and identifying candidate variants relevant to the phenotypes for subsequent functional validation and for assessing their potential role in therapeutic investigation for hematological malignancies.
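The "minimal inflation of test statistics" mentioned in this abstract is commonly quantified with the genomic inflation factor, lambda. The sketch below shows one standard way to compute it from GWAS p-values; it is written in Python for illustration and is not the authors' R pipeline. The function name `genomic_inflation` and the evenly spaced test p-values are assumptions.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values.

    Each p-value is converted to a 1-df chi-square statistic via the
    normal quantile, then the observed median is divided by the median
    expected under the null. Requires 0 < p <= 1 for every p-value.
    """
    norm = NormalDist()
    # z = Phi^{-1}(p / 2) is the (negative) z-score; z**2 is chi-square, 1 df
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a well-calibrated null distribution
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values of lambda near 1 suggest little inflation, while values well above 1 can indicate uncorrected population stratification or other confounding.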
-
Research Article
Ways to Estimate the Minimal Specified Complexity of Reproduction
Issue:
Volume 13, Issue 2, December 2025
Pages:
72-83
Received:
7 November 2025
Accepted:
22 November 2025
Published:
11 December 2025
DOI:
10.11648/j.cbb.20251302.14
Abstract: Because the complexity of biological reproduction is difficult to estimate for even the simplest cell, the minimal complexities of several simpler self-replicating systems are computed instead. These include viruses (both biological and computer, which require additional complex resources to replicate). Also considered are a proposed self-reproducing factory and two conceptual self-replicators. Because even some of those are difficult to evaluate, extremely conservative lower bounds are computed. Still, in all cases, the minimal complexities are enormous: more than all the quantum state changes in the entire history of the known universe. Therefore, because biological reproduction is more complex, all origin-of-life proposals, including RNA hypotheses toward the first self-reproducing cell, must demonstrate how at least such minimal complexity could have accumulated by means other than simple trial and error.