Hybrid approach to reducing estimating overfitting and collinearity

Bo Xiong, Sidney Newton, Vera Li, Martin Skitmore*, Bo Xia

*Corresponding author for this work

Research output: Contribution to journal › Article › Research › peer-review

5 Citations (Scopus)
65 Downloads (Pure)


Purpose: This paper presents an approach to the overfitting and collinearity problems that frequently occur in predictive cost estimating models for construction practice. A case study modeling the cost of preliminaries is used to test the robustness of the approach.

Design/methodology/approach: A hybrid approach is developed based on the Akaike information criterion (AIC) and principal component regression (PCR). Cost information is collected for a sample of 204 UK school building projects, covering elemental items, contingencies (risk) and the contractors’ preliminaries. An application to estimating the cost of preliminaries for construction projects demonstrates the method and tests its effectiveness against competing models: alternative regression models, three artificial neural network data mining techniques, case-based reasoning and support vector machines.
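The abstract does not spell out how the AIC and PCR are combined, so the following is only a minimal sketch under one plausible reading: PCR is fitted for each possible number of principal components, and the AIC picks the number to retain. The function names (`gaussian_aic`, `fit_pcr_by_aic`, `predict_pcr`), the Gaussian least-squares form of the AIC, and the synthetic data are all assumptions for illustration; none of them comes from the paper or its data set.

```python
import numpy as np


def gaussian_aic(y, y_hat, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + 2 * n_params


def fit_pcr_by_aic(X, y):
    """Fit PCR for every component count k and keep the fit with the lowest AIC."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    Z = (X - mean) / std  # standardise the predictors
    # Principal directions are the right singular vectors of Z,
    # ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    best = None
    for k in range(1, Vt.shape[0] + 1):
        scores = Z @ Vt[:k].T  # mutually uncorrelated component scores
        A = np.column_stack([np.ones(len(y)), scores])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        # Parameters counted: k slopes + intercept + error variance.
        aic = gaussian_aic(y, A @ coef, n_params=k + 2)
        if best is None or aic < best["aic"]:
            best = {"k": k, "aic": aic, "coef": coef,
                    "components": Vt[:k], "mean": mean, "std": std}
    return best


def predict_pcr(model, X):
    """Project new predictors onto the retained components and apply the fit."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["components"].T
    return model["coef"][0] + scores @ model["coef"][1:]


# Synthetic, deliberately collinear data (hypothetical; not the paper's sample).
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 3))
X = np.column_stack([
    base,
    base[:, 0] + 0.01 * rng.normal(size=n),  # near-duplicate of column 0
    base[:, 1] + 0.01 * rng.normal(size=n),  # near-duplicate of column 1
])
y = 2.0 + base @ np.array([1.5, -1.0, 0.5]) + 0.3 * rng.normal(size=n)

model = fit_pcr_by_aic(X, y)
pred = predict_pcr(model, X)
print("components kept:", model["k"])
```

Regressing on component scores rather than on the raw predictors sidesteps collinearity, because the scores are uncorrelated by construction, while the AIC penalty on the parameter count discourages keeping components that only fit noise.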

Findings: The experimental results show that the AIC–PCR approach provides good predictive accuracy compared with the alternatives used, and is a promising way to avoid overfitting and collinearity.

Originality/value: This is the first time an approach integrating the AIC and PCR has been developed to offer an improvement on existing methods for estimating construction project preliminaries. The hybrid approach not only reduces the risk of overfitting and collinearity, but also results in better predictability compared with the commonly used stepwise regression.

Original language: English
Pages (from-to): 2170-2185
Number of pages: 16
Journal: Engineering, Construction and Architectural Management
Issue number: 10
Publication status: Published - 18 Sept 2019
Externally published: Yes

