Manifesting construction activity scenes via image captioning

Huan Liu, Guangbin Wang, Ting Huang, Ping He, Martin Skitmore, Xiaochun Luo*

*Corresponding author for this work

Research output: Contribution to journal › Article › Research › peer-review

49 Citations (Scopus)
85 Downloads (Pure)


This study proposes an automated method for manifesting construction activity scenes by image captioning, an approach rooted in computer vision and natural language generation. A linguistic description schema for manifesting the scenes is first developed, and two dedicated image captioning datasets are created for method validation. A general image captioning model architecture is then established by combining an encoder-decoder framework with deep neural networks, followed by three experimental tests involving the selection of model learning strategies and performance evaluation metrics. The method's performance is shown to be comparable with that of state-of-the-art computer vision methods in general. The paper concludes with a discussion of the feasibility of applying the proposed approach in practice at the current technical level.

Original language: English
Article number: 103334
Journal: Automation in Construction
Early online date: 6 Jul 2020
Publication status: Published - Nov 2020
Externally published: Yes

