CUTOFF: A spatio-temporal imputation method

Lingbing Feng, Gen Nowak, T. J. O'Neill, A. H. Welsh

Research output: Contribution to journal › Article › Research › peer-review

8 Citations (Scopus)

Abstract

Missing values occur frequently in many different statistical applications and need to be dealt with carefully, especially when the data are collected spatio-temporally. We propose a method called CUTOFF imputation that utilizes the spatio-temporal nature of the data to accurately and efficiently impute missing values. The main feature of this method is that the estimate of a missing value is produced by incorporating similar observed temporal information from the value's nearest spatial neighbors. Extensions to this method are also developed to expand the method's ability to accommodate other data generating processes. We develop a cross-validation procedure that optimally chooses parameters for CUTOFF, which can be used by other imputation methods as well. We analyze some rainfall data from 78 gauging stations in the Murray-Darling Basin in Australia using the CUTOFF imputation method and compare its performance to four well-studied competing imputation methods, namely, k-nearest neighbors, singular value decomposition, multiple imputation and random forest. Empirical results show that our method captures the temporal patterns well and is effective at imputing large gaps in the data. Compared to the competing methods, CUTOFF is more accurate and much faster. We analyze further examples to demonstrate CUTOFF's applications to two different data sets and provide extra evidence of its validity and usefulness. We implement a simulation study based on the Murray-Darling Basin data to evaluate the method; the results show that our method performs well in both accuracy and computational efficiency.
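The abstract describes the idea only at a high level: a missing value is estimated from observations that similar neighboring stations recorded at the same time, and the tuning parameters are chosen by cross-validation. The following Python sketch illustrates that general idea under stated assumptions; it is not the published CUTOFF procedure. The use of a Pearson-correlation threshold to decide which stations count as neighbors, the plain mean of donor values, the fallback to the station's own mean, and the names `pearson`, `impute_from_neighbors` and `choose_cutoff` are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation over the time points where both series are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 3:
        return 0.0  # too little overlap to judge similarity (assumed rule)
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def impute_from_neighbors(data, cutoff=0.75):
    """Fill None cells in a time-by-station table (list of rows).

    Assumed rule: a station's neighbors are the other stations whose
    correlation with it is at least `cutoff`; a missing cell becomes the
    mean of the neighbors' observed values at the same time point, or the
    station's own observed mean when no neighbor has a value.
    """
    n_times, n_stations = len(data), len(data[0])
    columns = [[row[s] for row in data] for s in range(n_stations)]
    filled = [list(row) for row in data]  # leave the input untouched
    for s in range(n_stations):
        observed = [v for v in columns[s] if v is not None]
        fallback = sum(observed) / len(observed) if observed else None
        neighbors = [
            j for j in range(n_stations)
            if j != s and pearson(columns[s], columns[j]) >= cutoff
        ]
        for t in range(n_times):
            if data[t][s] is not None:
                continue
            donors = [data[t][j] for j in neighbors if data[t][j] is not None]
            filled[t][s] = sum(donors) / len(donors) if donors else fallback
    return filled


def choose_cutoff(data, candidates, holdout):
    """Pick the candidate threshold with the lowest RMSE on held-out cells.

    `holdout` is a list of (time, station) positions whose values are
    observed in `data`; they are hidden, imputed, and compared to the truth.
    """
    masked = [list(row) for row in data]
    for t, s in holdout:
        masked[t][s] = None
    best, best_err = None, float("inf")
    for c in candidates:
        filled = impute_from_neighbors(masked, c)
        err = sqrt(
            sum((filled[t][s] - data[t][s]) ** 2 for t, s in holdout) / len(holdout)
        )
        if err < best_err:
            best, best_err = c, err
    return best
```

With a table whose rows are time points and whose columns are stations, `impute_from_neighbors(table, cutoff=0.75)` returns a filled copy, and `choose_cutoff(table, [0.5, 0.75, 0.9], holdout)` compares candidate thresholds on cells that were deliberately hidden. The same hold-out comparison could in principle be wrapped around any other imputation function, which is the spirit of the abstract's remark that the cross-validation procedure is usable by other methods.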

Original language: English
Pages (from-to): 3591-3605
Number of pages: 15
Journal: Journal of Hydrology
Volume: 519
Issue number: PD
DOI: 10.1016/j.jhydrol.2014.11.012
Publication status: Published - 27 Nov 2014

Fingerprint

statistical application
method
capture method
basin
decomposition
rainfall
simulation
station
parameter

Cite this

Feng, Lingbing; Nowak, Gen; O'Neill, T. J.; Welsh, A. H. / CUTOFF: A spatio-temporal imputation method. In: Journal of Hydrology. 2014; Vol. 519, No. PD. pp. 3591-3605.
@article{9e27d43aabb546c099637dcdc1191a7f,
title = "CUTOFF: A spatio-temporal imputation method",
author = "Feng, Lingbing and Nowak, Gen and O'Neill, {T. J.} and Welsh, {A. H.}",
year = "2014",
month = "11",
day = "27",
doi = "10.1016/j.jhydrol.2014.11.012",
language = "English",
volume = "519",
pages = "3591--3605",
journal = "Journal of Hydrology",
issn = "0022-1694",
publisher = "Elsevier",
number = "PD",
}

TY - JOUR

T1 - CUTOFF: A spatio-temporal imputation method

AU - Feng, Lingbing

AU - Nowak, Gen

AU - O'Neill, T. J.

AU - Welsh, A. H.

PY - 2014/11/27

Y1 - 2014/11/27

UR - http://www.scopus.com/inward/record.url?scp=84918800733&partnerID=8YFLogxK

U2 - 10.1016/j.jhydrol.2014.11.012

DO - 10.1016/j.jhydrol.2014.11.012

M3 - Article

VL - 519

SP - 3591

EP - 3605

JO - Journal of Hydrology

JF - Journal of Hydrology

SN - 0022-1694

IS - PD

ER -