Abstract

Data quality issues are widely recognized in IoT data and hinder downstream applications. Improving IoT data quality, however, is particularly challenging given the distinctive features of IoT data, such as pervasive noise, unaligned timestamps, consecutive errors, misplaced columns, and correlated errors. In this tutorial, we review state-of-the-art techniques for IoT data quality management. In particular, we discuss how dedicated approaches improve various data quality dimensions, including validity, completeness, and consistency. We further highlight recent advances in deep learning techniques for IoT data quality. Finally, we point out open problems in IoT data quality management, such as benchmarks and the interpretation of data quality issues.

Authors

References

Validity

Constraint Validity

Statistical Validity

Completeness

Constraint-based Imputation

Statistical Model

Deep Learning-based Imputation

Consistency

Pattern-based Detection

Statistical Model

Deep Learning-based Detection