Data cleaning: problems and current approaches

This paper from the University of Leipzig explains the main data quality problems that data cleaning can correct, then gives an overview of the approaches and tools available for cleansing data.

This file is a PDF (110 KB).

Contents

  • Data cleaning problems
      • Single-source problems
      • Multi-source problems
  • Data cleaning approaches
      • Data analysis
      • Defining data transformations
      • Conflict resolution
  • Tool support
      • Data analysis and reengineering tools
      • Specialized cleaning tools
      • ETL tools

Sources

Rahm, E., & Do, H. H. (n.d.). Data cleaning: Problems and current approaches. University of Leipzig, Germany. Retrieved from http://wwwiti.cs.uni-magdeburg.de/iti_db/lehre/dw/paper/data_cleaning.pdf
