Automated Data Cleaning Via Logic

Research areas

Temporary Supervisor

Professor Rajeev Gore


Data is usually collected as tables, but such tables often contain errors caused by mistyping or by respondents misunderstanding the questions. For example, a record may claim that a 5-year-old is married. The Fellegi-Holt method is a standard way to find the minimal changes required to correct such a record. We have shown that the essence of the Fellegi-Holt method is an old technique from automated deduction called propositional resolution.
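To make the idea of "minimal changes" concrete, here is a small Python sketch of the problem the method addresses. It is a brute-force search, not the Fellegi-Holt algorithm itself and not the resolution-based formulation; the field names, value domains, and rules (DOMAINS, EDITS) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

# Hypothetical field domains, assumed for this illustration only.
DOMAINS = {
    "age_group": ["child", "adult"],
    "marital_status": ["single", "married"],
    "employment": ["none", "employed"],
}

# Each rule lists a forbidden combination of values.
# A record violates a rule when it matches every field listed in it.
EDITS = [
    {"age_group": ["child"], "marital_status": ["married"]},
    {"age_group": ["child"], "employment": ["employed"]},
]


def violates(record, edit):
    """True if the record matches the forbidden combination."""
    return all(record[field] in values for field, values in edit.items())


def is_consistent(record):
    """True if the record violates none of the rules."""
    return not any(violates(record, edit) for edit in EDITS)


def minimal_repairs(record):
    """Return every smallest set of fields that can be changed
    so that the record satisfies all rules (exhaustive search)."""
    fields = sorted(DOMAINS)
    for size in range(len(fields) + 1):
        found = []
        for subset in combinations(fields, size):
            # Try every assignment of values to the chosen fields.
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if is_consistent(candidate):
                    found.append(subset)
                    break
        if found:
            return found
    return []


record = {"age_group": "child", "marital_status": "married", "employment": "employed"}
print(minimal_repairs(record))  # [('age_group',)]
```

The record breaks two rules, yet changing the single field age_group repairs both, whereas changing marital_status alone or employment alone does not. Exhaustive search like this grows exponentially with the number of fields, which is why fast SAT solvers or consequence finders are of interest.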


The project is to implement a prototype of the Fellegi-Holt method of data cleaning using fast SAT solvers or fast consequence finders.


A good background in maths will be useful.


There is a high chance that this could lead to a conference publication and/or a PhD here working on data cleaning via logic.


data cleaning, constraint satisfaction, resolution

Updated:  1 June 2019/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing