Data cleaning in market research
Data cleaning is an important aspect of market research and is carried out as part of the analysis once all the data has been collected. When done correctly, it separates the useful data from the unnecessary. Data cleaning improves the quality of the information obtained and makes it easier to analyse. Furthermore, excluding the unnecessary data makes the results easier to understand once the research study has been completed.
The process of data cleaning can be performed automatically by software or manually by the researcher. There are several main points that the researcher will look for when cleaning data. The first is unwanted observations. The second is duplicated observations: findings that occur twice or more but only need to be included once in the overall study. Sorting out irrelevant observations is also key, so as to keep the data set as manageable as possible, with as little unnecessary information as possible.
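As a rough illustration of removing duplicated and irrelevant observations, the following Python sketch works on a small made-up survey. The field names (respondent_id, age, spend, browser), the sample rows and the function clean_responses are all hypothetical, and the sketch assumes that two rows with the same respondent_id are duplicates, which may not hold for every study.

```python
def clean_responses(responses, relevant_fields):
    """Drop repeated respondents and keep only the fields under study.

    Assumption: a repeated respondent_id marks a duplicated observation.
    """
    seen_ids = set()
    cleaned = []
    for row in responses:
        respondent = row["respondent_id"]
        if respondent in seen_ids:
            continue  # duplicated observation: include it only once
        seen_ids.add(respondent)
        # Keep only the relevant fields; everything else is discarded.
        cleaned.append({k: row[k] for k in relevant_fields if k in row})
    return cleaned


# Hypothetical survey data, invented purely for illustration.
survey = [
    {"respondent_id": 1, "age": 34, "spend": 120.0, "browser": "Firefox"},
    {"respondent_id": 2, "age": 51, "spend": 80.0, "browser": "Chrome"},
    {"respondent_id": 1, "age": 34, "spend": 120.0, "browser": "Firefox"},
]

cleaned = clean_responses(survey, ["respondent_id", "age", "spend"])
```

Here the third row repeats the first and is dropped, and the browser field is treated as irrelevant to the study and removed from every remaining row.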
A common method for finding unwanted data is to put all the data into a table or graph, which makes it easier to find anomalous results and outliers. This also means the wanted data is well structured and easier to analyse. The graph or chart might, however, reveal missing data, which the researcher can choose either to leave out or to fill in with an educated guess. Filling in missing values can increase the validity of the research study; however, the researcher must always state when missing data has been filled in.
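The same checks can also be done numerically rather than by eye. The Python sketch below is one possible approach, not a prescribed method: it assumes the common interquartile-range rule (values more than 1.5 times the IQR beyond the quartiles) as the definition of an outlier, and the median as the "educated guess" for a missing value. The function names and the spend figures are hypothetical. A separate list records which values were filled in, so that this can be stated in the study.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Mark values lying more than k * IQR outside the quartiles.

    Assumption: the IQR rule is used here; other definitions exist.
    Missing values (None) are never flagged.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v is not None and (v < low or v > high) for v in values]


def fill_missing_with_median(values):
    """Replace None with the median and record which entries were filled."""
    present = [v for v in values if v is not None]
    guess = statistics.median(present)
    filled = [guess if v is None else v for v in values]
    was_filled = [v is None for v in values]
    return filled, was_filled


# Hypothetical spend figures; None marks a missing answer.
spend = [80.0, 95.0, None, 110.0, 120.0, 105.0, 900.0]

outliers = flag_outliers(spend)
filled, was_filled = fill_missing_with_median(spend)
```

With these figures only the value 900.0 is flagged as an outlier, and the single missing answer is replaced by the median of the observed values, with was_filled marking that position so the substitution can be reported.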