The phrase “Garbage in, garbage out” has been used in reference to computer data since as early as 1957 (according to Wikipedia). Bad data coming in means poor and inaccurate analysis, which in turn leads to misleading reports.
I was taught that there are four aspects to programming and working with data.
1. Data access
2. Data management
3. Data analysis
4. Data presentation
I was also taught that 80% of your time would be spent on the data management part: cleaning the data, merging or joining the data, and translating it, such as going from “LastName, FirstName” to “FirstName LastName”.
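To make the name translation above concrete, here is a minimal Python sketch of that one data management step. The function name `flip_name` and the sample names are made up for illustration, and the sketch assumes each value contains at most one comma; real data would likely need more rules (suffixes, middle names, missing values).

```python
def flip_name(name: str) -> str:
    """Convert 'LastName, FirstName' to 'FirstName LastName'.

    Hypothetical helper for illustration. Values that do not split into
    exactly two non-empty parts are returned with only the outer
    whitespace trimmed, so names already in the target form pass through.
    """
    parts = [part.strip() for part in name.split(",")]
    if len(parts) != 2 or not all(parts):
        return name.strip()
    last, first = parts
    return f"{first} {last}"


# Made-up sample values, including stray whitespace and an already-clean name.
raw_names = ["Smith, John", "  Doe ,Jane ", "Ada Lovelace"]
cleaned = [flip_name(n) for n in raw_names]
print(cleaned)  # ['John Smith', 'Jane Doe', 'Ada Lovelace']
```

Even this tiny case needs decisions about whitespace and malformed input, which hints at why the data management step tends to eat so much time.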
The author of this article says he spent 90-95% of his time on the data management task. To combat that, he describes four strategies that will empower your data analysts to transform data into meaningful improvements.
In my opinion, much of it still boils down to this: the better your data, the less time analysts have to spend on the data management task. So allocate people and time to clean up the data first, so that employees downstream can concentrate on doing superb analytics.