Last Week we had a funny case which took us much time to understand.
In our development environment, we like to have the same source data for a long period of time. This way we're familiar with our data and we can expect the numeric results in the higher layers (OLAP, dashboard, etc.). When we have a stable environment and we think that our system gives the correct data, we ask our infrastructure team to refresh our sources, meaning that they bring new production data to the development environment.
The situation that we had last week is that we refreshed our sources and from that point the data didn't make any sense. We asked the infrastructure team if they're sure they refreshed the data and they said they are sure. We checked the system and the data many times and seen that the data is different from what we had but still it didn't make any sense.
After much frustration, one of us made a suggestion to write sql query in this form:
select max(some_date_field) from fact_table
Then we were all amazed. After much time of sisyphean work we realized that our data is not new at all. Moreover - our old data was old too. The maximum date in our main fact table was half a year ago. The reason was that we took our data from a legacy system which runs a job that exports their tables into files, and our infrastructure team imports these files into our source tables. The problem was that no-one ran the job so the files themselves were old, meaning that our refresh activity means nothing.
Conclusion: Never trust your sources. Always run a sql query like mentioned above to check that you have recent data and that the data you think you have is the data you have. You can think of more queries that validates that what you have is what you expect to have. Run them every morning to make sure you're ground is stable. It's only a matter of seconds and it's worth it.
May the source be with you.