Broken Loads from 2020-01-28 to 2020-01-29 [post-mortem]

Summary

On 2020-01-28 09:00 UTC, we deployed a version of Keboola Connection containing a bug. As a result, loads from transformations to storage were missing our internal _timestamp value. This issue was hard to detect and persisted until 2020-01-29 08:00 UTC. A backfill, completed at 2020-01-31 16:30 UTC, set all missing _timestamp fields to 2020-01-29 00:00:00 UTC. Because the affected tables did not have _timestamp set, jobs that used these tables for incremental loading had no reference for the newest data.
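To illustrate why a missing _timestamp matters for incremental loading, here is a minimal Python sketch. It is not Keboola Connection code: the function name `rows_changed_since`, the row layout, and the assumption that rows without a timestamp are simply skipped are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def rows_changed_since(rows, since):
    """Hypothetical incremental filter: keep rows newer than `since`.

    A row without a _timestamp cannot be compared to `since`,
    so in this sketch it is never selected.
    """
    return [
        row for row in rows
        if row["_timestamp"] is not None and row["_timestamp"] > since
    ]


rows = [
    {"id": 1, "_timestamp": datetime(2020, 1, 27, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "_timestamp": None},  # loaded while the bug was deployed
]
last_run = datetime(2020, 1, 27, 0, 0, tzinfo=timezone.utc)

picked = rows_changed_since(rows, last_run)
picked_ids = [row["id"] for row in picked]
```

In this sketch only row 1 is selected. Row 2 is new data, but without a timestamp the job has nothing to compare against.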

What Happened?

There was an error in our upgrade of the library responsible for loads. An incorrect parameter setting resulted in timestamps not being set during load. This scenario was not covered by our tests and was not caught during our peer review process. As soon as the problem was identified, we redeployed the previous working version of Keboola Connection, which took about 15 minutes. Because the issue affected some customers' data, the backfill was carefully discussed and tested. Unfortunately, we were also impacted by an issue in the third-party build system we use, which prevented us from performing the backfill of the missing timestamps on 2020-01-30. Finally, between 2020-01-31 09:30 UTC and 2020-01-31 16:30 UTC, all impacted projects were backfilled.
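The backfill logic can be pictured with the short Python sketch below. It is only an illustration under stated assumptions: the real backfill ran against storage tables, while this example works on in-memory dictionaries, and the names `backfill_missing_timestamps` and `BACKFILL_VALUE` are hypothetical. The only detail taken from the incident is the fill value, 2020-01-29 00:00:00 UTC.

```python
from datetime import datetime, timezone

# The value used for every missing _timestamp during the backfill.
BACKFILL_VALUE = datetime(2020, 1, 29, 0, 0, 0, tzinfo=timezone.utc)


def backfill_missing_timestamps(rows, value=BACKFILL_VALUE):
    """Return (new_rows, filled_count); rows with a _timestamp stay untouched."""
    filled = 0
    result = []
    for row in rows:
        if row.get("_timestamp") is None:
            # Copy the row so the input data is not mutated.
            row = {**row, "_timestamp": value}
            filled += 1
        result.append(row)
    return result, filled


rows = [
    {"id": 1, "_timestamp": datetime(2020, 1, 27, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "_timestamp": None},  # loaded while the bug was deployed
]
fixed, filled = backfill_missing_timestamps(rows)
```

Rows that already carry a timestamp are left as they are, so running the sketch a second time fills nothing.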

Timetable

  •  2020-01-28 09:00 Version containing a bug deployed
  •  2020-01-29 08:00 Rollback
  •  2020-01-29 Investigation of issue, impact assessment
  •  2020-01-30 Testing of backfill
  •  2020-01-31 09:30 Start data backfill
  •  2020-01-31 16:30 Data backfill done

What Are We Doing About It?

We're extending our software tests to cover more scenarios, including a test of _timestamp presence for all types of load. We're also working on improving our public incident response so that we post more frequent updates.
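As a rough idea of what a _timestamp presence check could look like, here is a minimal Python sketch. It does not reflect our actual test suite: the load types in `LOAD_TYPES`, the `fake_load` stand-in for the load library, and its `set_timestamp` parameter are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical load types; the real set of load types may differ.
LOAD_TYPES = ("full", "incremental")


def fake_load(rows, load_type, set_timestamp=True):
    """Stand-in for a load; set_timestamp=False imitates the bug."""
    now = datetime.now(timezone.utc)
    return [{**row, "_timestamp": now if set_timestamp else None} for row in rows]


def rows_missing_timestamp(rows):
    """Return the ids of rows that have no _timestamp."""
    return [row["id"] for row in rows if row.get("_timestamp") is None]


def check_all_load_types(loader):
    """Run the loader for every load type and report rows without _timestamp."""
    source = [{"id": 1}, {"id": 2}]
    return {
        load_type: rows_missing_timestamp(loader(source, load_type))
        for load_type in LOAD_TYPES
    }


healthy = check_all_load_types(fake_load)
broken = check_all_load_types(
    lambda rows, load_type: fake_load(rows, load_type, set_timestamp=False)
)
```

A test would then assert that the report is empty for every load type. With the imitated bug, every row is reported for every load type.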

Original status of issue: https://status.keboola.com/investigating-problems-with-incremental-lods

If your data was affected by this issue, our backfill is not sufficient for your specific case, and you are not yet in contact with our support, feel free to get in touch. Our professional services team will provide all necessary help.