Why are we, at ED, obsessed with Data Quality?

We collect and store loan-level data as per pre-defined data templates. Each data field has a clear explanation of what the field is supposed to contain, its format, and whether it is mandatory or optional. Our repository solution is already capable of rejecting data files that do not adhere to these strict template definitions. If that sounds like a good enough data quality framework, it might be, but not for us. Our data quality management framework was built not because of these rigid template definitions, but despite them. Why? What is the need?

Collecting and disseminating data that complies with the template definitions would, for any repository business, be sufficient. But we are more than a repository platform – we are a market initiative. And as a market initiative, our vision goes beyond collecting and disseminating data.

We want to ensure the data we collect doesn’t merely adhere to template definitions, but more importantly, that it makes sense. Our team proactively looks for contextual errors in the data. Examples include a redeemed loan that is missing a redemption date, or a loan that has been in arrears for two months but is missing an arrears amount. Although these flawed entries would be correct as per template definitions, they wouldn’t make any sense. These errors would make the data effectively useless for investors, rating agencies, regulators and consultants.
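The two contextual checks mentioned above can be sketched in a few lines of Python. This is only an illustration, not our production rule set: the function name `contextual_errors` and the field names `account_status`, `redemption_date`, `months_in_arrears` and `arrears_amount` are hypothetical and do not correspond to any actual template.

```python
from typing import Any, Dict, List


def contextual_errors(loan: Dict[str, Any]) -> List[str]:
    """Return a description of each contextual inconsistency in one loan record.

    All field names used here are hypothetical, chosen for illustration only.
    """
    errors = []

    # A loan marked as redeemed should also carry a redemption date.
    if loan.get("account_status") == "redeemed" and not loan.get("redemption_date"):
        errors.append("redeemed loan is missing a redemption date")

    # A loan reported as being in arrears should also carry an arrears amount.
    months_in_arrears = loan.get("months_in_arrears") or 0
    if months_in_arrears > 0 and loan.get("arrears_amount") is None:
        errors.append("loan in arrears is missing an arrears amount")

    return errors


# Example record: passes a format-only check, but fails the contextual one.
sample = {"account_status": "performing", "months_in_arrears": 2, "arrears_amount": None}
problems = contextual_errors(sample)
```

Every field in the sample record could be individually valid under a template definition. Only by comparing the fields against each other does the inconsistency show up.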

Our emphasis on data quality is at the core of our business. Two-thirds of our staff work on data-quality-related activities every day, be it analysing data, building data quality rules or tools, or interacting with data owners regarding the quality of their data sets and possible corrections.

We are proud that the errors we spot today are a mere fraction of what we used to see just a couple of years ago, even though we have continuously added checks over time to catch more errors. Although we are very proud of what we have achieved, the greater challenge lies ahead of us. As in tennis or any other sport, reaching the top 10 is much, much easier than being, and remaining, no. 1.

And to be and remain no. 1, we must be and remain obsessed with data quality:

For investors to analyse the granular loan by loan data and make better investment decisions,

For rating agencies to increase confidence in their rating models,

For regulators to have a clearer overall picture of the credit market,

For consultants to be able to provide accurate value-added services like LTV indices and cash flows,

And most importantly, for ourselves, to have a sense of always going the extra mile to serve our clients better.

European DataWarehouse is the only centralised data repository in Europe for collecting, validating and distributing detailed, standardised and asset-class-specific loan-level data for Asset-Backed Securities (ABS) and private whole loan portfolios. ED stores loan-level data and corresponding documentation for investors and other market participants.