Blog Post
November 25, 2016
Information Governance in Analytics
By John Desborough
In heavily regulated industries, much of the most interesting data that could feed valuable analytics insights is controlled by local, national or federated regulations. As these regulations and laws have grown more numerous over time, they have come to define and control the key information created and used in these industries.
Regulated data might be generated by applications (e.g. transactional data) or by people (e.g. communications, or know-your-client information in the case of banks). It might be structured or unstructured. Across industries, this data has two things in common: a) it tends to be the most important, most interesting data in the enterprise, and b) it's covered by a growing number of increasingly complex and stringent regulations.
The government regulations that companies in these industries must follow restrict nearly everything about that data: who can access it, when and how it needs to be stored, where it must reside (data sovereignty), and when and how (or even if) it can be modified or deleted. Failure to meet these regulations, or to provide adequate proof of compliance, can result in fines, other penalties or jail time.
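To make these restrictions concrete, here is a minimal sketch of how access, residency, retention and legal hold rules might be expressed as checks in code. Every name in it (GovernancePolicy, Record, can_read, can_delete) is hypothetical and invented for illustration; it does not represent any particular regulation or product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GovernancePolicy:
    """Hypothetical policy attached to one class of regulated data."""
    allowed_roles: frozenset    # who can access the data
    allowed_regions: frozenset  # where the data must reside (data sovereignty)
    retention_days: int         # how long the data must be kept before deletion


@dataclass(frozen=True)
class Record:
    """Hypothetical regulated record with the attributes the policy needs."""
    created: date
    region: str
    legal_hold: bool = False


def can_read(policy, record, role):
    """Allow a read only for a permitted role on data stored in a permitted region."""
    return role in policy.allowed_roles and record.region in policy.allowed_regions


def can_delete(policy, record, today):
    """Allow deletion only once retention has lapsed and no legal hold applies."""
    retention_ends = record.created + timedelta(days=policy.retention_days)
    return not record.legal_hold and today >= retention_ends
```

In practice each check would also need to be logged, since regulators expect proof of compliance and not just the compliant behaviour itself.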
In contrast, most analytics environments have been designed to simplify and improve access to data - to increase both the speed and the volume at which analytics can be performed. This creates a big problem: these environments have not been designed to control data access, data retention or data segregation - all things necessary for managing a regulated data environment in a compliant way.
Companies have tried to balance these competing priorities and still get meaningful insights through a number of partial fixes, including:
- Small, siloed data lakes that provide limited insight into carefully partitioned data. This can fulfill requirements around access control, data sovereignty and data segregation, but falls far short of the full potential of analytics run across large volumes of data from many different sources.
- Archived data duplicated into a separate Hadoop environment. This can meet retention and legal hold requirements by keeping the archive tightly controlled and the analytics environment free of controls, but it results in duplicated data and raises concerns about access control and privacy requirements.
- Hadoop distributions that are beginning to include information governance features such as retention and access control. This can meet some requirements, but it significantly increases management complexity and potentially slows down the analytics environment. It also doesn’t cover jurisdictional issues of where data must reside, and may not meet all data privacy and security requirements.
Organizations require a solution that provides a compliant environment with robust information governance to meet global regulations regarding retention, legal hold, access control, security, privacy, sovereignty, and more. A Hundred Answers works with clients to help them address these issues and develop actionable strategies that deliver business outcomes.
--
John Desborough is a Director, Consulting and Technology Solutions at MNP. He is an accomplished business solutions program manager and business transformation architect with 30+ years in the information and technology consulting domain. John has extensive background in information management and governance with both public and private sector clients on a global scale. Drop John a line to discuss this topic in more detail: [email protected]