Developing a Big Data Strategy
If you were the Chief Data Officer in your current organization and were assigned with the task to build a strategy that leverages Data Analytics, what would the strategy look like? In case you work for yourself, you can answer this question for a client organization.
Develop a Data Strategy for your current organization
The organization I currently work in struggles to balance sharing its data with protecting it. Various systems within the company generate large amounts of metric and logging data. This data is extremely useful for detecting, and possibly predicting, the failure of components that are critical to supporting business functions. However, gaining proper insights from this data requires sharing it between teams within the company. As the company deals with privacy-sensitive information, company policy is to apply security measures that depend on the sensitivity of the exposed data. These measures range from none (public data), through data masking, to restricting access to specific teams or even individuals. As there is no clear policy for classifying the risks of exposing metric data, the risk teams tend by default to give metric collection systems high sensitivity classifications. Such classifications require teams to implement a strict set of security measures, which effectively prevents the sharing of metric and logging data. This exposes an additional issue: teams are inexperienced in acquiring data and sharing it with other teams.
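To make the tiered measures described above concrete, the following is a minimal sketch of how a per-field classification could decide between no protection, masking and team-restricted access. The level names, the `mask` and `share` helpers and the example fields are all hypothetical and not taken from the organization's actual policy.

```python
from enum import Enum


class Sensitivity(Enum):
    # Hypothetical classification levels, for illustration only.
    PUBLIC = 1      # no security measure
    INTERNAL = 2    # value is masked before sharing
    RESTRICTED = 3  # value is visible to allowed teams only


def mask(value: str, visible: int = 2) -> str:
    """Keep the last `visible` characters and replace the rest with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def share(record: dict, levels: dict, reader_team: str, allowed_teams: set) -> dict:
    """Return the view of `record` that `reader_team` is allowed to see."""
    view = {}
    for field, value in record.items():
        # Unclassified fields fall back to the strictest level, which is
        # the default behaviour the text identifies as the bottleneck.
        level = levels.get(field, Sensitivity.RESTRICTED)
        if level is Sensitivity.PUBLIC:
            view[field] = value
        elif level is Sensitivity.INTERNAL:
            view[field] = mask(str(value))
        elif reader_team in allowed_teams:
            view[field] = value
    return view


record = {"host": "db-01", "latency_ms": 12, "customer_id": "C12345"}
levels = {
    "host": Sensitivity.PUBLIC,
    "latency_ms": Sensitivity.PUBLIC,
    "customer_id": Sensitivity.INTERNAL,
}
shared_view = share(record, levels, reader_team="ops", allowed_teams=set())
```

In this sketch, explicitly classifying the metric fields as public is what makes them shareable; without an entry in `levels`, every field would be withheld from teams that are not on the allowed list.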
Improvement Proposals
As the Chief Data Officer I would enable teams to share their data with other teams, so that they can gain insight from each other’s data. Although it is a small step towards a more data-centric organization, the transformations required to build an insights-driven organization will also be beneficial when the company is ready to expand its data-driven capabilities in other areas. To develop this capability successfully, two key concerns have to be addressed: the skill gap of employees regarding data-driven practices, and the privacy and security regulations within the organization. Alharthi et al. (2017) propose that organizations collaborate with educational institutions to address the skill gap in data engineering and data science. In my experience, people cannot come to understand and successfully apply data-driven practices through training alone. A better way is to attract experts who can guide and drive the teams in the right direction through a period of intensive training and cooperation, effectively building the practice together. Such an approach has two benefits. First, people get to work with real experts and gain insight into the thought processes and paradigms of data science. Second, by building the practice themselves, the teams acquire a greater sense of ownership of the implemented solutions. To address the issue of privacy we are required to follow the law, and Alharthi et al. (2017) additionally propose incorporating “general best practices”. This is interesting, as we are dealing with possibilities created by new technology. These possibilities did not exist before, so it can be difficult or even impossible to find “best practices”. It would be better to give teams the freedom to develop their data science practices, while at the same time making them responsible for the risks that might emerge from their progress.
This approach prevents teams from being held back by defensive departments such as risk. While a risk department’s primary concern is mitigating risk, a team is better able to find the balance between risk and the development of new solutions.
Developing a data strategy for the Dutch government
You have been asked to become the Minister of Data Analytics in the Netherlands. What would be the key interventions in Dutch government or society that you would want to implement in the coming 4 years, and why these?
As the Minister of Data Analytics in the Netherlands, I would focus on three main areas: education, open data and governance. Although the area of data analytics has gained traction only recently, it is important for the Netherlands as a knowledge economy to stay ahead in such developments. Education is a key driver, enabling people to come up with their own ideas and giving them the knowledge to realize those ideas. Data analytics can only exist if there is data, so it is important to be able to expose data easily. At the same time, data must be exposed in such a way that it does not cause harm to individuals or the public, so governance must be taken into account.
Education
Data literacy has to be taught to the public before any specialized education takes place. It is therefore important to start mandatory data programs in high schools. Data being the next technological frontier, people should be educated to think critically not only about its risks, but also about its possibilities.
Open Data
Open data initiatives have existed in the government for some time. However, the datasets exposed at https://data.overheid.nl/data/dataset come in many different formats and from many different suppliers. To enable people to consume this data, it would be better to have a standardized interface based on, for example, REST and JSON. The IT organization of the government could develop a standard, possibly open source, web-service-based framework to be used by any government institution that wishes to expose its data. Standardizing on web services for external data consumers has the additional benefit that the same services can be used by “internal” data consumers, for example for data sharing between government institutions.
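As an illustration of what such a standardized interface could return, the sketch below wraps any dataset in one common JSON envelope with paging. The function name `dataset_response`, the field names and the example values are assumptions made for this sketch, not an existing government standard.

```python
import json


def dataset_response(dataset_id, publisher, records, page=1, page_size=100):
    """Wrap one page of `records` in a uniform, JSON-serializable envelope."""
    start = (page - 1) * page_size
    return {
        "dataset": dataset_id,    # hypothetical identifier of the dataset
        "publisher": publisher,   # institution that exposes the data
        "page": page,
        "page_size": page_size,
        "total": len(records),    # total number of records in the dataset
        "items": records[start:start + page_size],
    }


# Example with made-up records: every supplier would return the same shape.
example_records = [{"municipality": "A", "population": 1000},
                   {"municipality": "B", "population": 2500}]
body = json.dumps(dataset_response("example-dataset", "example-ministry",
                                   example_records, page=1, page_size=1))
```

Because every institution would return the same envelope, a consumer only needs to learn one response shape, whether the request comes from a citizen or from another government body.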
Governance
In the past, several initiatives regarding data and public health have been halted by public outcry. Of course, such initiatives should be well thought out and not taken lightly. However, data science applied to public (health) data can lead to extremely valuable insights for society as a whole. Because of the potential value, it is quite likely that this kind of data is already being harvested and analyzed by companies. Collecting and exposing such data through a government body would ensure that it is kept under strict control, while also making it available for research in, for example, anonymized form. The Dutch Data Protection Authority (DPA) exists solely to protect personal data and ensure that it is processed lawfully. A good approach would be to heavily involve this organization in the development of the “open data web framework” and in the implementation efforts that follow.
After four years
By focusing on education, open data and governance, in four years we might have standardized, publicly accessible data and people who understand it and know how to work with it. We will have built a sustainable environment in which data is turned into insights that can drive efficiency in government and strengthen the economic position of the Netherlands. I expect that future efforts will naturally lead to novel applications.
Building a Data Analytics Culture
What would you do specifically to build a Data Analytics culture in your Ministry, that will stimulate decisions to be based on data and analysis?
Decision-making processes in ministries affect the lives of many people in the country. It is therefore very important that these decisions are made on a basis which, at the least, attempts to be free of bias. Unfortunately, ministries in particular are subject to shifting political views, cultural backgrounds and spatial bias. These issues have various origins. The most obvious is the change of leadership that follows the elections every four years. Cultural bias emerges within the ministries themselves, as there is often a difference in education and social class between the people working in the ministries and the people affected by their decisions. Finally, spatial bias is introduced by the fact that most ministries are based in The Hague, while their decisions also have effects in rural areas such as Friesland or Groningen.
Objectives
To build a data-analytics culture in ministries, these biases need to be taken into account and challenged. Political biases are the hardest to battle, as doing so requires a change in the mindset and behavior of the leadership, who are (or used to be) politicians. Government leadership needs to be made aware of its biases and taught to observe without judgment, keep an open mind and ask questions.
The cultural bias within government can be countered by hiring people from different social and cultural backgrounds into the departments. Of course, the work done in ministries requires a certain level of education, which means it is not open to people of every class. However, by being aware of this bias, government officials may find other methods of engaging with the citizens who are affected by their policy.
Spatial bias can, like cultural bias, be countered by diversifying the origins of the ministry’s personnel. However, it should be taken into account that people who work in a specific location for a longer time might “disconnect” from the places that are affected by their policy. In the age of the internet it should not be much of a problem to enable personnel to work remotely. By working remotely, personnel stay connected with the issues in remote regions and, at the same time, are able to use this knowledge to deliver the right information to their department.
Other factors which need to be taken into account when transforming ministries into data-driven organizations are technical and skills related.
The technical platform should enable personnel to experiment and uncover potential bias. Existing initiatives could be extended or migrated to a new platform which fits the requirements of all involved parties.
Besides being trained to gain insight into their own biases, personnel can be trained to develop their own models and test them on the technical platform.
Expected Outcomes
By addressing the issues of bias, technical platforms and skills, I expect that data-driven decision making will grow within government. Of course, other pitfalls might surface during the implementation, but focusing on these key issues will give governments a good start in their transformation.
Reflection on organizational efforts to become data-driven
Please reflect on your or your own organization’s efforts to become more data-driven, and build differentiating and more sustainable Data Analytics capabilities. What would you have done differently now, given your improved understanding of driving successful Big Data transformations?
Successful transformations require adjustments to multiple aspects of the organization: policy, culture, technology and skills. In the organization where I am consulting at the moment, there seems to be a strong focus on the culture and skills aspects. However, the policies and technical capabilities are underdeveloped. Even for culture and skills, it seems hard to nurture truly critical mindsets, as it is common in “high performing” organizations to present reality as a lot brighter than it actually is. Teams get basic training so that they are able to work with data. However, since a good platform is missing, they do not have the opportunity to work with real data. Obviously, real insights require real and proper data. This is of course not true for all teams, as the data teams themselves do get access to some data. But that data is highly generalized and is hardly used for anything other than supporting generalized opinions.
As long as leadership is not critical of its own capabilities, and of those of its teams and systems, “building a data mindset” within the organization does not go any further than having an army of engineers who can write “hello world” in Python. Beyond that, these engineers:
- lack the critical mindset to apply their skills correctly and be critical of their models
- lack the tools to apply their models to production data
- will still be turned down by a risk department which is not aligned with an offensive data-strategy
Looking at these issues, I would still keep parts of the current strategy, but extend it beyond the research (data science) teams, especially to the teams who build and run business-critical applications. To enable teams to have real-world impact, some components need to be in place:
- A clear open data policy (risk)
- An open data sharing platform (technical)
- Applied experimentation on real-world data (training)
- An open platform where teams can discuss and challenge findings freely (without pressure from management for “delivering results”)
Besides improvements in the organizational strategy, I would also take more steps towards creating a platform that can actually create value from data. With the investment power of a large organization it should be possible to develop the data strategy in multiple directions at the same time. At the moment there seems to be a strong focus on driving the business through insights. By involving teams in the decision-making process, it should be possible for them to experiment with insights in the real environment (A/B testing). Such experimentation can be achieved by implementing (human-controlled) triggers in the business process, so that it is easy to test hypotheses. The results of these experiments can be used not only to tune the models that drive the business processes, but also to provide valuable information on which processes can actually be replaced by prediction models.
To summarize, I would recommend that the organization not only focus on training and culture, but also put a strong focus on developing the technical platform and an open policy. In addition, I would encourage the teams to do more practical experimentation and to think about how they can safely experiment and apply their theories in the real environment.
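The human-controlled trigger described above could be sketched roughly as follows: a switch that a person sets decides whether an experiment runs at all, and a hash of the case identifier splits cases between the existing process and the variant under test. The function names, the `enabled` switch and the example outcomes are hypothetical and only illustrate the idea.

```python
import hashlib


def assign_variant(case_id: str, experiment: str, enabled: bool,
                   treatment_share: float = 0.5) -> str:
    """Assign a case to 'control' or 'treatment' in a repeatable way."""
    if not enabled:
        # The human-controlled trigger is off: everything follows the
        # existing business process.
        return "control"
    digest = hashlib.sha256(f"{experiment}:{case_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def success_rates(outcomes):
    """Compute the success rate per variant from (variant, succeeded) pairs."""
    totals = {}
    for variant, succeeded in outcomes:
        count, successes = totals.get(variant, (0, 0))
        totals[variant] = (count + 1, successes + int(succeeded))
    return {v: successes / count for v, (count, successes) in totals.items()}


# Made-up outcomes, only to show the shape of the comparison.
rates = success_rates([("control", True), ("control", False),
                       ("treatment", True)])
```

Hashing the case identifier keeps the assignment stable across repeated calls, so a case does not switch variants halfway through a process. A real experiment would also need enough cases and a significance test before any conclusion is drawn from the rates.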
References
- Alharthi, A., Krotov, V., & Bowman, M. (2017). Addressing barriers to big data. Business Horizons, 60(3), 285-292.
- Schultz, P. W., Milfont, T. L., Chance, R. C., Tronu, G., Luís, S., Ando, K., … & Gouveia, V. V. (2014). Cross-cultural evidence for spatial bias in beliefs about the severity of environmental problems. Environment and Behavior, 46(3), 267-302.