Many industry leaders have moved beyond initial adoption and are now demonstrating and promoting real value from their data science efforts. But at the same time, many are still struggling to take the first step.
It seems that no matter where you turn nowadays, it's hard to avoid headlines about the benefits of data science and advanced analytics. Industry leaders have moved beyond initial adoption and are now demonstrating and promoting real value from their data science efforts: captured market share, more efficient operations, reduced customer churn, better forecasting, and avoided losses, all on the back of better, more data-driven decision-making. But at the same time, many organizations are still struggling to take the first step. They may have caught the taglines, seen the stats, and noted successes from their competitors, but they still aren't sure which initiatives to focus on to start their analytics journey.
To that end, we have outlined a set of early-stage data science initiatives in this post, each built around the people, data, and analytical processes of an organization. There are initiatives here on building a data science team, creating a data science community, collecting new data, and assessing and implementing a suitable data science toolset. These initiatives are meant to be generic enough to remain relevant no matter the size or domain of the organization, yet detailed enough to fuel some thought and help prompt organizations that are taking the first steps on their analytics journey.
Build a team of data science professionals
Plan for and build a data analytics team that can support the organization’s current and future needs.
Communicate the value of data
Ensure the organization understands the value of data, and that practices are followed to collect, store and retain current sources.
Adopt a standard set of tools for data science
Whilst flexibility is key for many problems, there is real benefit in aligning on and supporting a standard set of analytical tools.
Create and support a data science community
Data scientists shouldn’t be the only ones exposed to and making use of data analytics. Create a community to support Citizen Data Scientists.
Identify and secure new sources of data
Position the organization to capture new and relevant data, not just for current opportunities but for future ones as well.
Formalize and promote a data science workflow
Establish and promote a data science workflow which is relevant for your organization, including its opportunities, tools and people.
Create a broad data analytics awareness program
Data analytics concepts and fundamentals should be shared across the organization. Understanding builds capability and trust.
Consolidate data into a Data Warehouse/Lake
End data silos, and bring the organization’s data together into consolidated data warehouses or a data lake.
Identify and tackle data science use cases
Work with functional areas to identify, prioritize and scope data science opportunities.
We hope you find this to be a valuable resource. And please don't hesitate to reach out to us here at Datakick Analytics for any help in formulating, planning or executing your data science strategy.
PEOPLE
Build a team of data science professionals
Plan for and build a data analytics team that can support the organization’s current and future needs.
- Define the roles of the team: Define roles for analytical talent. This may simply include generalist data scientist roles, but may also include specialist roles based on the type, scale and complexity of problems the organization faces.
- Set career maps and targets for growth: Establish a career map and set clear targets for the growth of data scientists. Ensure targets align with the goals of the organization, not merely with the individual’s skill in using the latest and most complex data science techniques.
- Attract and retain talent: Work to attract, but also retain talent. Quality data science resources are scarce. However, data scientists thrive on solving challenging problems and want to be part of organizations which have a strong data analytics culture, and where their work is valued and used.
Create and support a data science community
Data scientists shouldn’t be the only ones exposed to and making use of data analytics. Create a community to support Citizen Data Scientists.
- Create data science community platforms: Build the foundations of your data science community on solid platforms for sharing and collaboration. The community is not just for the data scientists, but for the wider organization.
- Create community focus areas and designate champions: Define a set of community focus areas and designate community champions for each. Areas may be based on functional need, or by broad data science application areas within your business.
- Host community activities and support opportunities: Host activities and support analytical opportunities. These initiatives will help expose data science use cases right across the organization and go a long way towards cultivating the organization’s data-driven culture.
Create a broad data analytics awareness program
Data analytics concepts and fundamentals should be shared across the organization. Understanding builds capability and trust.
- Determine current analytics awareness: Determine the current state of analytics awareness, as well as future requirements across the organization. This may include gathering feedback from functional leads to understand how they have used data analytics so far and how they intend to use it.
- Create a suitable data analytics training curriculum: Create a suitable curriculum for domain experts, leaders, and other non-technical staff. The curriculum should account for their current awareness, domain area, anticipated challenges, required skills, etc.
- Set and monitor targets for training: Work with functional leads to set training targets at the functional level, and have supervisors and line-managers assign training and set targets at the individual level.
DATA
Communicate the value of data
Ensure the organization understands the value of data, and that practices are followed to collect, store and retain current sources.
- Promote the role of data: The success of any data science effort depends heavily on the quality and quantity of relevant data. Make sure this is known, and start by clearly promoting the role and importance data plays in building data science solutions.
- Make sure your data is being retained: You may not be leveraging the data in your current state, but that doesn't mean that you won't in the future. So make sure organizational members aren't throwing out, giving away, or missing opportunities to collect data.
- Clean and organize your data: Ensure everyone in the organization who is dealing with data is employing good data practices. Relate common sources of data, correct data quality issues and errors, migrate data from flat files to proper data management systems, and ensure data is being appropriately catalogued and documented for future use.
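As one illustration of the "clean and organize" step above, here is a minimal Python sketch that moves a flat file into a small database while tidying it up along the way. The file contents, column names, and the `customers` table are all hypothetical, and SQLite stands in for whatever data management system the organization actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export; in practice this would be read from disk.
RAW_CSV = """customer_id,region,monthly_spend
101, North ,250.00
102,south,
101, North ,250.00
103,East,99.5
"""

def load_clean_rows(text):
    """Parse CSV text, normalise fields, and drop exact duplicate rows."""
    seen = set()
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        spend = rec["monthly_spend"].strip()
        row = (
            int(rec["customer_id"]),
            rec["region"].strip().title(),     # fix stray spaces and casing
            float(spend) if spend else None,   # keep missing values as NULL
        )
        if row not in seen:
            seen.add(row)
            rows.append(row)
    return rows

def migrate(conn, rows):
    """Create a typed table (assumed schema) and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, "
        "region TEXT NOT NULL, "
        "monthly_spend REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database for the example
rows = load_clean_rows(RAW_CSV)
migrate(conn, rows)
```

The same pattern (normalise, de-duplicate, load into a typed table) applies whichever database is chosen; documenting the table and its columns in a data catalogue would be the natural next step.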
Identify and secure new sources of data
Position the organization to capture new and relevant data, not just for current opportunities but for future ones as well.
- Be forward thinking on future analytics efforts: The variety, complexity and number of data science problems the organization faces will constantly grow. So it's important to identify relevant data sources not just for the problems you are faced with now, but for those you will face in the future.
- Treat key decisions and judgements as data: Make sure relevant observations, judgements and decisions made by domain experts are being captured. These data points are typically more difficult to capture if not already recorded as part of an existing process. But they can be extremely valuable.
- Implement new processes and technology if necessary: The organization may need to consider new processes or technology to capture data. Robotic process automation, for example, can be a great means of collecting certain data, not just of automating processes.
Consolidate data into a Data Warehouse/Lake
End data silos, and bring the organization’s data together into consolidated data warehouses or a data lake.
- Assess data storage options for the organization: There are many vendors offering data warehouse/lake solutions, and the best solution isn't going to depend on just one factor. Think through the types of data you regularly work with, whether you need an on-premise or a cloud solution, and whether you face strict information security requirements.
- Identify data sources which can be consolidated: Traditionally, data is siloed across the organization, whether that's by business unit, function or team. So work to identify data sources which should be brought into a consolidated lake for the organization.
- Design storage systems with a view of how you will use the data: Take a proactive approach to designing your data lake according to how you intend to use the data. Data volume, type, frequency and speed of access all come into play here.
ANALYTICS
Adopt a standard set of tools for data science
Whilst flexibility is key for many problems, there is real benefit in aligning on and supporting a standard set of analytical tools.
- Assess data science tools for your organization: Consider whether the data science team should focus on developing custom solutions on top of open-source frameworks and tools, or whether they would be better served by adopting a more fully featured proprietary platform.
- Provide a solution for model deployment and monitoring: Solutions typically need to be deployed in order to provide real value. And while a proprietary data science platform may provide a means to automate deployment tasks, custom solutions will need to take a much more proactive approach.
- Implement, support, train and set targets for use: The organization’s data science toolset will no doubt be used by more than just the data scientists. Citizen Data Scientists and domain experts will all benefit from using parts of the data science toolset.
Formalize and promote a data science workflow
Establish and promote a data science workflow which is relevant for your organization, including its opportunities, tools and people.
- Formalize a workflow: Formalize and promote the critical steps and considerations to be made when tackling a targeted data science opportunity. A data science workflow can be a great tool for training, but also for ensuring quality and consistency of approach.
- Map elements to the workflow: There are a variety of roles and different sets of technology involved in tackling a data science problem. So, it’s important to map the roles and responsibilities for members, as well as key data, tooling and technology, to appropriate steps of the workflow.
- Document workflows which demonstrate best practices: Document workflow examples which demonstrate best practices. Use examples where the organization has successfully applied data science to solve real-world business problems. Include data sources, use of tools, custom architectures etc.
Identify and tackle data science use cases
Work with functional areas to identify, prioritize and scope data science opportunities.
- A framework for data science opportunities: Develop a framework for identifying and prioritizing data science opportunities. Specify the elements of a successful initiative, such as adequacy of data and the ability to integrate into existing processes.
- Expose the framework and identify opportunities: Expose the framework, and work with functional leads and domain experts to identify opportunities. They may be top-line or bottom-line driven. The most valuable use-case options will no doubt come from those who work closest to the problem.
- Prioritize, scope and tackle opportunities: Prioritize opportunities using the framework according to those which add the most business value and those which can be addressed with available resources. Then, form scrums to tackle opportunities using the data science workflow, but in an agile way.
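A prioritization framework like the one described above can start out as a simple weighted score. The Python sketch below is one possible shape, not a prescribed method: the criteria, the 1 to 5 scales, the weights, and the example opportunities are all illustrative assumptions that each organization would replace with its own.

```python
from dataclasses import dataclass

@dataclass
class Opportunity:
    name: str
    business_value: int        # 1 (low) to 5 (high), assumed scale
    data_adequacy: int         # 1 to 5: is enough relevant data available?
    ease_of_integration: int   # 1 to 5: fit with existing processes
    resources_available: bool  # can it be staffed with current resources?

# Hypothetical weights; they sum to 1 so scores stay on the 1 to 5 scale.
WEIGHTS = {
    "business_value": 0.5,
    "data_adequacy": 0.3,
    "ease_of_integration": 0.2,
}

def score(opp):
    """Weighted sum of the framework criteria for one opportunity."""
    return sum(getattr(opp, field) * weight for field, weight in WEIGHTS.items())

def prioritize(opps):
    """Keep opportunities that can be resourced now, highest score first."""
    feasible = [o for o in opps if o.resources_available]
    return sorted(feasible, key=score, reverse=True)

# Illustrative backlog gathered from functional leads.
backlog = [
    Opportunity("Churn prediction", 5, 4, 3, True),
    Opportunity("Demand forecasting", 4, 2, 4, True),
    Opportunity("Image-based inspection", 5, 1, 2, False),
]
ranked = prioritize(backlog)
for opp in ranked:
    print(f"{opp.name}: {score(opp):.1f}")
```

Keeping the weights in one visible place makes it easy for functional leads to challenge and adjust them, which matters more than the exact numbers chosen here.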