The rapid growth in data science has brought with it an explosion in related roles and job titles. Not only that, but as the profession matures, distinctions of seniority and experience are emerging among those in data-related roles. But what's the distinction between a Data Scientist and a Senior Data Scientist? How should Citizen Data Scientists interact with Data Scientists? What role should a Data Science Lead play?
Naturally, in a growing field these roles will continue to evolve over time. But that puts even more onus on organizations to be proactive in defining clear and logical development pathways for their data science functions. There needs to be a natural progression in skill sets, a solid place in the organization for any formalized specializations, and a clear mapping of which tools, platforms, and techniques each role should be able to use.
Our take on a data science development pathway
To that end, we have put together a data science development pathway for some of the more 'generalist' data science roles. It's not intended to be a one-size-fits-all model, as organizations will need to carefully define roles that suit their business. But we do hope it provides a useful starting point, particularly for organizations just starting to grow their data science capability.
Citizen Data Scientist
What is a Citizen Data Scientist?
Identified as a subject-matter expert in their field
Primary job function lies outside the fields of statistics or analytics
Uses statistical and data science techniques in support of their primary role
Understands when to involve, and how best to leverage, dedicated data science resources
What knowledge and skills?
Able to extract insights through data visualization or statistical methods
Understands the data science workflow
Able to frame data science problems and identify necessary elements
Able to identify and make use of suitable data science techniques from existing solutions
Able to identify data limitations or errors
Able to identify and correct bias in results
What tools and platforms?
Preference towards no-code or low-code data science tools
Preference towards automated and pre-scripted methodologies, e.g. automated machine learning (AutoML)
Preference towards being a ‘consumer’ of developed and tested tools and methods, rather than needing to create them
Data Scientist
What is a Data Scientist?
Applies existing data science solutions, and creates new ones, to solve problems at scale
Knows when and how to apply data science techniques based on practical experience gained across a range of problems
Works with Citizen Data Scientists to identify and address gaps in existing solutions
Provides support to Citizen Data Scientists
What knowledge and skills?
Able to work with structured data and many types of unstructured data
Able to leverage a wide range of supervised and unsupervised Machine Learning techniques
Able to script and automate tasks which are part of the data science workflow
Able to build self-learning data science pipelines, which are able to handle new data and problem variations
What tools and platforms?
Comfortable in using programming or scripting languages suitable to applying data science techniques (e.g. Python or R)
Comfortable in leveraging existing open source libraries relevant to data science
Able to work with and provide solutions which can be integrated into toolsets used by Citizen Data Scientists
Senior Data Scientist
What is a Senior Data Scientist?
Builds highly customized data science solutions to solve specific but challenging problems
May have developed a specialization around a particular analytical field or application area, e.g. use of Artificial Intelligence, Natural Language Processing, time series analysis
Provides review and assurance that data science best practices have been followed
Acts as a mentor and guide for Data Scientists
What knowledge and skills?
Able to build detailed data science workflows, which combine numerous techniques to solve complex problems
Understands advanced data science methods, such as Deep Learning and Reinforcement Learning
Able to make appropriate use of big-data and/or scalable compute technologies
What tools and platforms?
Comfortable in using and extending the core functionality of open source libraries relevant to data science
Able to design and create custom tools and interfaces suited to specific problems
Data Science Lead
What is a Data Science Lead?
Identifies data analytic opportunities which are aligned with organizational goals
Plans large-scale data science projects, including scope, timing, and matching of the necessary skill sets
Supervises data science resources in pursuit of project goals
Maintains connections to data science leads from within and outside their own industry
Forms new and cultivates existing relationships with vendors and technology providers
Provides input towards the formation of the business's data analytics strategy
Leads recruiting efforts for data science talent
Creates training and educational pathways for data science resources
What knowledge and skills?
Excellent interpersonal and communication skills
Has a broad understanding of data science tools and techniques which are relevant to their business
Able to take a high-level view of a data science opportunity and recommend a suitable approach or methodology
Expert at communicating complex theories and methodologies to leadership and non-technical members