The Data Scientist of the Future

A question I often get on Quora is: “What will the Data Scientist job of the future entail?”

In this article, I’ll share how I think the Data Science field will evolve and how that will directly affect the required skill set. I’ll also give you insights on how you can prepare today to be in demand in the future.

Data Scientists are a Temporary Patch

Before the Data Science era, there were several fields that all used data but did not really work in harmony:

  • Mathematics (algorithms, linear algebra, calculus, etc.)
  • Statistics (hypothesis testing, central limit theorem, etc.)
  • Information Science (data analysis, data manipulation, etc.)
  • Computer Science (automation, data storage, processing of Big Data, etc.)

We already know that data is growing at an amazing pace, as shown on the chart below.

[Figure: data growth chart. Source: Patrick Cheesman]

When businesses started to realize that they were sitting on a pile of data – a gold mine – they started to seek insights.

They wanted to leverage their data to better understand their customers and make more money. One popular example of this is the Amazon recommendation system:

[Figure: example of Amazon product recommendations]

But who could build such a complex recommendation system? Well, those super smart geeks, The Big Bang Theory kind, who were skilled enough in all four of the fields mentioned above.

Since the skill set needed to complete such a project is multidisciplinary, they couldn’t simply call themselves Statisticians, or Mathematicians, or Programmers. They needed a new title, and Data Science was born.

Data Scientists are a patch to a system that was designed for operations rather than knowledge.

Automating The Data Scientist of The Future

Twenty years ago, packages for linear regression were far less developed than they are today. Work on Python began in 1989 (its first public release came in 1991), and R first appeared in 1993. I haven’t verified this, but I bet most linear regression algorithms (if any were in production) were written from scratch.

Today, a smart 7-year-old kid could run linear regression with 3 lines of code. Of course, they wouldn’t know the assumptions of linear regression, how to interpret the results, or how to tune the model, but still.

The reality, though, is that over time all 3 of those things could be automated. Let AI check whether the assumptions hold, assess the quality of the model, and tune it if necessary. I believe that The Data Scientist of the future will rely on AI to decide which model is right for the data, fit the model, and auto-tune it. The stats/math knowledge required will therefore be much lower.

When we reach that point, I believe that Data Science will split into two distinct categories:

  • Data Science Research
  • Applied Data Science

Data Science Researchers will be a minority. They will be the math/stats, highly theoretical people who create new algorithms, improve and simplify packages, and push the boundaries.

Applied Data Scientists will be the majority. Their role will be similar to a Data Engineer role today, plus the basic statistics of the future, which will include an understanding of the main algorithms and how to interpret their new high-level metrics.

Overall, my conclusion is that automation will close the theoretical and technical gap. The need for those skills in The Data Scientist of the future will decrease quickly.

What about your Career in Data Science?

If the Data Science Research field appeals to you, then you should concentrate on developing highly theoretical and technical skills. If, on the contrary, you would prefer to work for a business, then there are 2 major skills that can help save and improve your career:

  1. Consultancy skill: Even if AI runs a bunch of algorithms by itself, business-oriented people most likely won’t understand the results. Your ability to explain what AI is doing and how it can translate into data-driven decisions in the real world will continue to be in demand. I would argue that it will be even more in demand than today, as business-oriented people will be further away from the original concepts of AI and machine learning.
  2. Business skill: What should we look for in the data? Where should we look for it? What kinds of opportunities are available to the business (pricing, retention, etc.)? AI won’t be able to optimize a business from A to Z, and it will need direction. The Data Scientist of the future who has great business acumen will be in demand.

I truly believe that soft skills and the ability to supervise AI, rather than doing the actual work, will be the future for most Data Scientists.

How can You Prepare Yourself?

Most Data Scientists, as well as aspiring ones, focus today on theoretical and technical skills. This is definitely important in the current job market, but my recommendation is to reserve time and energy for consultancy and business skills as well.

At work, try to spend more time with end-users, practice answering their questions, and adapt to your audience. If you’re working on trivial business (not theoretical or technical) problems, try finding a business opportunity by yourself and exploiting it.

If you’re at school, try explaining linear regression to your friends in Arts or Psychology. When they get it, you have explained it well enough.

Conclusion

The current Data Science field will split into two segments in the future: Data Science Research and Applied Data Science. I believe automation will take over most of the theoretical and technical work of Applied Data Scientists. If you’re in that group or aspire to be, I recommend that you develop 2 skills in parallel to remain in demand in the future: Consultancy and Business skills.
