
The Data Scientist Magazine - Issue 4
CHARTING THE EVOLUTION OF TALENT IN DATA SCIENCE AT THE EVENT HORIZON OF

IN RECENT MONTHS, WE’VE WITNESSED A SEISMIC SHIFT IN ARTIFICIAL INTELLIGENCE.
This transformation, resembling a grand renaissance, has been sparked by large language models (LLMs) like OpenAI’s GPT series. What was once considered simple pattern prediction has now unveiled emergent capabilities that have taken centre stage, revolutionising our conception of AI’s potential. The prospect of achieving Artificial General Intelligence (AGI) has rocketed skyward, setting us on an accelerated path of adaptation and adoption that has left many astounded, eager, and even fearful.
By LIN WANG
As a people leader interested in developing the talent of Data Scientists, I’ve observed a wave of change sweeping across the tech sector. Companies are in full sprint, vying to stay ahead of the technological curve. In the scramble to adapt, however, there’s a blind spot emerging: the crucial element of human potential seems to be getting sidelined.
This brings us to a critical juncture. With the rapid pace of AI evolution, what does the future hold for our Data Scientists? Through the lens of this article, I aim to offer my perspectives on how the roles and responsibilities of Data Scientists may evolve in the coming years. I invite you to join me as we explore this exciting future landscape, teeming with promises and opportunities.
BACK TO THE FUTURE
Let’s take a moment to peel back the layers of Data Science in industry, a world that, at its heart, is dedicated to solving problems and driving tangible outcomes. If you’re peering into this world from the outside, you might imagine a Data Scientist’s day is filled with intellectual battles over complex problems, meditating over the merits of Data Science techniques, and crafting the perfect implementation tactics.
However, the reality can often be far less glamorous and somewhat surprising to those not entrenched in the field. The truth is that Data Scientists often find themselves more like explorers in a vast wilderness, dedicating substantial time to the arduous task of hunting, gathering, and refining the raw materials of their craft: the data itself. They then spend hours coding and troubleshooting to extract the insights before finally weaving those into stories with narratives that non-data savvy stakeholders can understand and act upon.
The advancements in AI might just be the game-changer we need to tackle these less-visible inefficiencies. They equip Data Scientists with powerful tools to harness their core competencies fully. With these tools, Data Scientists will have more time to focus on employing cutting-edge analytical methods to derive actionable insights and address real-world issues. This is the “future” many Data Scientists envisioned when starting their journeys, and we’re journeying back to that future now.
ENRICHING AND EXPANDING DATA SCIENCE CAREERS WITH AI
AI advancements are triggering significant productivity boosts and impact acceleration in Data Science. Let’s delve into a few recent AI-enabled innovations that illustrate this point.
Take, for instance, GitHub’s Copilot. This AI-powered coding aide serves as a steadfast companion for every Data Scientist, offering instant code suggestions and considerably reducing their workload. Imagine the convenience of telling Copilot your coding objective in layman’s terms and having it respond with the necessary subroutines or functions. Of course, sanity checks remain crucial even when using AI. This isn’t science fiction - it’s the reality we are experiencing today. Several similar coding assistants are emerging, including DeepMind’s AlphaDev, which impressively identified sorting algorithms boasting a speed and scalability improvement of up to 20% compared to leading human-designed benchmarks. Such AI-enabled coding assistants empower our Data Scientists to dedicate more time to discovery and problem-solving, thus boosting their efficiency.
Let’s also consider the potential of a tool capable of swiftly skimming through lengthy reports or intricate technical documents, identifying key points to form hypotheses or spotlighting opportunities for system improvement. This is now feasible thanks to AI’s phenomenal prowess in summarising vast bodies of text. This area is rapidly growing, with paid and open-source options becoming available. Notable newcomers include Jasper (formerly Jarvis), a GPT-3 model-based tool adept at tackling generalised summarisation tasks. There’s also Scholarcy, tailored for academic use, including direct PDF ingestion capabilities. Scholarcy appears to operate based on a proprietary algorithm, albeit drawing inspiration from Google’s PageRank algorithm and ‘bottom-up attention’ research. While these tools may overlook nuances requiring deep domain knowledge, their abilities are continually improving. It’s just a matter of time before we have access to embedded summarisation tools for industrial settings capable of meeting requirements for IP capture and incorporating profound industrial knowledge. Such tools will assist Data Scientists in navigating information more promptly and efficiently.
AI’s transformative potential also impacts how insights are communicated and implemented. AI-generated presentations and visuals enable Data Scientists to distill intricate insights into digestible narratives. For instance, Beautiful.ai provides a user-friendly platform for creating vibrant presentations, eliminating the need for meticulous crafting in PowerPoint. Another example is SlidesAI.io, integrated into the Google Docs ecosystem, making visually appealing slides easy to create. Granted, these tools focus more on the aesthetic aspect than the content, but just think about the potential when you pair these capabilities with AI’s text summarisation prowess, as previously mentioned.
Imagine a scenario where Data Scientists can articulate their findings to business stakeholders using the specific lingo or style that encourages understanding, support, and rapid implementation. This approach will undoubtedly expedite the journey from insight discovery to solution implementation.
These AI-enabled tools are becoming more sophisticated and more widely accessible, which is an exciting development. We’re seeing an array of AI-powered tools integrating seamlessly into familiar software like Microsoft’s Office Suite, which now includes built-in AI features. The open-source world is also teeming with groundbreaking innovations, drawing inspiration mainly from Meta’s recently “leaked” LLM, known as LLaMA.
Rumours are starting to circulate that Meta may be looking into offering commercial licenses, which could open the door for companies to integrate AI into their operations natively. This development is exhilarating and signals a future where cutting-edge AI technology is not solely within reach of tech giants but is a shared resource available to all.
Indeed, we are on the brink of a new era. AI is not only helping Data Scientists return to their original mission but also helping them unlock new opportunities. Rather than being confined to the analytical sidelines, Data Scientists are now stepping into strategic roles, spearheading business decision-making processes. AI acts as their navigation system, guiding them through uncharted territories toward a future teeming with promise and potential.
AI-ADAPTIVE DATA SCIENTISTS: SKILLS AND QUALITIES FOR THE NEW ERA
As we navigate this AI revolution, the role of a Data Scientist is undeniably shifting. Elements that were always vital are now spotlighted, while others previously in the shadows are stepping into the limelight. Living in the heart of this transformation, I’d like to discuss the evolving requirements and personal attributes that can help Data Scientists thrive in this AI-empowered era.
MULTIDISCIPLINARY EXPERTISE
Firstly, a broad understanding and expertise across various fields has become crucial to the role of the Data Scientist. It’s no longer enough to be proficient in just one area. Being limited to one domain could be the biggest hurdle moving forward. While industry-specific knowledge can be seen as a domain and holds significant importance for a Data Scientist (due to its role in setting constraints and charting practical implementations), I am explicitly highlighting traditional academic disciplines here. These include fields such as biology, physics, and chemistry, which extend beyond the core disciplines of Data Science, such as statistics, mathematics, and computer science.
Let’s delve into a rather technical example within biology to demonstrate this point: consider the scenario of modelling gene functions. It becomes imperative to understand how genomic repeats factor into the model. When a DNA segment is repeated, it sometimes leads to null functions (often triggering gene silencing, a common organismal mechanism to combat viruses). At other times, it can enhance the function by duplicating essential genes, making the underlying function more robust and diverse. This intricate understanding plays a crucial role when it comes to accurate modelling. For instance, how much weight should we attribute to these repetitive observations in our model? How do we tune the hyperparameters within a deep neural network to capture these complexities? This example underscores the need for a profound understanding of both the specific domain (in this case, biology) and Data Science techniques to excel in our roles.
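To make the weighting question concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from any genomics library: the function name `duplicate_aware_weights`, the `alpha` parameter, and the toy segments are all hypothetical. The idea is that a single tunable exponent decides how strongly repeated observations are down-weighted, and choosing that exponent is exactly where domain knowledge has to come in.

```python
from collections import Counter


def duplicate_aware_weights(segments, alpha=1.0):
    """Return one weight per observation based on how often it is repeated.

    alpha is a hypothetical tuning knob (an assumption of this sketch):
      alpha = 0.0 keeps every copy at full weight, so repeats reinforce.
      alpha = 1.0 makes each distinct segment contribute a total weight
                  of 1, however many times it is repeated.
    """
    counts = Counter(segments)
    return [counts[s] ** -alpha for s in segments]


# Toy, made-up data: the segment "ATTG" is observed three times.
segments = ["ATTG", "ATTG", "ATTG", "GGCA", "TTAC"]
weights = duplicate_aware_weights(segments, alpha=1.0)
```

With `alpha=1.0`, the three copies of the repeated segment share a combined weight of one, while each unique segment keeps a weight of one. Whether that is appropriate depends on whether the repeat silences a gene or strengthens its function, which the code cannot know on its own.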
If my earlier points seemed obvious, I’d like to offer a deeper insight into the hidden significance of profound domain knowledge in the context of the AI revolution. As we delve into the world of Large Language Models (LLMs), a key objective is to align AI-generated recommendations or solutions with the benefits and intentions of human users. This is known as the alignment problem. While we can mitigate the alignment problem to an extent with human feedback and reinforcement learning approaches, these don’t address the underlying issue: we often don’t understand how these recommendations are made and whether they could potentially lead to unforeseen and harmful outcomes. Using my earlier example of DNA duplication, what if an AI model considered all DNA duplications detrimental or useless? How could we be sure that the model wouldn’t make incorrect recommendations based on this assumption?
I’m convinced that a profound understanding of the domain, combined with a thoughtful application of this knowledge when employing AI tools, arms us with the necessary tools to ensure our models are not only interpretable but also capable of making sound decisions. More importantly, they are aligned with our overarching goals. This focus on multidisciplinary expertise will transition from being a ‘nice-to-have’ attribute to an indispensable requirement in the age of AI, underscoring its transformative potential.
EMBRACING HUMILITY AND ADAPTABILITY
The next crucial quality for Data Scientists to hone is a humble readiness to embrace change. This trait complements the knowledge depth of domain experts, a group that includes many of our Data Scientists with advanced degrees. It’s natural to take pride in reaching the zenith of one’s field. However, this pride can sometimes evolve into arrogance, leading to skepticism when novel methods, understandings, or perspectives arise. Given the rapid advancements in AI, these moments of surprise and potential shifts are only set to increase.
I vividly remember my initial skepticism toward ChatGPT and its earlier versions when they first became accessible. My LinkedIn posts from that time reveal this skepticism as I dismissed it as mere “simple pattern recognition.” Only later did I realise I had underestimated its significant emergent abilities. I’ve since openly shared this learning journey. While I’ve never considered myself overly arrogant, this experience was humbling. New breakthroughs can seem like magic at first. Without a willingness to understand and adapt, these innovations will remain misunderstood — like magic, captivating but not taken seriously. This mindset ultimately holds us back, preventing us from realising the full potential of AI.
Following the path of continuous learning and openness to new ideas, we must also overcome our inherent resistance to adopting new methods and altering our established ways of working. Are we fully using AI-based coding assistants like GitHub’s Copilot in our day-to-day Data Science tasks? Have we integrated the innovative sorting algorithm discovered by AlphaDev into our pipelines? Have we considered delving into a new field of knowledge or pursuing another degree? Remaining relevant requires an ongoing commitment to learn and adapt, and Data Scientists find themselves at the epicentre of these new demands. We stand as the orchestrators of this evolution, but we also risk being the most significantly impacted unless we welcome and adopt new mindsets.
BEYOND AI: THE IMPORTANCE OF EMOTIONAL INTELLIGENCE AND HUMAN CONNECTION
An often-overlooked aspect in our field is the significance of Emotional Intelligence (otherwise known as emotional quotient or EQ), particularly in the industry setting. Historically, Data Science has been a highly technical domain where practitioners take pride in resolving complex and challenging problems. The spotlight has rarely been on the necessity of EQ for a Data Scientist, but this needs to shift. Although EQ is not a unique requirement for Data Scientists, it will become an essential prerequisite. This human-centred attribute possesses the greatest resilience against the disruptions brought on by the AI revolution.
EQ is of paramount importance in Data Science for several reasons. Firstly, EQ goes beyond understanding numbers and statistics; it involves grasping the human influences behind these figures. Take the stock market as a prime example. If you base your financial models solely on fundamentals such as profit margins and market share, you will likely underperform in the long run. Why? Because stock prices are primarily driven by human decisions, which can often be irrational. They are swayed by word of mouth, sentiment, or even simple human errors.
A striking instance of this occurred in January 2021 when investors mistakenly bought shares in Signal Advance, a small components manufacturer, which led to an over 5000% increase in its stock price at the time. These buyers were under the false impression that they were investing in Signal, the encrypted messaging service. This mix-up happened following a tweet by Elon Musk encouraging the use of Signal due to privacy concerns about WhatsApp, leading to significant confusion.
This event underscores the crucial role of EQ in a field like Data Science. To be truly effective, Data Scientists must work closely with others to identify information, devise solutions, and understand different perspectives. This requires not just technical skills but also the ability to read others’ thoughts and preferences and empathise with them. Developing EQ will lead to better decision-making and a more substantial business impact.
In conclusion, I often find myself addressing a recurring set of questions from my Data Science team: “how can I get this person to listen to me?”, “where can I find this information?” or “why aren’t they responding to my emails?”. I wish I could offer a ground-breaking revelation as a solution, but the truth is much simpler. Data Science is more than science; technical expertise alone will never suffice, not just in our field but in any role in an increasingly interconnected and collaborative future. Creating and nurturing human connections is vital for our work, and it might be one of the few sanctuaries that allow us to flourish when Artificial General Intelligence (AGI) is fully realised.
As a Data Science lead and people manager at an agriculture company, I often recount one experience to my new hires. I take pride in the time I spent walking through corn fields alongside our commercial teams, breeders, and farmers. While some might wonder why I consider this experience so significant, and others may even think it’s pretentious, the truth is that it’s deeply intertwined with our work. It allows me to witness firsthand how decisions are made and how opinions form, but more importantly, it provides an opportunity to build trust and forge connections with key stakeholders. These relationships are the cornerstones that have defined the success of my career. Therefore, I encourage every Data Scientist to embrace the human factor, for these personal interactions genuinely make the difference.
PEEKING INTO THE SINGULARITY
The rise of AI is ushering Data Science back to its roots, empowering us to do what we are supposed to do, but with remarkable efficiency and precision. It is reshaping our careers in profound ways; we are evolving beyond just being number crunchers to become strategic thought leaders, facilitators of business decision-making, and navigators through unexplored territories. Simultaneously, the AI revolution demands the development of new competencies. A multifaceted approach that combines interdisciplinary training with an empathetic understanding of human connections is now essential. It is as vital as mastering our mathematical and technical skills.

As we stand on the brink of this exciting new era, our vision only extends to the event horizon of Artificial General Intelligence (AGI). What lies beyond it - the Singularity - remains a mystery. It’s the equivalent of peering over the edge of a cliff, unsure of what’s underneath but filled with a sense of anticipation, maybe even a fear of heights.
Nevertheless, let’s approach this precipice with optimism and a willingness to adapt. Let’s harness the power of this AI revolution, guide it towards constructive paths, and increase our chances of creating a future that benefits all of humanity. The future of AI is, after all, a mirror reflecting our collective actions and decisions. Let’s ensure the image that emerges is one we can be proud of.
Based in the USA, LIN WANG is the Data Science Lead for Analytics for Bayer, a global company with core competencies in the life science fields of healthcare and agriculture.
The views and opinions expressed in this article are solely Lin’s and do not reflect his employer’s views, official policy, or position. The information presented is based on Lin’s personal research and understanding and should not be interpreted as definitive advice or recommendations.