It wasn’t that long ago that big data was all the rage. However, the amount of data is growing at an alarming rate, to the point where it has overwhelmed many organizations. With so much data being generated, how do you corral it, analyze it, and put it to good use?
“What can you do with the data? You can get it, you can look beyond it, or you can use it to fuel automation,” said Cassie Kozyrkov, chief decision scientist at Google, at the recent Rev3 conference held in New York.
That’s where data science — and data scientists — come into play. In fact, throngs of data scientists attended the Domino Data Lab-sponsored Rev3 conference to gain insight into their roles. For many attendees, Rev3 was one of their first — if not their first — in-person events since the COVID-19 pandemic shut down or moved conferences online.
Data scientist is one of the more desirable careers in the United States, according to Glassdoor. After taking into consideration earning potential, overall job satisfaction, and number of job openings, Glassdoor found data scientist to be the third-best job in America.
What Is Data Science — and What Is the Role of a Data Scientist?
Data science, in a nutshell, is the discipline of making data useful, according to Kozyrkov. However, Kjell Carlsson, head of data science strategy and evangelism at Domino Data Lab, finds defining the field a difficult task.
“When it comes to data science, it’s all over the place in terms of what people are referring to, and I think it’s second only to AI in terms of its nebulousness and the amount of the sheer breadth of things that it can accomplish depending on who you’re talking to,” Carlsson said. “So, it’s a tough topic to cover and tough topic to prepare for.”
“The term ‘data scientist’ means different things to different people,” added Nina Zumel, vice president of the data science practice at Wallaroo, which facilitates the last mile of organizations’ machine learning journey, getting ML into their production environment. “Part of the issue may be that as the term got popular, people in other data analysis-related professions may have started to call themselves ‘data scientists,’ because that’s what the job description asked for. This makes the term quite diffuse.”
For Zumel, who co-authored the book Practical Data Science with R, “A data scientist is someone who can extract useful patterns from data and turn those patterns into reproducible, automatable, data-driven decision processes.”
Toolmakers are stepping up to make data scientists’ jobs easier, Kozyrkov noted. Case in point, she said, is Domino Data Lab, which introduced at Rev3 its Domino 5.2 enterprise MLOps platform and previewed its hybrid Domino Nexus architecture. The company claims that Domino 5.2 will support data scientists by increasing flexibility for data science teams while reducing infrastructure costs and complexity, and Nexus will enable hybrid machine learning operations that run on-premises and across cloud providers, all controlled from a unified interface.
“Real advocates don’t just sell buzzwords; they care about making data scientists more effective, making their own job experience wonderful, awesome,” Kozyrkov said.
“Think about an Olympic skier,” she continued. “You wouldn’t ask an Olympic skier to spend most of their time trudging uphill. You would build a ski lift for that mountain of data science chores so data scientists’ time can be better spent.”
And that’s what companies such as Domino are doing. At Rev3, Domino CEO Nick Elprin told ITPro Today that a guiding principle for his company is accelerating “model velocity” — the rate at which companies can build and deploy machine learning models — and giving data scientists freedom and flexibility to use the resources they need, while giving IT control and security.
Help Wanted: More Data Scientists
Now that organizations have the tools, Kozyrkov asked, what do we need data scientists for? Is there anything more for them to do? The answer, of course, is yes, as evidenced by the number of job openings for data scientists.
Carlsson said his most recent search found about 98,000 data scientists working at Global 2000 companies, which also had more than 60,000 job postings for data scientists.
In an article Carlsson penned for ITPro Today’s sister site InformationWeek, he wrote, “It is no exaggeration that every fast growing organization needs more data scientists. They are the crucial ingredient for turning raw data into innovative new products and services, and data-driven business transformation. … With so many companies competing for data science talent, taking an inclusive strategy to data science isn’t just good for business — and more ethical — it is a necessity. Data is not the strategic resource of the 21st century, data scientists are.”
Zumel agreed that there is a big need for more diversity in the field: “The world is diverse; decisions must be made to serve diverse populations, so the field should reflect that.”
Specialization Is Now the Name of the Data Science Game
What should organizations be looking for in data scientist candidates?
“It’s time to embrace specialization,” Kozyrkov said, noting that data scientists still want to be the everything of data even though such a person is a myth.
Domino Data Lab’s Carlsson agreed. It’s no longer an everything-but-the-kitchen-sink kind of job. It’s rare to find a data scientist who does it all, he said.
“Every data science team that I’ve spoken to resembled The A-Team,” Carlsson said. “Mission Impossible is a better example. ‘For this mission, we recommend this individual because they have a background in makeup and other things and this person because they are a specialist on demolitions, etc.’”
There’s no such thing as a standard data scientist, Carlsson added. “Few people have all the necessary skill sets,” he said. “And those who do are really expensive. They’re hard to keep, and they’re usually not very good at any of those individual components.”
Wallaroo’s Zumel said she has never considered data science to be a one-person job. “Data science projects have always been a collaboration between the business stakeholders, the data analysis specialists (data scientists), IT, and operations,” she said. “It’s always been vital that data scientists have empathy for their colleagues from other functional teams.”
That doesn’t mean data scientists must be experts in the business questions or in IT, for example. However, it does mean they should appreciate the issues in those fields and have communication skills to develop relationships with their teammates, Zumel said.
The Future Role of the Data Scientist
If data scientists are indeed, as Carlsson puts it, a strategic resource of the 21st century, what should we expect in the future?
Thanks to new tools being created to improve their jobs and give them more flexibility, data scientists — who once were spending 80% of their time doing data prep and data engineering — are now able to spend a considerable amount of their time on DevOps, Carlsson said.
Zumel said that when she started her data science career, the work was often about automating an analyst’s results. “Now data science involves a lot more concerns about operating at scale,” she said. “I expect this trend to continue, as more and more businesses start incorporating AI and ML processes into their operations.”
About the author
Rick Dagley is senior editor at ITPro Today, covering IT operations and management, cloud computing, edge computing, software development and IT careers. Previously, he was a longtime editor at PCWeek/eWEEK, with stints at Computer Design and Telecommunications magazines before that.