Data scientists spend lots of time doing stuff they don't enjoy, but they still love their jobs
Data scientists spend a lot of time doing things they don't like, such as sorting out problems with unprocessed information, but they still love their jobs, according to a new survey.
The second annual Data Science report from data enrichment platform CrowdFlower shows that there’s a perceived shortage of data scientists, with 83 percent saying there aren’t enough to go around, up from 79 percent last year.
The results of asking how data scientists spend their time are revealing too. They spend 60 percent of their time acting as "digital janitors", cleaning and organizing data prior to processing. Only nine percent of their time is spent mining for patterns and only four percent building algorithms, the sort of tasks that we think of data scientists performing.
When asked which part of the job they enjoyed least, 57 percent named the data wrangling aspect of cleaning and organizing information. Collecting data sets was cited by 21 percent. The tasks they do the most are therefore the ones they get least enjoyment from.
Yet despite this, data scientists are overwhelmingly happy in their work. When asked to rank how happy they felt in their current position on a simple five-point scale, 35 percent gave it a five and 47 percent a four, meaning that over 80 percent like their jobs.
The survey also asked respondents if they felt they had the right tools to do their jobs. Just 14 percent disagreed, suggesting that enterprises are committed to giving data scientists what they need to succeed.
When asked about the skills that are most in demand, SQL came out top on 56 percent, followed by big data favorite Hadoop on 49 percent, Python on 39 percent and Java on 36 percent.
The report concludes, "As more and more organizations adopt data as a key driver of decision making, the importance of streamlined, well-oiled data science teams is going to remain paramount. But the current status quo probably isn't sustainable. On the one hand, we see a shortage of data scientists while on the other, they’re spending too much time cleaning and munging data. This is time that could be much better served doing predictive analysis and building out machine learning practices".
You can find out more about the report's findings on the CrowdFlower blog.
Photo Credit: Sergey Nivens / Shutterstock