Mistake #1 - Do It All By Yourself
It took Uber about 18 months to develop its Data & Analytics platform in order to launch Uber Eats. This illustrates how complex it is to build an end-to-end platform for delivering Big Data projects, no matter how big the company is. For a smaller organization, the main risk is a late ROI: the platform has to be ready before any use case can be addressed, and that delay can threaten the whole project.
Mistake #2 - Use the Shadow IT Approach
This is one of the most common issues. “Shadow IT” means setting up a Big Data / AI project without consulting the IT department. IT is neither informed nor involved, and often ends up blocking the project's deployment. What usually happens is that your project does not comply with IT infrastructure and security criteria and therefore gets shut down. You then have to start all over again.
Mistake #3 - "Bunkerize" Your Data Lake
New regulations such as GDPR have led organizations to be more cautious about the way they use personal data, which is reassuring for individuals on the one hand but can be a drag on a Data & Analytics project on the other. Data lakes are increasingly closed off (access restrictions, personal data protection constraints…) in order to ensure security. This means less incoming and outgoing data and, therefore, fewer use cases. And obviously, no data initiative is possible without data to work on and use cases to serve.
Mistake #4 - Lack Collaboration
As shown by “shadow IT”, departments do not always share information. This is especially true of the IT and Data & Analytics teams. They come from different backgrounds, have their own ways of getting the job done and do not even work with the same tools. The Data & Analytics team promotes agile and test & learn approaches, whereas IT has to follow robust standards and processes for obvious security reasons. In some cases, this can even lead developers to completely rewrite the data scientists' code, which is a huge waste of time.
Mistake #5 - Choose Craft Approaches
The technologies used differ between the experimentation process and the operationalization process. There are so-called Data Science technologies, and technologies that are mostly used during implementation. For instance, Python has advanced modeling libraries (scikit-learn) that Java does not, which makes the following even more complex:
- model reproducibility for the rest of the company
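To make the reproducibility point concrete, here is a minimal sketch of one possible approach: fix the random seed so the experiment can be re-run identically, then export the fitted parameters to a language-neutral format (JSON) that an implementation team working in another language could reload. It uses only the Python standard library rather than scikit-learn, and every name in it (`fit_line`, `make_data`, `SEED`, the JSON field names) is a hypothetical chosen for illustration, not part of any specific platform.

```python
import json
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(params, x):
    """Apply the fitted line to a single input value."""
    return params["slope"] * x + params["intercept"]


def make_data(seed, n=50):
    """Generate synthetic data; a fixed seed makes the run repeatable."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


SEED = 42  # hypothetical seed, recorded alongside the model
xs, ys = make_data(SEED)
params = fit_line(xs, ys)

# Export the seed and parameters in a format any language can parse.
exported = json.dumps({"seed": SEED, "model": "linear", "params": params})

# Another team (or another runtime) reloads the same parameters.
reloaded = json.loads(exported)
```

Because the seed and the parameters travel together, anyone can regenerate the data, refit the model and check that the result matches what was exported. Real models are rarely this simple, so treat this as an illustration of the idea rather than a recipe.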
As you can see, several mistakes can prevent Data Lab projects from succeeding. If you want to find out how our customers were able to avoid these pitfalls with our DataOps platform, feel free to request a demo of our solution.