Best R libraries for data science

R is a language created to manipulate data. This means that R offers many built-in tools and functions out of the box, such as data frames, vectors, matrices and decision trees, to cover your basic data science and even machine learning needs. Considering this, it might be tempting to […]

What’s Computer Vision?

This week, we are delighted to chat with Augustin Marty, CEO of Deepomatic, a software publisher specializing in computer vision. How does computer vision support us in our daily tasks? Find out more here! Tell us the story of Deepomatic. Deepomatic is a company that grew out of a company called Smile, which […]

How to industrialize a project with the GitHub CI / CD?

This week, Julien Fricou, Data Engineer at Saagie, asked Alain Hélaïli, Principal Solutions Engineer at GitHub, about using the platform. GitHub CI/CD will hold no secrets for you! How do you create a workflow (YAML file, graphical editor)? With GitHub Actions, we decided that the philosophy would be […]

What is overfitting and how to solve it in machine learning?

This article explains the phenomenon of overfitting in data science. It is one of the most recurrent problems in machine learning. We give you some clues to detect it, to overcome it, and to make your predictions with precision. A definition of overfitting: you have probably already experienced, in the age of big data and […]

How to Easily Schedule Jobs with Apache Airflow?

This article is intended for both Airflow beginners and veterans and aims to present the fundamental objects of this technology as well as its interfacing with Saagie’s DataOps platform. We are not going to explain again how to create Directed Acyclic Graphs (commonly called DAGs) or how to schedule them. Indeed, there […]

What is MLOps?

The recent excitement around data science, and big data, has enabled the development of an extremely rich and dynamic ecosystem around the analysis of collected data. Open source tools, which are increasingly easy to use, are enabling many organizations to start analyzing their data. However, the multiplication of data projects and algorithms has also brought […]

What are the keys to launch your data project?

Is it possible to deploy a data project, from scoping to large-scale deployment, in 10 weeks? Let’s take a closer look at the keys to accelerating this type of project, which can sometimes take up to 18 months to generate value. Before starting anything, it is essential to know fairly quickly whether there is a […]

Which technologies for your data projects?

You can easily get lost in the data technology ecosystem. Because the range of data management technologies is very (too?!) rich, many solutions are available to you depending on your needs, data sources, industry, infrastructure, skills and technological situation. This is why we present a review and advice on how to choose your analysis tools. […]

How to Manage Machine Learning Deployment?

In this article, you will learn how to deploy Machine Learning in an Agile way to support your data projects. Here are 5 steps to keep in mind when addressing this kind of project. Machine Learning Deployment Should be Managed as a Project: When we think about Machine Learning deployment, we often think just about the […]

What is Natural Language Processing?

Have you ever wondered how your phone can possibly understand what you are saying? Has this brainless pile of metal and plastic acquired the ability to talk with humans? If you have already spent time playing with Siri, OK Google or Cortana, trying to fool them with some convoluted questions, you got an […]

How to Deploy a Machine Learning Model?

This article invites you on a short tour of how to go from exploration to production when working with Machine Learning models. What are the major stages of an ML model's life cycle? In the last part of the article, we will show an example of architecture based on Docker Compose and hosted in the cloud to deploy your […]

Agile Data Science: the Way to Meet Business Success!

Can methods from Agile software development be applied to Data Science projects? Is it even possible? This is a question we want to explore in this article. To set the scene, let’s consider the following cartoon as an example: In this typical situation, the Data Scientist is excited and focused on improving the predictive power of his models while the […]