Tecknoworks Blog

A Microsoft Fabric Journey – Part II

The workflow of a data scientist

Welcome back to our series on Microsoft Fabric, a platform designed to change your approach to data management, data analytics, and business intelligence. Our first article, “A Microsoft Fabric Journey – Part I”, covered the broad capabilities and benefits of Microsoft Fabric. Now it’s time to dive a little deeper. Today, we take you through the data scientist’s workflow and the impact Microsoft Fabric can have on the everyday tasks of these vital contributors to our data-driven world.

Introducing “A Microsoft Fabric Journey – Part II: The workflow of a data scientist” with Maria Carmen Lesan, our BI Engineer and Data Analyst. In this article, Maria shares her perspective on how Microsoft Fabric streamlines collaboration and facilitates a fluid work experience for teams composed of data engineers, data scientists, and data analysts.

Microsoft Fabric, from a data science perspective

Microsoft Fabric facilitates a fluid work experience for teams made up of data engineers, data scientists, and data analysts. From my perspective, the collaboration is packaged as a core-satellite interface: the data is centralized once and then used by people in different roles.

As a data scientist, you often depend on the interaction with the pre-processing pipelines created by data engineers. You need to connect easily to the data set to perform various explorations and analyses, iteratively build models, perform experiments, and extract valuable insights.

Fabric’s workflow in a nutshell

From analyzing and pre-processing data, through creating models and running experiments, to obtaining insights, Microsoft has assembled a seamless experience.

Data discovery and pre-processing are now easier with Notebooks that connect directly to a Lakehouse, where the data can be passed through pipelines that create the input for the machine learning models. Users can make the most of their data by leveraging Apache Spark and Python. Predictions can then be written back through OneLake and consumed by Power BI reports.

As a result, the entire workflow of a data scientist is simplified.

My personal favorite tool is the Data Wrangler. It is often said in the industry that a data scientist spends about 80% of the time cleaning and preparing data and only 20% building models. With the Data Wrangler tool, the time spent on cleaning and on extracting statistics can decrease significantly. Without writing any code, you can use the interface to perform operations such as one-hot encoding, filling missing values, and many more. After you preview and apply the operations in the visual editor, the tool writes the corresponding code into the notebook as a function, and you can continue with the less tedious tasks.
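To give an idea of what such a function can look like, here is a minimal sketch written by hand in pandas. It is not actual Data Wrangler output: the function name clean_data and the columns age and city are hypothetical, chosen only to show the two operations mentioned above (filling missing values and one-hot encoding).

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning function, similar in spirit to generated code."""
    # Work on a copy so the original DataFrame stays untouched
    df = df.copy()
    # Fill missing values in the numeric 'age' column with the column median
    df["age"] = df["age"].fillna(df["age"].median())
    # Fill missing values in the categorical 'city' column with a placeholder
    df["city"] = df["city"].fillna("unknown")
    # One-hot encode 'city' into one 0/1 indicator column per category
    df = pd.get_dummies(df, columns=["city"], dtype=int)
    return df


# Small made-up sample with one missing value in each column
raw = pd.DataFrame(
    {
        "age": [34.0, None, 52.0],
        "city": ["Cluj", "Oslo", None],
    }
)

cleaned = clean_data(raw)
print(cleaned)
```

Keeping the steps in a single function makes it easy to re-run the same cleaning on new data in a notebook, which is the main benefit of having the tool emit code rather than only transforming the data in place.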

This is not all! Many other capabilities are in preview, and I am excited to test the various APIs that Microsoft is offering to make the work of data scientists more exciting and less repetitive.

Conclusion

We trust the experience of our tech-savvy colleagues, and you should too. Don’t allow your data to languish in silos: integrate it, analyze it, and visualize it with the power of AI. Start your journey with Microsoft Fabric today and rely on us as your trusted partner for guidance, implementation, or consultancy. Ready to overcome your data challenges and unlock the full potential of your data? Write to us and we’ll do it together!
