Why You Should Use Mermaid to Create Diagrams as a Data Scientist
Use Python code as your Secret Weapon
python
Author
Jesus LM
Published
Mar, 2024
Abstract
This article shows how to create diagrams with code inside Jupyter, so you no longer need external diagramming tools with graphical user interfaces (GUIs) such as draw.io or Lucidchart.
As a data scientist, you’re constantly explaining complex concepts and relationships. Jupyter notebooks are great for this, but text and code can only go so far. Here’s where Mermaid comes in:
Simple and Clear
Mermaid uses a straightforward text-based syntax to create flowcharts, sequence diagrams, and entity relationship models. This makes it easier to learn and use than full-featured diagramming software.
Embedded in Jupyter
The beauty of Mermaid is that it works seamlessly within Jupyter notebooks. You can write your code, analysis, and explanations, and then include your Mermaid diagrams right alongside them. This creates a cohesive narrative for your audience.
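One common way to get a Mermaid diagram rendered from a Python cell is sketched below. It assumes the public mermaid.ink rendering service, which this article does not mention: the diagram text is base64-encoded into a URL, and the service returns an image. The function name `mermaid_image_url` is hypothetical and chosen only for illustration.

```python
import base64

# Assumption: the third-party mermaid.ink service, which renders a diagram
# from base64-encoded Mermaid text placed in the URL path.
MERMAID_INK_BASE = "https://mermaid.ink/img/"


def mermaid_image_url(diagram: str) -> str:
    """Return a URL that should render `diagram` as an image (hypothetical helper)."""
    encoded = base64.urlsafe_b64encode(diagram.encode("utf-8")).decode("ascii")
    return MERMAID_INK_BASE + encoded


diagram = """graph LR
A[Define Problem] --> B[(Data Collection)]
B --> C([Exploratory Data Analysis])
"""

url = mermaid_image_url(diagram)
print(url)
```

In a notebook cell, passing this URL to `IPython.display.Image(url=url)` should display the diagram next to your analysis. Keep in mind that this approach needs network access and sends the diagram text to an external service, so it may not suit confidential work.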
Enhanced Communication
Visual representations improve understanding and retention of information. By incorporating Mermaid diagrams, you can effectively communicate data flows, machine learning models, and other complex ideas to both technical and non-technical audiences.
Interactive Magic
With some additional libraries, you can make your Mermaid diagrams interactive within Jupyter. This allows viewers to click on elements and see additional details or code.
Focus on Data Science
Mermaid keeps the focus on your data science work. You don’t need to spend hours designing fancy diagrams in separate software.
In summary, Mermaid empowers data scientists to create clear and concise diagrams directly within their Jupyter notebooks. This improves communication, enhances understanding, and keeps the focus on data analysis, all without leaving the coding environment.
Data Science Cycle Diagram using Mermaid
---
title: Data Science Flow chart
---
graph LR
%% Create a flow chart about data science process
A[Define Problem] --> B[(Data Collection)]
B --> C([Exploratory Data Analysis])
C --> D{Data Cleaning & Preprocessing}
D --> E[Feature Engineering]
E --> F[Model Building & Training]
F --> G([Model Evaluation])
G --> H[Deployment]
G --> I[Refine Model]
I --> D
Sequence Diagram Arrow Styles using Mermaid
sequenceDiagram
ParticipantA->ParticipantB: Line
ParticipantB-->ParticipantC: Dotted line
ParticipantA->>ParticipantB: Line arrow
ParticipantA-->>ParticipantC: Dotted line arrow
ParticipantB->>ParticipantB: Rounded arrow
ParticipantC-->>ParticipantC: Dotted rounded arrow
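The arrow styles in the sequence diagram above can also be produced from Python. The sketch below is only an illustration: the helper `sequence_diagram`, the style names in `ARROWS`, and the participants are all hypothetical, and only the arrow tokens themselves come from the example above.

```python
# Arrow tokens taken from the sequence diagram above; the dictionary keys
# are hypothetical names chosen for this sketch.
ARROWS = {
    "line": "->",
    "dotted": "-->",
    "line_arrow": "->>",
    "dotted_arrow": "-->>",
}


def sequence_diagram(messages):
    """Build sequenceDiagram text from (sender, receiver, style, text) tuples."""
    lines = ["sequenceDiagram"]
    for sender, receiver, style, text in messages:
        lines.append(f"{sender}{ARROWS[style]}{receiver}: {text}")
    return "\n".join(lines)


# Hypothetical participants and messages, for illustration only.
messages = [
    ("Notebook", "Model", "line_arrow", "Send training data"),
    ("Model", "Notebook", "dotted_arrow", "Return predictions"),
]

sequence_text = sequence_diagram(messages)
print(sequence_text)
```

Keeping the arrow tokens in one dictionary means a typo in a style name raises a `KeyError` in Python instead of silently producing a diagram that fails to render.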
Mermaid is a popular Markdown-inspired, text-based syntax for generating diagrams and charts. Think of it as a way to describe your diagrams in a human-readable format, which Mermaid then renders into a visual representation.
Mermaid Python acts as a bridge, allowing you to generate these Mermaid diagrams within your Python environment and even embed them in your Jupyter notebooks or web applications.
Mermaid Python also allows you to customize the appearance of your diagrams, add styling, and even integrate with other Python libraries. You can dynamically generate diagrams based on your data, making it a powerful tool for data visualization and exploration.
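As a rough illustration of generating a diagram from data, the sketch below builds flowchart text from a plain list of pipeline steps using only string formatting. It does not use any particular Mermaid library; the helper `flowchart_from_edges` and its node-naming scheme are assumptions made for this example.

```python
def flowchart_from_edges(edges, direction="LR"):
    """Build Mermaid flowchart text from (source, target) label pairs.

    Hypothetical helper: labels must not contain double quotes,
    because each label is wrapped in quotes in the output.
    """
    ids = {}  # label -> short node id, assigned in first-seen order

    def node_id(label):
        if label not in ids:
            ids[label] = f"N{len(ids)}"
        return ids[label]

    lines = [f"graph {direction}"]
    for source, target in edges:
        s, t = node_id(source), node_id(target)
        lines.append(f'{s}["{source}"] --> {t}["{target}"]')
    return "\n".join(lines)


# The steps could just as well come from a config file or a DataFrame column.
steps = ["Define Problem", "Data Collection", "Exploratory Data Analysis"]
edges = list(zip(steps, steps[1:]))

flowchart_text = flowchart_from_edges(edges)
print(flowchart_text)
```

Because the diagram is derived from `steps`, adding or reordering a stage in the data updates the flowchart without any manual redrawing.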