Create Engaging Interactive Visualizations with Plotly Express
Introduction to Data Visualization
In today's data-driven world, the ability to explore, analyze, and convey insights from data is vital. While traditional spreadsheets and static charts have their roles, interactive visualizations empower audiences to engage with data in transformative ways.
Data visualization is an essential competency for data scientists, analysts, and anyone involved with data. Crafting clear, engaging, and interactive visual representations can reveal insights, present findings effectively, and influence decision-making. Among the various Python libraries available, Plotly Express distinguishes itself due to its ease of use, versatility, and capability to produce interactive plots ready for production with minimal coding effort.
Fortunately, for Python developers and data scientists, Plotly Express makes creating captivating interactive plots and dashboards simpler than ever. This high-level plotting library, built on the renowned Plotly graphing library, enables you to transform your data into professional-grade web visualizations with just a few lines of code.
In this guide, we will explore Plotly Express and learn to use its features to build a wide range of interactive visualizations. We will begin with fundamental plots such as scatter, line, and bar charts before advancing to more complex visualizations like 3D plots, multidimensional charts, statistical graphs, geographic maps, and more. Along the way, we will provide a concise Python code example for each visualization type. Let’s dive in!
Key Features of Plotly Express
- Unified Access: Import plotly.express as px to access all plotting functions, datasets, and color scales conveniently.
- Intelligent Defaults: The library aims to infer sensible defaults for your plots based on the provided data, while still allowing for customization.
- Adaptable Inputs: It accommodates various data formats such as DataFrames, arrays, and dictionaries in both long-form and wide-form, adapting to your dataset's structure.
- Automated Configuration: Automatically generates traces and layout settings based on attributes like color, symbol, and line style, significantly reducing manual configuration.
- Effortless Labeling: Automatically assigns labels to axes, legends, and hover texts based on DataFrame columns, with options for customization.
- Quick Faceting: Easily create faceted subplots using parameters like facet_row, facet_col, and facet_col_wrap.
- Marginal Plots: Incorporate marginal distribution plots alongside the main plot using the marginal argument.
- Seamless Pandas Integration: After setting the Pandas plotting backend to Plotly (pd.options.plotting.backend = "plotly"), call 2D Cartesian PX functions directly on DataFrames with df.plot().
- Enhanced Features: Supports adding trendlines, creating animations, and utilizing WebGL for performance optimization.
Overall, Plotly Express offers a high-level, customizable API that streamlines the process of creating complex visualizations. Its intelligent defaults, flexible inputs, and automatic configuration allow you to express your visualizations succinctly while maintaining control. The inclusion of faceting, marginal plots, and Pandas integration simplifies common workflows compared to lower-level APIs.
Setting Up Plotly Express
Before we can begin crafting visualizations, we need to install Plotly Express (abbreviated as PX). It ships as part of the plotly package, so installing plotly with pip is all that is required:
pip install plotly
We will also be using Pandas for data manipulation, so make sure to install it if you haven’t already:
pip install pandas
For our examples in this guide, we will primarily utilize Plotly's built-in datasets, which are excellent for learning and experimentation. In your own projects, you may import data from various sources, including files, databases, or real-time feeds.
Basic Visualization Techniques
Let’s explore some fundamental chart types that are the cornerstones of effective data visualization.
Scatter Plots
Scatter plots are useful for illustrating the relationship between two continuous variables, with each data point represented as a dot based on its X and Y coordinates. Here’s how to create a simple scatter plot using Plotly Express:
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
fig.show()
This code loads the classic Iris dataset and generates a scatter plot with sepal width on the X-axis and sepal length on the Y-axis, with points colored by species. The resulting plot is interactive, enabling you to hover for more information, pan, and zoom.
Line Plots
Line plots illustrate how a variable evolves over time or another continuous variable. They resemble scatter plots but connect the points with lines. Here’s an example:
import plotly.express as px
df = px.data.gapminder().query("country=='Canada'")
fig = px.line(df, x="year", y="lifeExp", title='Life expectancy in Canada')
fig.show()
This code visualizes life expectancy in Canada over the years using data from the Gapminder dataset.
Area Plots
Area plots resemble line plots but fill the space beneath each line. They are often used to depict cumulative totals or stacked categories over time. Here’s an example of a stacked area plot:
import plotly.express as px
df = px.data.gapminder().query("continent=='Oceania'")
fig = px.area(df, x="year", y="pop", color="country", line_group="country")
fig.show()
This generates a stacked area plot displaying the population of countries in Oceania over time.
Bar Charts
Bar charts are a traditional method for comparing values across different categories. Plotly Express simplifies their creation:
import plotly.express as px
df = px.data.tips()
fig = px.bar(df, x="day", y="total_bill", color="smoker", barmode="group")
fig.show()
This code produces a grouped bar chart that compares total bills for smokers and non-smokers across different days.
Funnel Charts
Funnel charts are ideal for visualizing stages in a sales process or user flow. Here’s a simple example:
import plotly.express as px
data = dict(
    number=[3000, 1500, 500, 100],
    stage=["Website Visits", "Downloads", "Potential Customers", "Purchases"])
fig = px.funnel(data, x='number', y='stage')
fig.show()
This code generates a funnel chart indicating the number of users at each stage of a hypothetical sales process.
Timeline Charts
Timeline charts (also known as Gantt charts) show when a series of events or tasks start and end. Plotly Express provides a dedicated timeline function for them, which takes a start column and an end column:
import plotly.express as px
import pandas as pd
df = pd.DataFrame([
    dict(Task="Task A", Start='2023-01-01', Finish='2023-02-28'),
    dict(Task="Task B", Start='2023-03-05', Finish='2023-04-15'),
    dict(Task="Task C", Start='2023-02-20', Finish='2023-05-30')
])
fig = px.timeline(df, x_start="Start", x_end="Finish", y="Task", color="Task")
fig.show()
This code constructs a timeline representing the start and end dates of three distinct tasks.
Part-of-Whole Charts
Now, let’s examine charts that visualize how parts contribute to a whole.
#### Pie Charts
Pie charts effectively depict the percentage distribution of a whole into categories. Here’s how to create one:
import plotly.express as px
df = px.data.tips()
fig = px.pie(df, values='tip', names='day', title='Tips by Day')
fig.show()
This code generates a pie chart illustrating the breakdown of tips by day of the week.
#### Sunburst Charts
Sunburst charts represent hierarchical data as nested rings, with each ring indicating a level in the hierarchy. Here’s an example:
import plotly.express as px
df = px.data.tips()
fig = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig.show()
This code creates a sunburst chart with rings representing day, time, and sex, sized by total bill amount.
#### Treemap Charts
Treemaps visualize hierarchical data using nested rectangles. Here’s an example:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig = px.treemap(df, path=['continent', 'country'], values='pop')
fig.show()
This code produces a treemap displaying the population of each country in 2007, organized by continent.
#### Icicle Charts
Icicle charts are similar to treemaps but use a linear layout instead of nesting. Here’s how to create one:
import plotly.express as px
df = px.data.tips()
fig = px.icicle(df, path=['day', 'time', 'sex'], values='total_bill')
fig.show()
This example shows the same hierarchy as the sunburst example above, drawn as an icicle chart.
1D Distribution Charts
These charts visualize the distribution of a single variable.
#### Histograms
Histograms depict the distribution of a numerical variable by dividing the value range into bins and counting the number of data points in each bin. Here’s how to create one:
import plotly.express as px
df = px.data.tips()
fig = px.histogram(df, x="total_bill")
fig.show()
This code generates a histogram of the total bill amounts from the tips dataset.
#### Box Plots
Box plots summarize a variable's distribution by displaying the median, quartiles, and outliers. Here’s an example:
import plotly.express as px
df = px.data.tips()
fig = px.box(df, x="day", y="total_bill", color="smoker")
fig.show()
This code creates a box plot of total bill amounts by day, with separate boxes for smokers and non-smokers.
#### Violin Plots
Violin plots are similar to box plots but additionally illustrate the probability density of the data at various values. Here’s how to create one:
import plotly.express as px
df = px.data.tips()
fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all")
fig.show()
This code produces a violin plot of tip amounts, differentiated by smoker status and sex, with an overlay of box plots and individual data points.
#### Strip Charts
Strip charts (or jitter plots) display individual data points in a distribution, adding random noise to their positions to prevent overplotting. Here’s an example:
import plotly.express as px
df = px.data.tips()
fig = px.strip(df, x="total_bill", y="time", orientation="h", color="smoker")
fig.show()
This code generates a horizontal strip chart of total bill amounts by time of day, colored by smoker status.
#### ECDF Plots
ECDF (Empirical Cumulative Distribution Function) plots illustrate the proportion of data points less than or equal to each value. Here’s how to create one:
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="time")
fig.show()
This code creates an ECDF plot of total bill amounts, with separate curves for each time of day.
2D Distribution Charts
These charts visualize the joint distribution of two variables.
#### Density Heatmaps
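Density heatmaps are two-dimensional histograms: the plane is divided into rectangular bins, and each bin is colored by the number of data points that fall into it (or by another aggregate). Here’s an example: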
import plotly.express as px
df = px.data.iris()
fig = px.density_heatmap(df, x="sepal_width", y="sepal_length", marginal_x="histogram", marginal_y="histogram")
fig.show()
This code generates a density heatmap of sepal width versus sepal length from the Iris dataset, including marginal histograms.
#### Density Contour Plots
Density contour plots are akin to heatmaps but utilize contour lines to represent density rather than color. Here’s how to create one:
import plotly.express as px
df = px.data.iris()
fig = px.density_contour(df, x="sepal_width", y="sepal_length")
fig.show()
This code creates a density contour plot for sepal width versus sepal length.
Displaying Images
Plotly Express can also be used to display images with the imshow function:
import plotly.express as px
import numpy as np
img_rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]],
                    [[0, 255, 0], [0, 0, 255], [255, 0, 0]]], dtype=np.uint8)
fig = px.imshow(img_rgb)
fig.show()
This code creates a simple RGB image and displays it using imshow.
3D Visualization Techniques
Plotly Express simplifies the creation of interactive 3D visualizations.
#### 3D Scatter Plots
3D scatter plots extend regular scatter plots into three dimensions. Here’s an example:
import plotly.express as px
df = px.data.iris()
fig = px.scatter_3d(df, x='sepal_length', y='sepal_width', z='petal_width', color='species')
fig.show()
This code generates a 3D scatter plot of the Iris dataset, with axes representing sepal length, sepal width, and petal width, and points colored by species.
#### 3D Line Plots
3D line plots connect data points in three dimensions. Here’s an example:
import plotly.express as px
df = px.data.gapminder()
fig = px.line_3d(df, x="gdpPercap", y="pop", z="year", color="continent", line_group="country")
fig.show()
This code creates a 3D line plot of the Gapminder data, illustrating GDP per capita, population, and year. The line_group argument draws one line per country, and the lines are colored by continent.
Multidimensional Visualization
These charts help visualize high-dimensional data.
#### Scatter Matrix Plots
Scatter matrix plots (also known as splom or pairs plots) display all pairwise scatter plots of variables in a dataset. Here’s an example:
import plotly.express as px
df = px.data.iris()
fig = px.scatter_matrix(df, dimensions=["sepal_width", "sepal_length", "petal_width", "petal_length"], color="species")
fig.show()
This code produces a scatter matrix for the Iris dataset, showcasing plots for each dimension pair, colored by species.
#### Parallel Coordinates Plots
Parallel coordinates plots represent multivariate data with a separate axis for each variable, connecting values for each observation with lines. Here’s an example:
import plotly.express as px
df = px.data.iris()
fig = px.parallel_coordinates(df, color="species_id", labels={"species_id": "Species",
                              "sepal_width": "Sepal Width", "sepal_length": "Sepal Length",
                              "petal_width": "Petal Width", "petal_length": "Petal Length"},
                              color_continuous_scale=px.colors.diverging.Tealrose)
fig.show()
This code generates a parallel coordinates plot of the Iris dataset, with lines colored by species.
#### Parallel Categories Plots
Parallel categories plots are similar to parallel coordinates plots but are designed for categorical variables. Here’s an example:
import plotly.express as px
df = px.data.tips()
fig = px.parallel_categories(df, color="size", color_continuous_scale=px.colors.sequential.Inferno)
fig.show()
This code creates a parallel categories plot of the tips dataset, with lines colored based on party size.
Geographic Visualization
Plotly Express can also draw data on maps, locating it either by latitude and longitude or by region identifiers such as ISO country codes.
#### Scatter Geo
Scatter geo plots display points on a map. Here’s an example:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig = px.scatter_geo(df, locations="iso_alpha", size="pop", color="continent", hover_name="country")
fig.show()
This code produces a scatter geo plot of the 2007 Gapminder data, positioning points by country and sizing them based on population.
#### Line Geo
Line geo plots illustrate lines on a geographic map. Here’s an example:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig = px.line_geo(df, locations="iso_alpha", color="continent", projection="orthographic")
fig.show()
This code creates a line geo plot of the 2007 Gapminder data, connecting countries by continent.
#### Choropleth
Choropleth plots color geographic regions based on a variable. Here’s an example:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig = px.choropleth(df, locations="iso_alpha", color="lifeExp", hover_name="country",
                    color_continuous_scale=px.colors.sequential.Plasma)
fig.show()
This code generates a choropleth plot of the 2007 Gapminder data, coloring countries by life expectancy.
Polar Charts
Polar charts use polar coordinates to visualize data.
#### Polar Scatter Plots
Polar scatter plots display points using angles and radii. Here’s an example:
import plotly.express as px
df = px.data.wind()
fig = px.scatter_polar(df, r="frequency", theta="direction", color="strength", symbol="strength",
                       color_discrete_sequence=px.colors.sequential.Plasma_r)
fig.show()
This code creates a polar scatter plot of wind data, with angles representing direction, radius indicating frequency, and color/symbol denoting strength.
#### Polar Line Plots
Polar line plots connect points using angles and radii. Here’s an example:
import plotly.express as px
df = px.data.wind()
fig = px.line_polar(df, r="frequency", theta="direction", color="strength", line_close=True,
                    color_discrete_sequence=px.colors.sequential.Plasma_r)
fig.show()
This code generates a polar line plot of the same wind data, drawing one closed line per strength level.
#### Polar Bar Charts
Polar bar charts display bars using angles and radii. Here’s an example:
import plotly.express as px
df = px.data.wind()
fig = px.bar_polar(df, r="frequency", theta="direction", color="strength", template="plotly_dark",
                   color_discrete_sequence=px.colors.sequential.Plasma_r)
fig.show()
This code creates a polar bar chart of the wind data, where bars represent frequency and color indicates strength.
Ternary Charts
Ternary charts visualize data on a triangular grid.
#### Ternary Scatter Plots
Ternary scatter plots display points based on three variables that sum to a constant. Here’s an example:
import plotly.express as px
df = px.data.election()
fig = px.scatter_ternary(df, a="Joly", b="Coderre", c="Bergeron", hover_name="district",
                         color="winner", size="total", size_max=15,
                         color_discrete_map={"Joly": "blue", "Bergeron": "green", "Coderre": "red"})
fig.show()
This code generates a ternary scatter plot of election data, positioning points according to vote percentages for three candidates.
#### Ternary Line Plots
Ternary line plots connect points on a ternary grid. Here’s an example:
import plotly.express as px
df = px.data.election()
fig = px.line_ternary(df, a="Joly", b="Coderre", c="Bergeron", line_dash='winner')
fig.show()
This code creates a ternary line plot of the election data, with lines styled based on the winning candidate in each district.
Key Takeaways
- Plotly Express supports a wide range of foundational chart types, including scatter, line, area, and bar plots.
- It facilitates the creation of statistical charts such as histograms, box plots, and violin plots.
- It allows for visualizing parts of a whole through pie, sunburst, treemap, and icicle charts.
- It provides tools for 2D distribution visualization using density heatmaps and density contour plots.
- It enables multidimensional representations with scatter matrices, parallel coordinates, and parallel categories plots.
- It can create interactive geographic maps, including scatter geo, line geo, and choropleth charts.
- It supports non-Cartesian coordinate systems with polar and ternary charts.
The most effective way to learn Plotly Express is through hands-on experience with your own datasets. Experiment with various chart types, customizations, and configurations to discover the best method to narrate your data's story. With Plotly Express in your arsenal, you will be well-prepared to craft captivating interactive visualizations that breathe life into your data.
Conclusion
Plotly Express is a powerful and user-friendly Python library that streamlines the creation of interactive, publication-quality visualizations. Its high-level, consistent API enables data scientists and developers to quickly explore and present their data, while its integration with the broader Plotly ecosystem allows for seamless development of comprehensive data applications. With its vast array of plot types, intelligent default settings, and ability to manage diverse data formats, Plotly Express is an invaluable resource for anyone working with data in Python. Whether you are a seasoned data professional or new to data visualization, incorporating Plotly Express into your toolkit is highly recommended.