Lesson 3: Data Visualization

Instructor: Yuki Oyama, Prprnya
The Christian F. Weichman Department of Chemistry, Lastoria Royal College of Science

This material is licensed under CC BY-NC-SA 4.0

In this lesson, we will learn how to visualize data using Matplotlib, which is a powerful Python library for creating static, animated, and interactive visualizations in Python.

Before we start, let’s import numpy again.

import numpy as np

Your First Plot Using Matplotlib!¶

Now, we import the matplotlib library. We will use the pyplot module from matplotlib, i.e., matplotlib.pyplot. Like numpy, we can choose a brief alias for it:

import matplotlib.pyplot as plt

Are you ready? Aye, aye, captain! Let’s make our first plot—the Morse potential between two atoms. If you are not familiar with the Morse potential, you can click the link before, or I will give you a brief introduction.

We all know (you should know!) that we usually use the harmonic oscillator to describe a chemical bond. In the harmonic oscillator model, the potential energy is given by:

V_\text{harmonic}(r) = \frac{1}{2} k \left( r - r_0 \right)^2

(1)

where $k$ is the spring constant, $r_0$ is the equilibrium distance, and $r$ is the distance between the two atoms. This model can account for bond vibrations but lacks accuracy in describing bond dissociation. You see that as $r\to\infty$ , the potential energy becomes infinite, but that’s unreal. If two atoms are separated by a large distance, the chemical bond will break down, and the potential energy will become zero. To address this feature, Philip M. Morse proposed a new potential, which later named by himself as the Morse potential:

V_\text{Morse}(r) = D_e \left( 1 - e^{-a \left( r - r_0 \right)} \right)^2

(2)

where $D_e$ is the dissociation energy, $a$ is the spring constant, and $r_0$ is the equilibrium distance.

We are quite familiar with the appearance of the harmonic potential, but what about the Morse potential? Let’s see what it looks like. First, we define the Morse potential as a function:

def morse(r, De, a, r0):
    return De * (1 - np.exp(-a * (r - r0)))**2

It’s time to use pyplot! Run the code below to generate a plot of the Morse potential between two hydrogen atoms in a $\ce{H2}$ molecule:

De = 4.948  # eV
a = 1.905  # 1/Å
r0 = 0.74144  # Å

r = np.linspace(0, r0 + 2.0, 500)  # Å
V_morse = morse(r, De, a, r0)

plt.plot(r, V_morse)
plt.show()

(Data was taken and processed from “Spectroscopic Constants of Diatomic Molecules” in CRC Handbook of Chemistry and Physics, 106th Edition (Internet Version 2025), John R. Rumble, ed., CRC Press/Taylor & Francis, Boca Raton, FL.)

See how nicely the Morse potential shows bond dissociation as $r\to\infty$ ?

Generally, to make a 2D plot, you just need two arrays: one for the $x$ -axis and one for the $y$ -axis. Then you call the plot function from plt, giving it the $x$ data as the first argument and the $y$ data as the second. Finally, use the show function to display the plot.

Linestyles, Markers, and Colors¶

Linestyles and Linewidths¶

In the plot above, matplotlib uses the default linestyle, which is a solid line. We can change the linestyle by using the linestyle keyword argument. For example, we can use dashed lines by:

plt.plot(r, V_morse, linestyle='--')
plt.show()

In the code above, we used the linestyle keyword argument to specify the line style. Make sure to use quotation marks around the line style name. Some of the commonly used linestyles are:

Name	Shortcut
`solid`	`-`
`dotted`	`:`
`dashed`	`--`
`dashdot`	`-.`

There are many more linestyles available in matplotlib. You can find them in the Matplotlib Documentation - Linestyles.

Sometimes we want to change the linewidth. For example, we can use a thicker dotted line by:

plt.plot(r, V_morse, linestyle=':', linewidth=3)
plt.show()

Markers¶

We wouldn’t see markers in the plot above because we didn’t specify any markers. In that case, matplotlib interprets to not use any markers, but we can specify markers by using the marker keyword argument. For example, we can use filled circles by:

plt.plot(r, V_morse, marker='o')
plt.show()

Ohhhh wait... That looks terrible! Those circles are too dense. That’s because we used the pretty dense step size of 500 points. For now, let’s use a larger step size to get 50 points:

plt.plot(r[::10], V_morse[::10], marker='o')
plt.show()

Actually we have another way to solve this problem, which is to adjust the size of the markers by using the markersize (or abbreviated ms) keyword argument. For example, we can use a smaller size by:

plt.plot(r, V_morse, marker='o', markersize=2.5)
plt.show()

Well... That’s still not very pretty, but remember, in a real case, the step size is usually controlled by the resolution of your collecting instrument, rather than a linspace function, and you don’t need to plot markers for every single point if your data is large and points are close. Just show lines!

You can change to other markers if you want. For example, we can use triangles:

plt.plot(r[::10], V_morse[::10], marker='^')
plt.show()

Sometimes we don’t want to use lines but only markers. In that case, we can use the linestyle keyword argument with an empty string:

plt.plot(r[::10], V_morse[::10], linestyle='', marker='^')
plt.show()

Vice versa if we want to use lines but no markers, i.e., marker=''.

Even more, you can use math symbols in the marker keyword argument. Try this:

plt.plot(r[::10], V_morse[::10], linestyle='', marker=r'$\beta$')
plt.show()

Be sure to include r before the dollars wrapped in double quotations! Otherwise, it will not render as you want.

For more markers, you can check the Matplotlib Documentation - Marker reference.

Colors¶

In the plot above, we used the default color, which is blue. We can change the color by using the color (abbreviated as c) keyword argument. For example, we can use red by:

plt.plot(r[::10], V_morse[::10], linestyle='-', marker='x', color='red')
plt.show()

In the code above, we used the color keyword argument to specify the color. Make sure to use quotation marks around the color name. As usual, there are many more colors available in matplotlib. It’s not difficult to guess that green and yellow give green and yellow plots. You can even use an RGB color code or a hexadecimal color code. For example, we can use green by giving the RGB color code (0, 0.5, 0):

plt.plot(r[::10], V_morse[::10], linestyle='-', marker='x', color=(0, 0.5, 0))
plt.show()

Notice that the RGB code is wrapped within parentheses in a sequence (R, G, B), ranging from 0 to 1.

Let’s try one more for the thematic color of Yu~ The color #98caec:

plt.plot(r[::10], V_morse[::10], linestyle='-', marker='x', color='#98caec')
plt.show()

This is a hexadecimal color code. You need to start with # and then use six hexadecimal digits. Also, remember to wrap the color code within quotation marks.

What if we want to set different colors for the lines and the markers? And what if we’d like the marker’s edge and fill to have different colors? For this, we can use the markerfacecolor (abbreviated mfc) and markeredgecolor (abbreviated mec) keyword arguments. For example, we could draw a blue line with gold markers outlined in black using:

plt.plot(r[::10], V_morse[::10], linestyle='-', marker='o', color='blue', mfc='gold', mec='black')
plt.show()

For more about colors and their usage, you can find them in the Matplotlib Documentation - Color reference and Matplotlib Documentation - Specifying colors.

Not sure how to pick colors? Try Adobe Color —a free online tool that makes it easy to explore palettes and discover great combinations.

Exercise : Suppose we want to plot another common potential, the Lennard-Jones potential, also termed L-J potential or 12-6 potential, which is a simple potential that describes intermolecular interactions. The potential is given by:

V_\text{LJ} = 4\varepsilon \left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]

(3)

where $\varepsilon$ is the well’s depth, $\sigma$ is the distance at which the potential is zero, and $r$ is the distance between the two molecules.

Define a function LJ(r, sigma, epsilon) that takes in the distance $r$ , $\sigma$ , and $\varepsilon$ as parameters and returns the value of the L-J potential. Plot this potential for molecular $\ce{H2}$ in the range $r \in [2.5, 10]\,\mathrm{Å}$ . Use $\sigma = 2.81\,\mathrm{Å}$ and $\varepsilon = 3.40 \times 10^{-3}\,\mathrm{eV}$ (data was taken from J. Mol. Struct. 2014, 068, 164–169). You should display the plot with a yellow dashed line and 30 circle markers with red edge, filled with your favorite color.

The Axes Module and Subplots¶

So far, we have learned how to make simple plots with lines and markers, and how to adjust their appearance, but what if we want to step things up and create more informative plots—with titles, axis labels, legends, custom ticks, or even multiple axes? This can be achieved through the axes module as subplots. However, we will not use axes module directly, but we will call functions from plt to meet our goals.

Image source: Matplotlib documentation

Titles and Legends¶

Before diving in, though, let’s take a break from the Morse potential. Don’t worry—we’ll come back to it later. For now, let’s switch things up and plot a classic function: $\sin x$ , since this function can better illustrate our concepts.

x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)

plt.plot(x, y)
plt.show()

We can use the title and legend functions to add titles and legends to our plots. For example, we can add a title to the plot by calling plt.title() and passing in the title as a string:

plt.plot(x, y)
plt.title(r'Plot of $\sin x$')
plt.show()

Note that the $\sin$ in the title is displayed upright, non-italic, but in sans-serif font. To use a serif font, you can enable LaTeX rendering in matplotlib by setting:

plt.rcParams['text.usetex'] = True

(Rendering in LaTeX needs a full LaTeX installation; if you don’t know how to install it or have never heard it, just skip this block; otherwise you will see an error due to lack of LaTeX environment.) Try the following code:

plt.rcParams['text.usetex'] = True
plt.plot(x, y)
plt.title(r'Plot of $\sin x$')
plt.show()

You should be aware that the full LaTeX rendering is slow. There are some ways to turn around, but for now let’s move on. (we’ll get back later!)

Also, don’t forget to set the plt.rcParams back to its default value (False) after you’re done with the above plot.

plt.rcParams['text.usetex'] = False

We can also add a legend to the plot by calling plt.legend(). For using this, we need to label the lines by calling plt.plot() with the label keyword argument. That is, label='$\sin x$' as a parameter in the plot function. Then you can call plt.legend() to show the labels:

plt.plot(x, y, label=r'$\sin x$')
plt.title(r'Plot of $\sin x$')
plt.legend()
plt.show()

You can see that the legend appears in a small box in the upper-right corner of the plot, showing the label you specified in the plot function. The position of this legend can be customized with the loc keyword argument. For example, to place the legend in the upper-left corner, you can write:

plt.plot(x, y, label=r'$\sin x$')
plt.title(r'Plot of $\sin x$')
plt.legend(loc='upper left')
plt.show()

The loc argument accepts either descriptive string like upper left, 'lower right', 'center', etc. This gives you flexibility to place the legend exactly where it best fits your figure. To see all the available options, you can check the Matplotlib Documentation - Legends.

Axes, Ticks, and Scales¶

Customizing axes is straightforward. You can use plt.xlabel() and plt.ylabel() to label the $x$ - and $y$ -axes, respectively. For example, making a plot of $\sin x$ with properly labeled axes needs:

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.show()

Sometimes the default axis tick marks are not the most convenient for interpreting your data. For trigonometric functions, for example, it’s often clearer to label the $x$ -axis in terms of radians ( $-\pi. -\frac{\pi}{2}, 0, \frac{\pi}{2}, \pi$ ) instead of decimal numbers. We can easily do this with plt.xticks() after plotting:

plt.plot(x, y, label=r'$\sin x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

# Set x-axis from -pi to pi with multiples of pi/2 and 0
plt.xticks(
    [-np.pi, -np.pi/2, 0, np.pi/2, np.pi],
    [r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$', r'$\frac{\pi}{2}$', r'$\pi$']
)

plt.show()

Here, plt.xticks() takes two arguments: a list of tick positions (numerical values) and a list of corresponding labels (strings, can be in LaTeX syntax). This lets us replace the default decimal ticks with more meaningful values—like multiples of $\pi$ , which are the most natural scale for $\sin x$ . In the example below, we set the $x$ -axis to range from $-\pi$ to $\pi$ with ticks every $\pi/2$ :

plt.plot(x, y, label=r'$\sin x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

# Set x-axis ticks from -pi to pi with steps of pi/2 using np.arange()
# Note that we add 0.1 to the end of the range since we want to include the last tick
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))

plt.show()

However, this approach will only display numerical values, not the nice fractional multiples of $\pi$ that we usually prefer in math. In the same way, you can use plt.yticks() to customize the labels on the $y$ -axis. I’ll leave that to you as a practice.

Other than the major ticks, which are the ticks that carry labels, there are also minor ticks, which are ticks that mark intermediate positions but usually do not have labels. By default, Matplotlib hides minor ticks, but we can turn them on when we want extra guidance on the axis scale. The easiest way to show minor ticks is to enable them with the minor=True option in plt.xticks() or plt.yticks(). For example, to add minor ticks along the $x$ -axis, you can write:

plt.plot(x, y, label=r'$\sin x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

# Add major ticks at multiples of pi/2
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))

# Add minor ticks at multiples of pi/8
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.show()

If you have noticed, matplotlib automatically decides the axis ranges based on our data. That usually works fine, but sometimes we want to zoom in or out to highlight certain features. In those cases, we can manually set the visible range of the $x$ - and $y$ -axes. For example, we may want to plot a function of $x$ in the range $[0, 100]$ and $y$ in the range $[-1, 1]$ . We can do this with plt.axis(). For example, we can plot the function $\sin x$ in the range $[0, 100]$ and $y$ in the range $[-1, 1]$ by:

x2 = np.linspace(0, 100, 1000)
y2 = np.sin(x2)

plt.plot(x2, y2, label=r'$\sin x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

# Set x-axis from 0 to 100 and y-axis from -1 to 1; plt.axis([xmin, xmax, ymin, ymax])
plt.axis([0, 100, -1, 1])

plt.show()

If we want to only adjust the $x$ -axis or $y$ -axis, we can use plt.xlim() and plt.ylim(). The range of another axis will be determined automatically. For example, we can plot the function $\sin x$ in the range $[0, \pi]$ by:

plt.plot(x2, y2, label=r'$\sin x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

# Set x-axis from 0 to pi
plt.xlim([0, np.pi])

plt.show()

So far, we’ve been plotting with the default linear scale on both axes. But what if our data spans several orders of magnitude, or follows an exponential or power law trend? In those cases, a linear axis can make important features nearly invisible. Matplotlib allows us to change axis scales very easily using plt.xscale() and plt.yscale(). For example, let’s plot an exponential function and see how it looks under different scales:

x3 = np.linspace(0.1, 10, 400)
y3 = np.exp(x3)

plt.plot(x3, y3, label=r'$e^x$')
plt.xlabel(r'$x$')
plt.ylabel(r'$e^x$')

plt.title(r'Plot of $e^x$ for $y$-Axis in Logarithmic Scale')
plt.legend()

# Switch the y-axis to logarithmic scale
plt.yscale('log')

plt.show()

We see that the exponential function appears as a straight line when plotted on a logarithmic scale. This kind of treatment is very useful when we want to recover linearity from exponential data. Consider another example: the first-order hydrolysis of tert-butyl bromide:

\ce{(CH3)3CBr + NaOH -> (CH3)3COH + NaBr}

(4)

The rate law for this reaction is

\text{rate} = -\frac{d[\ce{(CH3)3CBr}]}{dt} = k[\ce{(CH3)3CBr}]

(5)

Solving this differential equation gives the integrated rate law:

[\ce{(CH3)3CBr}] = [\ce{(CH3)3CBr}]_0 \cdot e^{-kt}

(6)

This expression is exponential in $t$ . To analyze such data more easily, we usually take the natural logarithm of both sides to obtain a linear form ( $y = mx + b$ ):

\underbrace{\ln [\ce{(CH3)3CBr}]}_{y} = \underbrace{-k}_m \underbrace{t}_x + \underbrace{\ln[\ce{(CH3)3CBr}]_0}_b

(7)

Alternatively, instead of transforming the equation, we can plot the original exponential form while setting the $y$ -axis to a logarithmic scale. This way, the exponential decay appears as a straight line without explicitly taking logarithms.

k = 0.1  # rate constant, set to 0.1 for simplicity
c0 = 1.0 # initial concentration, set to 1.0 for simplicity

t = np.linspace(0, 100, 200)
c = c0 * np.exp(-k * t)

plt.plot(t, c, label=r'$[\mathrm{(CH_3)_3CBr}]$')
plt.xlabel(r'$t$ (s)')
plt.ylabel(r'$[\mathrm{(CH_3)_3CBr}]$ (M)')
plt.title('First-Order Hydrolysis of tert-Butyl Bromide')
plt.yscale('log')  # log scale for y-axis
plt.legend()
plt.show()

We will see more applications of this kind of analysis in the fitting and optimization lesson.

Grids and Axes Spines¶

Next, let’s make our plot easier to read by adding a grid. A grid overlays horizontal and vertical reference lines on the figure, which helps the eye track values more precisely. In Matplotlib, this is done with the plt.grid() function. For example, we can add a grid to the plot of $\sin x$ with:

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid()

plt.show()

What do you think the default grid color and style are? For me, these solid line grids are kind of distracting. What if we want to change them, for instance, to dashed lines? You can actually do it pretty easily by using plt.grid(linestyle='--').

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')

plt.show()

All functionalities that we have learned so far are available in the plt.grid module, like changing the color, style, and linewidth.

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle=':', color='#98caec', linewidth=1)

plt.show()

The spines are the lines that mark the boundaries of the plotting area (top, bottom, left, and right). By default, all four spines are visible, but we can adjust their visibility, position, and style to improve the clarity of a plot. Adjusting spines is a little different: you need to access them through plt.gca().spines rather than plt.spines. For example, we can hide the top and right spines of the plot with:

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.show()

In the example above, we hide the top and right spines of the plot by using the set_visible() method. Other spine properties include set_color(), set_linewidth(), and set_linestyle(). For example, we can change the color of the top spine to red by:

plt.plot(x, y, label=r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')

# Adjust spine properties
plt.gca().spines['left'].set_position(('data', 0))   # move y-axis to x=0
plt.gca().spines['bottom'].set_position(('data', 0)) # move x-axis to y=0
plt.gca().spines['left'].set_color('blue')           # change color
plt.gca().spines['bottom'].set_linewidth(2)          # make thicker
plt.gca().spines['top'].set_visible(False)           # hide top spine
plt.gca().spines['right'].set_visible(False)         # hide right spine

plt.show()

A summary of the most important methods for controlling spines is given below:

Property	Method / Argument	Description
Visibility	`spine.set_visible()`	Show (`True`) or hide (`False`) a spine.
Position	`spine.set_position()`	Control where the spine is placed (`'data'`, `'axes'`, `'outward'`).
Color	`spine.set_color()`	Set the spine color (e.g., named color or hex code).
Linewidth	`spine.set_linewidth()`	Adjust the thickness of the spine in points.
Linestyle	`spine.set_linestyle()`	Change the style of the spine (`'-'`, `'--'`, `':'`, `'-. '` etc.).

Other Figure Settings¶

Here are a few additional figure options you may find useful. To change the overall figure size, use plt.figure(figsize=(width, height)). For example, the code below creates a figure that is 10 inches wide and 5 inches tall:

# set figure size to 10x5 inches
plt.figure(figsize=(10, 5))

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.show()

We can control the resolution of the figure by using another argument in plt.figure(): dpi. For example, the code below creates a figure with a resolution of 300 dots per inch:

# set figure resolution to 300 dots per inch
plt.figure(figsize=(10, 5), dpi=300)

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.show()

The default figure resolution is 100 dots per inch (dpi=100).

We can also adjust the figure background color and border with facecolor and edgecolor in plt.figure(). For example, the code below creates a Yu-blue background with a Miku-teal border:

# set figure background color to Yu-blue and border color to Miku-teal
plt.figure(figsize=(10, 5), facecolor='#98caec', edgecolor='#39c5bb', linewidth=12)

plt.plot(x, y, label=r'$\sin x$')

plt.xlabel(r'$x$')
plt.ylabel(r'$\sin x$')

plt.title(r'Plot of $\sin x$')
plt.legend()

plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-np.pi, np.pi+0.1, np.pi/8), minor=True)

plt.grid(linestyle='--')
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)

plt.show()

Exercise: Make a plot based on the following directions.

Create a well-formatted plot of $\cos{x}$ from 0 to $4\pi$ .
Show appropriate $x$ - and $y$ -axes, with major and minor ticks illustrating zeros, minima, and maxima.
Show a title, legend, and grid.
Show spines at the bottom and left.
Set the figure size to be 8 inches wide and 4 inches tall.

Use the code below as your starting point.

x4 = ...  # Choose appropriate range and step size
y4 = np.cos(x4)

Overlaying Multiple Plots¶

It’s quite easy to overlay multiple plots in a single figure. Just use the plt.plot() function multiple times, and you’ll get a plot with multiple lines. For example, the code below verifies the trigonometric identity

\sec^2 x - \tan^2 x = 1

(8)

x_mult = np.linspace(-2.5*np.pi+0.1, 2.5*np.pi-0.1, 500)
y_mult_1 = np.tan(x_mult) ** 2
y_mult_2 = np.cos(x_mult) ** -2  # Since NumPy does not natively support the secant, we use cosine inverse instead
y_mult_3 = y_mult_2 - y_mult_1

plt.figure(figsize=(12, 5))
plt.plot(x_mult, y_mult_1, label=r'$\tan^2 x$')
plt.plot(x_mult, y_mult_2, label=r'$\sec^2 x$')
plt.plot(x_mult, y_mult_3, label=r'$\sec^2 x - \tan^2 x$', linestyle='--')

plt.title(r'Trigonometric Identity $\sec^2 x - \tan^2 x = 1$')
plt.legend()
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.ylim(-0.5, 4)

plt.xticks(np.arange(-2.5*np.pi, 2.5*np.pi+0.1, np.pi/2))
plt.xticks(np.arange(-2.5*np.pi, 2.5*np.pi+0.1, np.pi/8), minor=True)
plt.grid(linestyle=':')

plt.show()

Exercise : Now it’s the time to take our Morse potential back! We want to compare the Morse potential with the harmonic potential. Plot these two potentials on the same figure. You should first define a function called harmonic, which takes values of $r$ , $r_0$ , and $k$ as input, and returns the value of the harmonic potential at that point. Recall that the harmonic potential is given by

V_\text{harmonic}(r) = \frac{1}{2} k \left( r - r_0 \right)^2

(9)

Define harmonic(r, k, r0) in the code cell below.

Here $k$ is the spring constant, which is given by

k = 2D_e a^2

(10)

Define k in the code cell below, which has the unit $\mathrm{eV/Å^2}$ .

Use the code below as your starting point. You can also refer to the code of Morse plot in the previous section. Feel free to add additional features to your plot to make it pretty(´▽｀)

r = np.linspace(0, r0 + 2.0, 500)  # Å
V_morse = ...  # Morse potential
V_harmonic = ...  # Harmonic potential

Hint: How can this plot appropriately reflect the comparison between two potentials? The upper limit of your $y$ -axis should be somehow slightly larger than $D_e$ , and the lower limit should be...

Exercise : We want to test Beer–Lambert’s law. Don’t tell me you don’t know it!

...Well, probably you really don’t...

Oh, god...okay, fine, I’ll tell you...

(*Cough) The Beer–Lambert law describes the relationship between the concentration of a substance in a solution and the amount of light it absorbs. In simple terms, the more concentrated the solution is, the more light it absorbs as it passes through. This principle is widely used in chemistry and biology to determine the concentration of compounds using spectrophotometry, making it a fundamental tool for quantitative analysis.

Mathematically, this law states that absorbance is directly proportional to both the concentration of the absorbing species and the path length of the light through the solution, expressed as:

A = \varepsilon c l

(11)

Here,

$A$ is the absorbance, which is dimensionless—since it is a ratio;
$\varepsilon$ is the molar absorptivity or molar extinction coefficient, in the unit $\mathrm{L \cdot mol^{-1} \cdot cm^{-1}}$ , which indicates how strongly the substance absorbs light at a given wavelength;
$c$ is the concentration of the absorbing species, usually takes the unit $\mathrm{mol \cdot L^{-1}}$ ;
$l$ is the path length of the light through the solution, usually the width of the cuvette, in $\mathrm{cm}$ .

For methylene blue, the molar absorptivity at $665\,\mathrm{nm}$ is $6.71 \times 10^{4}\,\mathrm{L \cdot mol^{-1} \cdot cm^{-1}}$ in water (React. Chem. Eng. 2020, 5, 377–386). Define $\varepsilon$ in the code cell below as eps.

eps = 6.71e4

Plot a standard curve of absorbance versus concentration from 0 to $5 \times 10^{-5}\,\mathrm{M}$ using dashed line. Assume that $l = 1\,\mathrm{cm}$ in this whole exercise. You should have major $x$ -ticks at $1 \times 10^{-5}, 2 \times 10^{-5}, 3 \times 10^{-5}, 4 \times 10^{-5}, 5 \times 10^{-5}\,\mathrm{M}$ , also including minor ticks appropriately. For $y$ -ticks, try to include minor ticks that appropriately represent the values from 0 to 3.5.

Yue 20 Van Nya “ 5min ” ~~is on the way!~~ has acquired a set of data from a spectrometer. The data is stored in an array:

data_conc = np.array([2.34, 4.69, 9.38, 18.76, 37.52]) * 1e-6
data_absor = np.array([114, 514, 810, 1919, 3810]) * 1e-3

Plot the data as a scatter plot along with the standard curve.

Do you think Van Nya made a good measurement? ~~If not, don’t hesitate to slap him with your hands! o(｀ω´ )o~~

You can check the accuracy of your measurement by plotting the residuals of the data against the standard curve, using the code below:

plt.plot(data_conc, data_conc * eps - data_absor, linestyle='', marker='o', label='Residuals')
plt.hlines(0, -1e-5, 6e-5, colors='red', linestyles='--', label='Zero Error')  # This creates a horizontal line at y=0 from -1e-5 to 6e-5
# Similarly, you should expect a function `plt.vlines(x, ymin, ymax, ...)` to create a vertical line at x=x from ymin to ymax
# Although we don't need it here:)

plt.title('Residuals of Van Nya\'s data')
plt.legend()

plt.xlabel(r'Concentration ($\mathrm{M}$)')
plt.ylabel(r'$\Delta A$')

plt.axis([0, 5e-5, -1.5, 1.5])

plt.grid(linestyle='--')

plt.xticks(...)  # You should know how to do this! Use the same ticks as before

plt.show()

Do you see a clear pattern in the residuals? That means that the data has systematic errors. A good measurement should have residuals randomly distributed around zero.

Another student has measured the same data using the same set of sample solutions (so you can use data_conc as your $x$ -axis data), and the results are stored in the array data_absor_2. Plot the data as a scatter plot along with the standard curve.

data_absor_2 = np.array([0.15, 0.35, 0.65, 1.25, 2.30])

Check the accuracy of the measurement by plotting the residuals.

You may notice that at $3.752 \times 10^{-5}\,\mathrm{M}$ , the absorbance deviates a little bit from the standard curve. This is because we actually reached the limit of linearity (LoL) of Beer–Lambert’s law, at around $A = 1$ . Beyond this point, the absorbance is no longer linear but will gradually reach a plateau (why?). Nonetheless, besides this point as an outlier, other points are still close to the standard curve, as we have also seen in the residual plot—a close, but random distribution.

Multiple Subplots in a Single Figure¶

Sometimes we may want to plot multiple subplots in a single figure. For example, we may want to plot the relationship between concentration and time for zeroth, first, and second order reactions to make better comparisons. However, these plots use different $y$ scales, so we cannot put them in a single overlay. To make multiple subplots, we can use the subplot function.

First, we need to generate a figure using plt.figure(). Then, we can use plt.subplot(rows, columns, plot_number) to create subplots. The first argument rows specifies the number of rows in the figure, and the second argument columns specifies the number of columns. The third argument plot_number specifies the plot number in the figure, which starts from 1. This numbering always follows left-to-right, then top-to-down. Let’s make a simple demonstration:

A0 = 1.0          # initial concentration (M)
k0 = 0.02         # zeroth-order rate constant (M·s^-1)
k1 = 0.10         # first-order rate constant (s^-1)
k2 = 0.20         # second-order rate constant (M^-1·s^-1)
t  = np.linspace(0, 40, 100)  # time (s)

A_zeroth = A0 - k0*t
A_first  = A0 * np.exp(-k1*t)
A_second = A0 / (1 + k2*A0*t)

plt.figure(figsize=(12, 4))

# Zeroth order
plt.subplot(1, 3, 1)
plt.plot(t, A_zeroth, label='data')
plt.xlabel('Time (s)')
plt.ylabel(r'$[A]$ (M)')
plt.title(r'Zeroth Order: $[A] = [A]_0 - kt$')

# First order
plt.subplot(1, 3, 2)
plt.plot(t, np.log(A_first), label='data')
plt.xlabel('Time (s)')
plt.ylabel(r'$\ln [A]$')
plt.title(r'First Order: $\ln [A] = \ln [A]_0 - kt$')

# Second order
plt.subplot(1, 3, 3)
plt.plot(t, 1/A_second, label='data')
plt.xlabel('Time (s)')
plt.ylabel(r'$1/[A]$ (M$^{-1}$)')
plt.title(r'Second Order: $1/[A] = 1/[A]_0 + kt$')

plt.tight_layout() # This function automatically adjusts the spacing between subplots. You should use it when making multiple subplots!
plt.show()

What if you want to make a comparison between non-modified and modified first and second order reactions? We can make four subplots in a single figure:

plt.figure(figsize=(8, 8))

# Non-modified first order
plt.subplot(2, 2, 1)
plt.plot(t, A_first)
plt.xlabel('Time (s)')
plt.ylabel(r'$[A]$ (M)')
plt.title('Non-modified First Order Kinetics')

# Non-modified second order
plt.subplot(2, 2, 2)
plt.plot(t, A_second)
plt.xlabel('Time (s)')
plt.ylabel(r'$[A]$ (M)')
plt.title('Non-modified Second Order Kinetics')

# Modified first order
plt.subplot(2, 2, 3)
plt.plot(t, np.log(A_first))
plt.xlabel('Time (s)')
plt.ylabel(r'$\ln [A]$ ($\ln \mathrm{M}$)')
plt.title('Modified First Order Kinetics')

# Modified second order
plt.subplot(2, 2, 4)
plt.plot(t, 1/A_second)
plt.xlabel('Time (s)')
plt.ylabel(r'$1/[A]$ ($\mathrm{M}^{-1}$)')
plt.title('Modified Second Order Kinetics')

plt.tight_layout()
plt.show()

Here’s another example of using multiple subplots in one figure. We will plot the IR spectrum of a 10% acetic acid solution in organic solvents (data from the NIST Chemistry WebBook). The goal is to show the full spectrum across the top and two zoomed-in regions on the bottom. Note that the top plot spans both columns, while the two bottom plots each occupy one column.

# Load data with numpy but skip the header row containing column names
data = np.loadtxt('./assets/acetic_acid.txt', skiprows=1)

# First column = Wavelength, next five columns = trials
wavenumber = data[:, 0]
trials = data[:, 1:6]
# The code above is used to load the data from the file - we will introduce this later

# Average across the five trials
avg = np.mean(trials, axis=1)

plt.figure(figsize=(12, 8))

# Top: full spectrum
plt.subplot(2, 2, (1, 2))
plt.plot(wavenumber, avg, color='black')
plt.xlabel('Wavenumber (cm$^{-1}$)')
plt.ylabel('Transmittance')
plt.title('Full IR Spectrum of Acetic Acid')
plt.gca().invert_xaxis()  # Conventional IR plotting (high to low wavenumber)
plt.grid(linestyle=':')

# Bottom left: O–H stretching region (3600–2200 cm⁻¹)
plt.subplot(2, 2, 3)
plt.plot(wavenumber, avg, color='blue')
plt.xlim(3600, 2200)  # Do not need to invert axis, since we have already set an inverted xlim
plt.xlabel('Wavenumber (cm$^{-1}$)')
plt.ylabel('Transmittance')
plt.title('O–H Stretching Region (3600–2200 cm$^{-1}$)')
plt.grid(linestyle=':')

# Bottom right: C=O stretching region (1900–1500 cm⁻¹)
plt.subplot(2, 2, 4)
plt.plot(wavenumber, avg, color='red')
plt.xlim(1900, 1500)  # Do not need to invert axis, since we have already set an inverted xlim
plt.xlabel('Wavenumber (cm$^{-1}$)')
plt.ylabel('Transmittance')
plt.title('C=O Stretching Region (1900–1500 cm$^{-1}$)')
plt.grid(linestyle=':')

plt.tight_layout()
plt.show()

Making Outputs¶

Finally! Finaaaaaaly!!! This is the last section of this lesson (though you still have an end-of-lesson problem~). The last thing after we made a plot is to save it. This can be done by using the savefig function. The first argument is the path of the file, and you can customize the file format, transparency, resolution, and more. Specifically:

my_x = np.linspace(0, 10, 100)
my_y = my_x ** 2

fig = plt.figure(figsize=(5, 4))  # Here we assign the entire figure to a variable, which can be used later

plt.plot(my_x, my_y)
plt.title('A Naïve Quadratic Function')
plt.xlabel('$x$')
plt.ylabel('$y = x^2$')

plt.savefig('./output/Naïve.png', transparent=True, format='png', dpi=300)

There are a lot of formats that you can use, including .png, .jpg, .pdf, .svg, and .eps. Try to output a PDF file:

fig.savefig('./output/Naïve2.pdf', transparent=True, format='pdf')

See? If we want to save a figure as multiple files, we can assign the entire figure to a variable (for example, fig), then use the fig.savefig() function multiple times.

End-of-Lesson Problem: Arrhenius Kinetics¶

In this problem, you need to design a clean, publication-ready figure using concepts from this lesson. Be aware that the units used in this problem may not be consistent—you should adjust them accordingly!

You want to run a simulation of a chemical reaction and test its kinetic parameters, obtaining the activation energy.

First, let’s generate our data. Create temperatures $T$ from $260–560\,\mathrm{K}$ (25 points, in a variable called temp). Choose $A = 2.0\times10^{12}\ \mathrm{s^{-1}}$ and $E_a = 48.0\ \mathrm{kJ\,mol^{-1}}$ . With $R = 8.314\ \mathrm{J\,mol^{-1}\,K^{-1}}$ , compute

k(T) = A\,\exp\!\left(-\frac{E_a}{RT}\right)

(12)

Add a random noise to the data of $k$ using the code:

np.random.seed(114514)  # fix random seed for reproducibility
noise = np.random.uniform(0.75, 1.25, size=temp.shape)

The variable noise is a random array of the same shape as temp, containing noise values. You need to multiply the $k$ -data by noise to obtain the modified $k$ -data.

Make a 1×2 figure. On the left, plot $k$ vs. $T$ with a logarithmic $y$ -scale. This should be a scatter plot. Add axis labels, a concise title, and a dotted grid. Use major ticks every $50\,\mathrm{K}$ on $T$ and minor ticks in between. Place a legend in a corner that doesn’t cover data. Hide the top and right spines, slightly thicken the left and bottom spines, and keep tick labels clear.

For the right subplot, plot $\ln k$ vs. $1000/T$ , also a scatter plot. Here you are going to do linearization:

\ln k = - \frac{E_a}{RT} + \ln A

(13)

Set readable $x$ -ticks spanning the full $1000/T$ range and label the axis. Add a dotted grid. Also do the same thing for the spines as in the left subplot.

Choose two clearly separated points $(x_1, y_1)$ and $(x_2, y_2)$ from your $\ln k$ vs. $1000/T$ plot, where $x = 1000/T$ and $y = \ln k$ . Compute the slope $m = \frac{y_2 - y_1}{x_2 - x_1}$ . Convert this to an activation energy using $E_a \approx -mR \times 1000$ (the factor 1000 accounts for using $1000/T$ ). Report $E_a$ in $\mathrm{kJ\,mol^{-1}}$ . Print the two points you chosen and the estimated $E_a$ below your figure (just a simple text using print()). Plot a dashed line between these two chosen points to indicate the trend (this is not a regression, but just a simple line).

On the semilog $k$ vs. $T$ panel (left), mark the lowest-temperature data point with a distinct marker.

Give the figure a balanced look: consistent fonts on labels, matching line widths, unobtrusive grid, etc. Call plt.tight_layout() and save a transparent 300 dpi PNG and a PDF. In one sentence, state when you would prefer PNG vs. PDF for this type of figure.

There are a lot of tasks in this problem. You may need to add more code cells—don’t panic! Try to break down into small parts and solve them one by one~

Matplotlib Official Website
Charles J. Weiss’s Scientific Computing for Chemists with Python
An Introduction to Python for Chemistry
GenAI for making paragraphs and codes(・ω< )★
And so many resources on Reddit, StackExchange, etc.!

Acknowledgments¶

This lesson draws on ideas from the following sources:

References¶

Mamedov, B. A., & Somuncu, E. (2014). Analytical treatment of second virial coefficient over Lennard-Jones (2n−n) potential and its application to molecular systems. Journal of Molecular Structure, 1068, 164–169. 10.1016/j.molstruc.2014.04.006
Katzenberg, A., Raman, A., Schnabel, N. L., Quispe, A. L., Silverman, A. I., & Modestino, M. A. (2020). Photocatalytic hydrogels for removal of organic contaminants from aqueous solution in continuous flow reactors. Reaction Chemistry & Engineering, 5(2), 377–386. 10.1039/c9re00456d