Lesson 1-extra: More about Python Programming

Instructor: Yuki Oyama, Prprnya

The Christian F. Weichman Department of Chemistry, Lastoria Royal College of Science

This material is licensed under CC BY-NC-SA 4.0

Congratulations! You have finished the first lesson of the course! Now you are ready to dig deeper into Python. This lesson will introduce you to some advanced topics in Python. For now, let’s get started!

Reading Documentation Is Important!

As an advanced Python learner, you should make reading documentation a habit. Python provides a comprehensive documentation website that covers most of the details of this language. This website contains documentation for different versions of Python, so make sure to select the version you are using. You can also find documentation for specific modules and libraries that you may be using in your projects.

Builtin Collection Types

Collections are data structures that store multiple items. Python provides a variety of collection types, including list, tuple, str, range, set, and dict.

Tuple

Tuple is an immutable builtin collection type in Python, where “immutable” means that its elements cannot be changed once created. In Python code, you can use parentheses (()) to create a tuple object. For example, the following code creates a tuple of integers from 1 to 5:

t1 = (1, 2, 3, 4, 5)
t1

Even if we do not surround our values with parentheses, Python will still pack them into a tuple automatically. For example, the following code creates a tuple of integers from 6 to 10:

t2 = 6, 7, 8, 9, 10
t2

Use the builtin function type() to check the types of t1 and t2:

type(t1), type(t2)

Since tuple is immutable, we are not able to modify an existing tuple in place. Instead, we can create a new tuple by unpacking two existing tuples into a new tuple using the * syntax. For example, the following code creates a new tuple t by unpacking t1 and t2:

t = (*t1, *t2)
t
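A minimal sketch of what "immutable" means in practice, redefining t1 so that the snippet runs on its own. Item assignment on a tuple raises a TypeError, while concatenation with + builds a new tuple and leaves the original untouched. The names modified and t3 are introduced here only for illustration, and the try/except construct is used merely to keep the snippet from stopping at the error.

```python
t1 = (1, 2, 3, 4, 5)

# Tuples do not support item assignment, so this raises a TypeError.
try:
    t1[0] = 100
    modified = True
except TypeError:
    modified = False

# Concatenation creates a brand-new tuple; t1 itself is unchanged.
t3 = t1 + (6,)

print(modified)
print(t1)
print(t3)
```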

To get the length of a tuple, you can use the builtin len() function:

len(t)

As a sequence, tuples can be indexed by integers. The index of collections, including tuple, starts from 0 in Python, which means that the first element has index 0. For example, the following code extracts the second element of t:

t[1]

When the index is negative, it counts from the end of the tuple. In this case, -1 refers to the last element of the tuple, -2 refers to the second-to-last element, and so on. For example, the following code extracts the third-to-last element of t:

t[-3]

Notice that the negative index -i is equivalent to len(t) - i. The previous example is equivalent to the following code:

t[len(t) - 3]

Tuples can also be sliced. The general syntax of slicing is start:stop or start:stop:step, where start is the index of the first element to include, stop is the index of the first element to exclude, and step is the step size. If step is omitted, it defaults to 1. For a positive step, start defaults to 0 and stop defaults to the length of the tuple when omitted; for a negative step, the slice instead runs from the last element back through the first. For example, the following code creates a slice of t from index 2 to 5 (exclusive):

t[2:5]

Here is another example that creates a slice of t from index 1 to 7 (exclusive) with a step size of 2:

t[1:7:2]

In particular, the following code gives us a reversed version of t:

t[::-1]
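The default values described above are easier to see when start or stop is actually left out. The following sketch redefines t so that it runs on its own; the names head, tail, evens, and clipped are illustrative only.

```python
t = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

head = t[:3]        # start omitted: begins at index 0
tail = t[7:]        # stop omitted: runs to the end of the tuple
evens = t[1::2]     # every second element, starting at index 1
clipped = t[8:100]  # an out-of-range stop is clipped instead of raising an error

print(head, tail, evens, clipped)
```

Unlike indexing with a single out-of-range integer, slicing past the end does not raise an IndexError.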

Sometimes we need to extract some elements from a tuple and assign them to some variables. This operation can be done by the following syntax:

one, _, three, four, _ = t1
one, three, four

Expanding the values of t1 (in the form without parentheses), we can see what happens in the previous example. It is actually equivalent to one, _, three, four, _ = 1, 2, 3, 4, 5, which assigns multiple values to multiple variables. The _ names that appear in the assignment are conventionally used to discard values, even though _ itself is a valid variable name.

Combining the syntax of unpacking and multiple assignment, we can even extract a specific number of elements from any tuple. For example, the following code extracts the first two elements of t2 and drops the rest:

six, seven, *_ = t2
six, seven

Sometimes we want to keep the unpacked elements extracted from a tuple. To do this, give the unpacked elements a new name. For example, the following code divides t into three parts—the first element, the middle elements, and the last element:

first, *middle, last = t
first, middle, last

See? The value of middle is surrounded by brackets ([]), which belongs to a new type that we will introduce next.

List

List is a mutable sequence of objects, where “mutable” means that its elements can be changed. In Python code, you can use brackets ([]) to create a list object. For example, the following code creates a list of integers from 1 to 5:

l1 = [1, 2, 3, 4, 5]
l1

List can also be created by unpacking another collection in brackets. For example, the following code creates a list of integers from the tuple t2:

l2 = [*t2]
l2

Check the type of l1 and l2:

type(l1), type(l2)

Since list is also a collection type, most of the operations that can be applied to a tuple can also be applied to a list. Here are some examples:

l = [*l1, *l2]
l
l[1], l[-3], l[2:5], l[1:7:2], l[::-1]
one, _, three, four, _ = l1
one, three, four
six, seven, *_ = l2
six, seven
first, *middle, last = l
first, middle, last

As a mutable sequence, list has a bunch of methods (which are functions bound to a specific type) that can be used to manipulate its elements. For example, the append() method appends an element to the end of l:

l.append(11)
l

Here, append() is a method of the list type, which must be called on an instance of list. The pop() method removes and returns the element at a given index, defaulting to the last element if the index is not specified:

l.pop()

Check the value of l after popping:

l

This time, we specify the index of the element to be removed:

l.pop(3)

Check values of l again:

l

Correspondingly, the insert() method inserts an element at a specified position, and is called as insert(index, element):

l.insert(3, 4)
l

List also supports more operations of mutable sequence types, which are listed in the document just linked to.
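As a rough illustration of a few of those mutable-sequence operations, the sketch below uses a small standalone list (not the l built earlier) so that each step is easy to follow.

```python
l = [3, 1, 2]

l[0] = 30          # replace an element in place
l.extend([5, 4])   # append every element of another collection
l.remove(1)        # remove the first occurrence of the value 1
del l[0]           # delete the element at index 0
l.sort()           # sort in place (the method itself returns None)

print(l)
```

All of these modify the same list object rather than creating a new one, which is exactly what a tuple does not allow.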

String

String is an immutable sequence of characters with a bunch of unique methods to handle string-related operations. In Python code, you can use single quotes (') or double quotes (") to create a str object. For example:

sport = 'snooker'
score = "147"

sport, score

You may notice that no matter which quote you use to create a string, Python always represents it using single quotes by default. Now check the type of these two strings:

type(sport), type(score)

As a sequence type, the uniqueness of str is that its elements also belong to the str type. Actually, Python does not have a builtin character type. Check the characters extracted from the previous example:

type(sport[1]), type(score[-3])

To create a string with multiple lines, you can use triple quotes (''' or """). For example:

ms1 = '''Multi-line string with
three single quotes.'''
ms2 = """Multi-line string with
three double quotes."""

ms1, ms2

The \n in the output is the escape sequence of line break in Python, where \ is the escape character. To display a multi-line string properly, you can print it out:

print(ms1)
print(ms2)

Python also supports other escape sequences. For example, when we want to use a single quote in a single-quoted string (or a double quote in a double-quoted string), we can use the escape sequence \ followed by the quote character:

print('I\'m a single quote.')
print("He says, \"I'm a double quote.\"")

When we want to display a backslash in a string, we can use the escape sequence \\ to escape the backslash itself. For example, the Python string converted from the LaTeX code of $\frac{1}{2}$ should be like this:

print('\\frac{1}{2}')

However, too many escape sequences can make the code hard to read. To avoid this, you can use the raw string syntax, which is defined by the prefix r before the quote character. For example, $e^{i\omega t} = \cos{\omega t} + i\sin{\omega t}$ can be represented as:

print(r'e^{i\omega t} = \cos{\omega t} + i\sin{\omega t}')

There are many unique methods of str type, which are listed in the document. For example, the upper() method converts all characters in a string to uppercase:

sport.upper()

Since str is immutable, these methods return a new string instead of modifying the original string. You can check the original string:

sport

Now let’s turn our attention to the most powerful feature of strings—formatting. For example, let’s format a string to describe the maximum score of snooker:

f'The maximum score of {sport} is {score}'

To display braces ({}) in an f-string, we can use the unique escape sequences {{ and }}. Using the LaTeX code of $\hat{H}_{ij} = E(i^2 + j^2)$ as an example, we have

H = r'\hat{H}'
E = 11.4514

print(f'{H}_{{ij}} = {E:.2f}(i^2 + j^2)')
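The part after the colon in {E:.2f} is a format specifier. A few other commonly used specifiers are sketched below; the variable names are illustrative, and the {n=} form assumes Python 3.8 or later.

```python
x = 3.14159
n = 42

fixed = f'{x:.2f}'    # fixed-point with two decimal places
sci = f'{x:.3e}'      # scientific notation with three decimal places
padded = f'{n:>6}'    # right-aligned in a field of width 6
zeros = f'{n:05d}'    # zero-padded integer of width 5
debug = f'{n=}'       # "name=value" form, handy for quick checks

print(fixed, sci, padded, zeros, debug)
```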

Range

Range is a collection of integers defined like a slice: range(start, stop, step) creates a range of integers from start up to, but not including, stop with a step of step, like the indices of the slice start:stop:step. For example:

range(1, 10, 3)

However, unlike other collection types, we cannot see the elements of a range directly. One of the ways to get the elements of a range is to dump it into a list:

[*range(1, 10, 3)]

If we want to create a nonempty range with a negative step, we should make sure that start is larger than stop:

[*range(10, 1, -3)]
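Even though a range does not display its elements, it still behaves like a sequence. The sketch below (with illustrative variable names) shows that length, indexing, and membership tests work without first converting the range to a list.

```python
r = range(1, 10, 3)   # represents 1, 4, 7

length = len(r)       # number of elements
second = r[1]         # indexing works as on a tuple
has_seven = 7 in r    # membership test
has_eight = 8 in r
as_tuple = tuple(r)   # materialize the elements when needed

print(length, second, has_seven, has_eight, as_tuple)
```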

Set

Set is an unordered collection of unique elements. In Python code, you can use braces ({}) to create a set object. For example, the following code creates a set of powers of prime numbers within 20:

s1 = {2**0, 2**1, 2**2, 2**3, 2**4, 3**0, 3**1, 3**2, 5**0, 5**1}
s1

Note that the set will merge duplicate elements such as 2**0, 3**0, and 5**0. Unpacking another collection into a set is also supported:

s2 = {*t2}
s2

The set type provides methods for set operations, including union(), intersection(), and difference():

s1.union(s2)
s1.intersection(s2)
s1.difference(s2)
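The same three operations can also be written with the operators |, &, and -. The sketch below writes out the contents of s1 and s2 explicitly so that it runs on its own; the result names are illustrative.

```python
s1 = {1, 2, 3, 4, 5, 8, 9, 16}   # the prime powers from the earlier example
s2 = {6, 7, 8, 9, 10}

union = s1 | s2          # same as s1.union(s2)
common = s1 & s2         # same as s1.intersection(s2)
only_in_s1 = s1 - s2     # same as s1.difference(s2)

print(union, common, only_in_s1)
```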

Dictionary

Dictionary is a mutable collection of key-value pairs. In Python code, you can use braces ({}) to create a dict object. Different from set, the elements of a dictionary are in the form of key: value. For example, the following code creates a dictionary of students’ names and grades:

d = {'Alice': 95, 'Bob': 85, 'Charlie': 75}
d

To access the value of a key, you can use the indexing syntax dict[key]:

d['Alice'], d['Bob']

Since dict is mutable, you can also modify the value of a key by the assignment syntax:

d['Alice'] = 100
d

Adding a new key-value pair to a dictionary also uses the assignment syntax:

d['David'] = 65
d
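A few other everyday dictionary operations, sketched on a standalone copy of d. The key 'Eve' is a hypothetical student who is deliberately absent from the dictionary.

```python
d = {'Alice': 100, 'Bob': 85, 'Charlie': 75, 'David': 65}

has_bob = 'Bob' in d            # membership tests look at keys
eve = d.get('Eve')              # get() returns None instead of raising KeyError
eve_default = d.get('Eve', 0)   # ...or a fallback value if one is given
removed = d.pop('David')        # remove a key and return its value
names = list(d.keys())
grades = list(d.values())

print(has_bob, eve, eve_default, removed, names, grades)
```

Indexing a missing key directly, as in d['Eve'], raises a KeyError, which is why get() is often preferred when a key may be absent.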

Control Flows

Control flows are the basic building blocks of programming. They help you to organize your code and make it more readable. The most common control flows are if, while, and for statements.

Conditional Statements (if)

The if statement is used to execute a block of code if a certain condition is met. This statement uses the if keyword, followed by an expression of type bool as the condition. For example, the following code prints n is even if the integer n is divisible by 2:

n = ...

if n % 2 == 0:
    print('n is even')

Replace ... with an integer value and run the code to see the result.

Sometimes we want to execute different blocks of code depending on whether the condition is met. In this case, we can use the if-else statement. For example, in the previous example, if we want to print out the case where n is odd, we can add the else statement as follows:

n = ...

if n % 2 == 0:
    print('n is even')
else:
    print('n is odd')

In more complex cases, we need to provide different blocks of code for multiple conditions. The elif statement is used for this purpose, which can be considered as an abbreviation of else if in other programming languages. For example, the following code prints whether n is positive, negative, or zero:

n = ...

if n > 0:
    print('n is positive')
elif n < 0:
    print('n is negative')
else:
    print('n is zero')

Loops (while)

The while statement is used to execute a block of code repeatedly as long as a certain condition (which is also of type bool) is met. For example, the following code shows how to use the Euclidean algorithm to find the greatest common divisor of two integers:

m = ...
n = ...
print(f'The greatest common divisor of {m} and {n} is', end=' ')

while m % n != 0:
    m, n = n, m % n

print(n)
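To see how the loop progresses, here is the same algorithm run with the concrete values 48 and 18 (chosen only for illustration), recording each remainder step in a list named steps.

```python
m, n = 48, 18
steps = []

while m % n != 0:
    steps.append((m, n, m % n))   # record the step before updating m and n
    m, n = n, m % n

for a, b, r in steps:
    print(f'{a} % {b} = {r}')
print(f'gcd = {n}')
```

The loop stops as soon as the remainder is zero, at which point n holds the greatest common divisor.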

Iterations (for)

The for statement is used to iterate over a collection of items. For example, the following code prints the elements in our tuple t1:

for i in t1:
    print(i)

To access the index of an item in a collection, you can use the built-in function enumerate(). Using our tuple t2 as an example:

for i, x in enumerate(t2):
    print(f't2[{i}] = {x}')

To iterate over several collections in parallel, you can use the built-in function zip(). Using our sets s1 and s2 as examples (since sets are unordered, the pairing of elements is not guaranteed):

for x, y in zip(s1, s2):
    print(f'{x=}, {y=}')
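zip() pairs elements position by position, which is easiest to see with ordered sequences. In the sketch below, the lists names and grades are illustrative; grades has one extra element on purpose, to show that zip() stops at the end of the shortest input.

```python
names = ['Alice', 'Bob', 'Charlie']
grades = (95, 85, 75, 65)   # one more element than names

pairs = list(zip(names, grades))

for name, grade in pairs:
    print(f'{name}: {grade}')
```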

To iterate over a dictionary, you can use the built-in method items() of dict type. Using our dictionary d as an example:

for name, grade in d.items():
    print(f'The grade of {name} is {grade}.')

A Deep Dive into Functions

Functions in Python are reusable snippets of code that take some parameters as input and return some results as output. They are defined using the def keyword, followed by a function name, a list of parameters, and a block of code.

Parameters and Arguments

When we define a function, we can specify the inputs of the function using parameters, which are the named entities between parentheses. When we call a function, we can pass in some values to these parameters, which are called arguments. To see the difference between parameters and arguments, check this FAQ. Usually, the order of arguments when calling a function is the same as the order of parameters when defining the function. This is called a positional argument. For example, the following code defines a function power() that takes base as the base and exp as the exponent, and returns the result of base ** exp:

def power(base, exp):
    return base ** exp

power(2, 3)

However, we can also pass arguments in any order by using keyword arguments, which have the form param=arg. Using the previous example, we can call the function as follows:

power(exp=2, base=3)

Moreover, we can even mix positional and keyword arguments when calling a function. The only thing to note is that positional arguments must appear before keyword arguments. For example:

power(2, exp=4)

The special parameter / forces every parameter before it to be positional-only. For example, inserting / between base and exp in the previous example will make base a positional-only parameter:

def power(base, /, exp):
    return base ** exp

power(2, 5), power(4, exp=3)
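A quick check of what "positional-only" rules out: passing base by keyword is rejected with a TypeError. The flag accepted is illustrative, and the / syntax assumes Python 3.8 or later.

```python
def power(base, /, exp):
    return base ** exp

# base is positional-only, so naming it in the call is an error.
try:
    power(base=2, exp=5)
    accepted = True
except TypeError:
    accepted = False

print(accepted)
print(power(2, exp=5))   # exp may still be passed by keyword
```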

If we want to capture a bunch of arguments in one place, we can use the * operator for positional arguments, and the ** operator for keyword arguments, to pack them into a collection. For example, the following code defines a function print_args that prints out the values of all positional arguments and keyword arguments:

def print_args(*args, **kwargs):
    print(f'Positional arguments {type(args)}: {args}')
    print(f'Keyword arguments {type(kwargs)}: {kwargs}')

print_args(1, 2, 3, a=4, b=5)

Notice that the type of args is tuple, and the type of kwargs is dict. Therefore, we can index, slice, and unpack the captured arguments as we have learned. Since the parameter *args captures all positional arguments, any parameter after it can only accept keyword arguments. For example, the following code defines a function sum_args() that sums up all the positional arguments from an initial value:

def sum_args(*args, init):
    result = init
    for arg in args:
        result += arg
    return result

sum_args(1, 2, 3, init=0)
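The sketch below illustrates why init can only be passed by keyword here: a fourth positional value is simply swallowed by *args, so init is still missing and the call fails with a TypeError. The flag ok is illustrative.

```python
def sum_args(*args, init):
    result = init
    for arg in args:
        result += arg
    return result

try:
    sum_args(1, 2, 3, 0)   # 0 is captured by *args, so init is never supplied
    ok = True
except TypeError:
    ok = False

print(ok)
print(sum_args(1, 2, 3, init=0))
```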

Default Values

Sometimes we want to make some parameters optional so that we can call the function with fewer arguments. To do this, we can specify default values for the parameters. For example, the following code makes the init parameter optional by giving it a default value of 0:

def sum_args(*args, init=0):
    result = init
    for arg in args:
        result += arg
    return result

sum_args(1, 2, 3), sum_args(1, 2, 3, init=10)

There is one thing to note here. Any optional positional parameter must be placed after all the positional parameters without default values. Using power() as an example, if we give the parameter exp a default value of 1, then the parameter base must be placed before it:

def power(base, exp=1):
    return base ** exp

power(7), power(7, exp=2)

Typing

The type system in Python is dynamic, which means that the type of a variable can change at any time. However, it is sometimes useful to specify the type of a variable when we define it. This is called a type hint. We just need to add a colon (:) after the variable name, followed by the type name, and a static type checker can then check the type of the variable (the Python interpreter itself does not enforce type hints at run time). For example:

a: int = 1
b: float = 2.0
c: str = 'hello'

a, b, c

When we define a function, we can not only specify the type of parameters, but also the return type (using the -> operator). For example, if we expect that our sum_args() function accepts only integers as input and returns integers as output, we can specify the type as follows:

def sum_args(*args: int, init: int = 0) -> int:
    result = init
    for arg in args:
        result += arg
    return result

sum_args(1, 2, 3), sum_args(1, 2, 3, init=10)
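Type hints are annotations rather than runtime checks. As a small demonstration, the annotated sum_args() below still accepts floats without complaint, and the hints themselves can be inspected through the function's __annotations__ attribute. The names total and hints are illustrative.

```python
def sum_args(*args: int, init: int = 0) -> int:
    result = init
    for arg in args:
        result += arg
    return result

total = sum_args(1.5, 2.5)          # floats are accepted despite the int hints
hints = sum_args.__annotations__    # the hints are stored on the function

print(total)
print(hints)
```

Catching such mismatches is the job of an external static type checker, not of the interpreter.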

Docstring

The docstring is a string that describes the purpose and usage of a function. It is placed immediately after the def line, as the first statement of the function body, and is typically a multi-line string surrounded by triple quotes (""" is preferred). For example, we can add a docstring to the sum_args() function as follows:

def sum_args(*args: int, init: int = 0):
    """
    Return the sum of all positional arguments from an initial value.
    If `init` is not specified, it defaults to `0`.
    Only accepts integer arguments.
    """
    result = init
    for arg in args:
        result += arg
    return result

sum_args(1, 2, 3), sum_args(1, 2, 3, init=10)

Docstrings are stored as the __doc__ attribute of the function object. You can display the docstring using the help() function:

help(sum_args)
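The docstring can also be read directly from the function object. The sketch below uses a one-line docstring for brevity and stores it in an illustrative variable doc.

```python
def sum_args(*args: int, init: int = 0) -> int:
    """Return the sum of all positional arguments from an initial value."""
    result = init
    for arg in args:
        result += arg
    return result

doc = sum_args.__doc__   # the raw docstring, without the extra help() formatting
print(doc)
```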

Decorators

Decorators are functions that take a function as input and return a function as output. They are used to modify the behavior of a function without changing its code. For example, the following code defines a decorator show_help() that prints out the docstring of the function it decorates:

def show_help(func):
    def wrapper(*args, **kwargs):
        help(func)
        return func(*args, **kwargs)
    return wrapper

show_help

We see that decorators are treated as normal functions in Python. To use a decorator, we can simply pass the function to be decorated as an argument to the decorator. For example, we can decorate the sum_args() function using show_help() as follows:

sum_args = show_help(sum_args)
sum_args(1, 2, 3)

It does show the docstring of sum_args(), but this form is not very convenient. The more common way to apply a decorator is to use the @ syntax when defining a function. For example, the following code wraps the Euclidean algorithm into a function euclidean_gcd(), and decorates it using show_help():

@show_help
def euclidean_gcd(m: int, n: int) -> int:
    """
    Calculate the greatest common divisor of two integers using Euclidean algorithm.
    For more details, see <https://en.wikipedia.org/wiki/Euclidean_algorithm>.
    """
    while m % n != 0:
        m, n = n, m % n
    return n

euclidean_gcd(18, 48)
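One side effect worth knowing about: after decoration, the name euclidean_gcd refers to the inner wrapper function, so attributes such as __name__ and __doc__ would describe wrapper instead. A common remedy is functools.wraps from the standard library, sketched below. The decorator show_name is a hypothetical stand-in for show_help() that prints a shorter message.

```python
import functools

def show_name(func):
    @functools.wraps(func)   # copy __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        print(f'Calling {func.__name__}()')
        return func(*args, **kwargs)
    return wrapper

@show_name
def euclidean_gcd(m: int, n: int) -> int:
    """Return the greatest common divisor of m and n."""
    while m % n != 0:
        m, n = n, m % n
    return n

result = euclidean_gcd(18, 48)
print(result)
print(euclidean_gcd.__name__)
```

The @show_name line is shorthand for writing euclidean_gcd = show_name(euclidean_gcd) after the definition.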

Recursion

Recursion is a technique to solve complex problems by breaking them down into simpler subproblems of the same type. Now let’s look at a concrete example: the Fibonacci sequence. The first two numbers in the sequence are 0 and 1, and each subsequent number is the sum of the two previous numbers. For example, the first ten numbers in the sequence are $0, 1, 1, 2, 3, 5, 8, 13, 21, 34$. Mathematically, we can define the Fibonacci sequence as follows:

$$
\begin{aligned}
F_0 &= 0 \\
F_1 &= 1 \\
F_n &= F_{n-1} + F_{n-2}
\end{aligned}
$$

The following function implements the Fibonacci sequence using recursion:

def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

fibonacci(10)

Note that a recursive function consists of two parts:

  • The base case: the termination condition of the recursion.

  • The recursive case: the process of solving the subproblem.

Using fibonacci() as an example, we can see that the base case is n <= 1. When this condition is met, the function terminates the recursion and returns a certain value. The recursive case is fibonacci(n - 1) + fibonacci(n - 2), which means that the function calls itself with n - 1 and n - 2 as arguments.
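To make the two parts visible, the sketch below instruments fibonacci() with a hypothetical counter dictionary, calls, that tallies how often each branch is taken.

```python
calls = {'base': 0, 'recursive': 0}

def fibonacci(n: int) -> int:
    if n <= 1:
        calls['base'] += 1        # base case: no further recursion
        return n
    calls['recursive'] += 1       # recursive case: two smaller subproblems
    return fibonacci(n - 1) + fibonacci(n - 2)

value = fibonacci(5)
print(value, calls)
```

Even for a small input, the function is called many more times than n, because the same subproblems are solved repeatedly.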

Exercise: The factorial of a positive integer $n$ is the product of all positive integers less than or equal to $n$. For example, the factorial of 5 is $5 \times 4 \times 3 \times 2 \times 1 = 120$. Mathematically, we can derive the recursive formula for the factorial as follows:

$$
\begin{aligned}
n! &= n \times (n - 1) \times (n - 2) \times \cdots \times 2 \times 1 \\
&= n \times (n - 1)!
\end{aligned}
$$

Additionally, we define the factorial of 0 to be 1, which is compatible with the recursive definition of the factorial. Write a function factorial() that takes a nonnegative integer n as input and returns the factorial of n, and then calculate the factorial of some n to verify your implementation.

Error Handling

Python has a built-in error handling mechanism. When an error occurs, the program will stop and print out the error message. The error message usually contains the following information:

  • The type of the error.

  • The line number where the error occurred.

  • The line of code where the error occurred.

  • A message describing details of the error.

For example, when we try to divide some numbers by zero, Python will raise a ZeroDivisionError:

1 / 0

When we try to access an index of a collection that is out of range, Python will raise an IndexError:

[0, 1, 2, 3, 4][5]

When we try to pass an argument of the wrong type to a function, Python will raise a TypeError:

range('1')

When we try to pass an argument of the right type but with an invalid value to a function, Python will raise a ValueError:

int('1.0')

When we try to access a variable that does not exist, Python will raise a NameError:

does_not_exist

When we try to access an attribute of an object that does not exist, Python will raise an AttributeError:

'does not exist'.does_not_exist

When we try to import a module that does not exist, Python will raise a ModuleNotFoundError:

import does_not_exist

When we try to open a file that does not exist, Python will raise a FileNotFoundError:

open('does_not_exist.txt')

When our code has invalid syntax, Python will raise a SyntaxError:

a =

The full list of built-in errors can be found here.
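The examples above only show errors being raised. As a minimal sketch of how a program can respond to an error instead of stopping, Python's try/except statement catches a specific error type and lets execution continue. The list results and the sample inputs are illustrative.

```python
results = []

for text in ['42', '1.0', 'abc']:
    try:
        results.append(int(text))          # may raise ValueError
    except ValueError as err:
        results.append(None)               # record the failure and keep going
        print(f'Could not convert {text!r}: {err}')

try:
    1 / 0
    caught = False
except ZeroDivisionError:
    caught = True

print(results, caught)
```

Catching a specific error type, rather than every possible error, keeps unrelated problems from being silently hidden.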

End-of-Lesson Problems

Problem 1: Time it!

Write a decorator that measures the execution time of a function. Your code should begin with the following lines:

import time

Now, define a function timeit() that takes a function as input and returns a function as output. The returned function should implement the logic in the following order:

  1. Record the start time using time.time(), which returns the current time in seconds since the epoch as a floating point number. More details can be found here.

  2. Call the input function.

  3. Record the end time using time.time().

  4. Calculate the execution time using the difference between the start and end time.

  5. Print out the execution time in a formatted string, where the time is in seconds with six decimal places.

  6. Return the result of the input function.

Define a test function sleep() that meets the following requirements:

  1. Use the timeit() decorator to time the execution of sleep().

  2. Take a single argument seconds and sleep for that many seconds by passing it to time.sleep().

  3. Complete typing (take a float, return None) and docstring.

  4. Specify a default value for seconds.

Problem 2: Sieve of Eratosthenes

Write a function sieve() that takes a single argument n and returns a set of all prime numbers up to n. You should use the Sieve of Eratosthenes to generate the result. You should decorate the function with timeit(). Your function sieve() should have complete typing and docstring.

Check your implementation by calling sieve(100).

Acknowledgments

This lesson draws on ideas from the following sources: