Lists are:

  • Ordered collections of arbitrary objects
  • Accessed by offset
  • Variable-length, heterogeneous, and arbitrarily nestable
  • Of the category “mutable sequence”
  • Arrays of object references

List is a sequence

Like a string, a list is a sequence of values. In a string, the values are characters; in a list, they can be any type. The values in a list are called elements or sometimes items.

To create a new list:

[10, 20, 30, 40]
['New England Patriots', 'Buffalo Bills', 'Miami Dolphins', 'New York Jets']
# A list within another list is nested. 
['spam', 2.0, 5, [10, 20]] 
AFC_east = ['New England Patriots', 'Buffalo Bills', 'Miami Dolphins', 'New York Jets']
numbers = [42, 123]
empty = []
print(AFC_east, numbers, empty)

Lists are mutable

The syntax for accessing the elements of a list is the same as for accessing the characters of a string—the bracket operator.

AFC_east[3] = 'New York Giants'
print(AFC_east)

If you still remember, strings are immutable.

List indices work the same way as string indices:

  • Any integer expression can be used as an index.
  • If you try to read or write an element that does not exist, you get an IndexError.
  • If an index has a negative value, it counts backward from the end of the list.

The in operator also works on lists.

AFC_east = ['New England Patriots', 'Buffalo Bills', 'Miami Dolphins', 'New York Jets']
print('Buffalo Bills' in AFC_east)

Exercise 01

What is the index of 'Apple'? 'Lisa'? 'On Rail'?

L = [
    ['Apple', 'Google', 'Microsoft'],
    ['Java', 'Python', ['Ruby', 'On Rail'], 'PHP'],
    ['Adam', 'Bart', 'Lisa']    
]

Traversing a list

for team in AFC_east:
    print(team)
numbers = [2, 0, 1, 6, 9]

for i in range(len(numbers)):
    numbers[i] = numbers[i] * 2
    
print(numbers)

Q. What is the length of the following list?

my_list = ['spam', 1, ['New England Patriots', \
                       'Buffalo Bills', 'Miami Dolphins', \
                       'New York Giants'], \
           [1, 2, 3]]
print(len(my_list))

List operations

a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)
[0] * 4
['Tom Brady', 'Bill Belichick'] * 3

List slices

The slice operator is the same as the one for strings.

t = ['a', 'b', 'c', 'd', 'e', 'f']

Q. How do we get ['b', 'c']? ['a', 'b', 'c', 'd']? ['d', 'e', 'f'] ? The entire list?

t = ['a', 'b', 'c', 'd', 'e', 'f']
t[1:3] = ['x', 'y']
print(t)

List methods

Exercise 02

Read the documentation of the list methods. You might want to experiment with some of them to make sure you understand how they work. append, extend and sort are particularly useful.

t = ['a', 'b', 'c']
t.append('d')

Exercise 3

Finish list_exercises.py. (You can find the Python file in resources/code on GitHub.)

map, reduce and filter

Sometimes you want to traverse one list while building another. For example, the following function takes a list of strings and returns a new list that contains capitalized strings:

def capitalize_all(t):
    res = []
    for s in t:
        res.append(s.capitalize())
    return res

An operation like capitalize_all is sometimes called a map because it “maps” a function (in this case the method capitalize) onto each of the elements in a sequence.

t = [1, 2, 3]
print(sum(t))

An operation like sum that combines a sequence of elements into a single value is sometimes called reduce.

Another common operation is to select some of the elements from a list and return a sublist. For example, the following function, only_upper, takes a list of strings and returns a list that contains only the uppercase strings:

def only_upper(t):
    res = []
    for s in t:
        if s.isupper():
            res.append(s)
    return res

isupper is a string method that returns True if the string contains only upper case letters.

An operation like only_upper is called a filter because it selects some of the elements and filters out the others.

Deleting elements

t = ['a', 'b', 'c', 'd']
x = t.pop(1)
# pop modifies the list and returns 
# the element that was removed. 
print(x)
print(t)
x = t.pop()
# If you don’t provide an index, 
# it deletes and returns the last element.
print(x)
print(t)

Another way to remove an item or items from a list: del

https://docs.python.org/3.9/tutorial/datastructures.html#the-del-statement

t = ['a', 'b', 'c', 'd', 'e']
del t[1:3]
print(t)

If you know the element you want to remove (but not the index), you can use remove:

t = ['a', 'b', 'c']
t.remove('b')
print(t)

Lists and strings

To convert from a string to a list of characters, you can use list:

team = 'Patriots'
t = list(team)
print(t)

Because list is the name of a built-in function, you should avoid using it as a variable name. I also avoid l because it looks too much like 1.

If you want to break a string into words, you can use the split method:

team = 'New England Patriots'
t = team.split()
print(t)

An optional argument called a delimiter specifies which characters to use as word boundaries. The following example uses a hyphen as a delimiter:

s = 'spam-spam-spam'
delimiter = '-'
t = s.split(delimiter)
print(t)

join is the inverse of split.

t = ['New', 'England', 'Patriots']
team = ' '.join(t)
print(team)

Q. How do we get 'New!!England!!Patriots'?

Objects and values

a = 'banana'
b = 'banana'

How do we know whether they refer to the same string?

a is b

Let's checkout the following two lists.

a = [1, 2, 3]
b = [1, 2, 3]
a is b

In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object.

If a refers to an object and you assign b = a, then both variables refer to the same object:

a = [1, 2, 3]
b = a
b is a

The association of a variable with an object is called a reference. In this example, there are two references to the same object.

An object with more than one reference has more than one name, so we say that the object is aliased.

b[0] = 'something else'
print(a)

Common mistakes when dealing with lists

1, Most list methods modify the argument and return None.
t = t.sort()           # WRONG!
2, Pick an idiom and stick with it.

Part of the problem with lists is that there are too many ways to do things. For example, to remove an element from a list, you can use pop, remove, del, or even a slice assignment.

To add an element, you can use the append method or the + operator. Assuming that t is a list and x is a list element, these are correct:

t.append(x)
t = t + [x]
t += [x]

t.append([x])          # WRONG!
t = t.append(x)        # WRONG!
t + [x]                # WRONG!
t = t + x              # WRONG!
3, Make copies to avoid aliasing.

If you want to use a method like sort that modifies the argument, but you need to keep the original list as well, you can make a copy.

t = [3, 1, 2]
t2 = t[:]
t2.sort()
print(t)
print(t2)

Exercise 04

1. Finish binary_search.py - use bisection search to find out the index of a given number. Please refer to slides and previous code about bisection search. Below is a summary of the idea of bisection search

Because the words are in alphabetical order, we can speed things up with a bisection search (also known as binary search), which is similar to what you do when you look a word up in the dictionary. You start in the middle and check to see whether the word you are looking for comes before the word in the middle of the list. If so, then you search the first half of the list the same way. Otherwise you search the second half.

2. (Optional) Two words “interlock” if taking alternating letters from each forms a new word. For example, “shoe” and “cold” interlock to form “schooled.” Finish interlock.py. (You may need to finish inlist.py first.)

  1. Write a program that finds all pairs of words that interlock. Hint:don’t enumerate all pairs!2. Can you find any words that are three-way interlocked; that is, every third letter forms a word, starting from the first, second or third?