Lab 4

Note: This lab has more programming problems than we expect everyone to finish during the lab time. If you do not have time to finish all problems, you can continue working on them later (at home, if you have a computer with python set up, in the CSIT lab rooms outside teaching hours, or on one of the computers available across campus), or return to them in a later lab.

Objectives#

The purpose of this week’s lab is to:

  • understand the indexing of (1-dimensional) sequences;
  • do some computations over sequences that require iterating through them using loops; and
  • practice reading and debugging code.

Exercise 0: Reading and debugging code#

The following are attempts to define a function that takes three (numeric) arguments and checks if any one of them is equal to the sum of the other two. For example, any_one_is_sum(1, 3, 2) should return True (because 3 == 1 + 2), while any_one_is_sum(0, 1, 2) should return False.

(a) All of the functions below are incorrect. For each of them, find examples of arguments that cause it to return the wrong answer.

Function 1#

def any_one_is_sum(a,b,c):
    sum_c=a+b
    sum_b=a+c
    sum_a=b+c
    if sum_c == a+b:
        if sum_b == c+a:
            if sum_a == b+c:
                return True
    else:
        return False

Function 2#

def any_one_is_sum(a,b,c):
    if b + c == a:
        print(True)
    if c + b == a:
        print(True)
    else:
        print(False)
    return False

Function 3#

def any_one_is_sum(a, b, c):
    if a+b==c and a+c==b:
        return True
   else:
        return False

(b) For each of the three functions above, can you work out how they are intended to work? That is, what was the idea of the programmer who wrote them? What comments would be useful to add to explain the thinking? Is it possible to fix them by making only a small change to each function?

Exercise 1: Debugging loops#

The following are two attempts to define a function that computes a sum. (The sum is an approximation of the integral of the function 1/x over the interval from lower to upper. The bugs in each function below are unrelated to this particular function.) Each function uses a while loop that may never terminate. The loop may terminate for some arguments, but not for others.

For each of the functions, find arguments that cause the loop to terminate as well as arguments that cause it to not terminate. The arguments should all be numbers, lower should be less than upper, and nterms should be a positive integer (not zero).

Hint: Add print calls inside the loop to see what is happening. Print the variables that appear in the loop condition, so you can see if they are changing or not (if they are not, then the loop is stuck).

Function 1#

def integrate(lower, upper, nterms):
    # divide the interval into nterms even-sized parts
    delta = (upper - lower) / nterms
    total = 0
    while lower+delta >= upper:
        # compute area from lower to lower + delta
        area = ((1/lower) + (1/(lower + delta))) * delta / 2
	# add to total area
        total = total + area
        lower = lower + delta
    return total

Function 2#

def integrate(lower, upper, nterms):
    # divide the interval into nterms even-sized parts
    delta = (upper - lower) / nterms
    total = 0
    while lower < upper:
        # compute area from lower to lower + delta
        area = ((1/lower) + (1/(lower + delta))) * delta / 2
	# add to total area
        total = total + area
        delta = (upper - lower) / nterms
        lower = lower + delta
    return total

Sequence types#

We have already seen a number of times that all values in python have a type, such as int, float, str, etc. To determine the type of a value we can use the function type(_some expression_). python has three built-in sequence types: lists (type list), strings (type str) and tuples (type tuple). These sequence types are used to represent different kinds of ordered collections. In this lab, we will only use lists, and we will only see a few of the many things that can be done with them. We will return to lists again after the break to examine them in more detail.

To write a list literal, write its elements, separated by commas, in a pair of square brackets:

In [1]: x = [1, 2, 3, 4, 5, 6]

In [2]: type(x)
Out [2]: ...

The elements that you write can be expressions. These are evaluated, and the resulting values become the elements of the list:

In [3]: x = [2, 2 + 1, 2 * 2, 2 + 3]

In [2]: x
Out [2]: ...

Indexing sequences#

Every element in a sequence has an index (position). The first element is at index 0. The length of a sequence is the number of elements in the sequence. The index of the last element is the length minus one. The built-in function len returns the length of any sequence.

Indexing a sequence selects a single element from the sequence (for example, a character if the sequence is a string). Python also allows indexing sequences from the end, using negative indices. That is, -1 also refers to the last element in the sequence, and -len(seq) refers to the first.

Exercise 2#

Execute the following in the python shell. For each expression, try to work out what the output will be before you evaluate the expression.

In [1]: my_list =  [1, 2, 3, 4, 5, 6]

In [2]: my_list[1]
Out [2]: ...

In [3]: my_list[4]
Out [3]: ...

In [4]: my_list[-1]
Out [4]: ...

In [5]: L = len(my_list)

In [6]: my_list[L - 1]
Out [6]: ...

In [7]: my_list[1 - L]
Out [7]: ...

They should all run without error. Is the result of each expression what you expected?

Iteration over sequences#

Python has two kinds of loop statements: the while loop, which repeatedly executes a suite as long as a condition is true, and the for loop, which executes a suite once for every element of a sequence. (To be precise, the for loop works not only on sequences but on any type that is iterable. All sequences are iterable, but later in the course we will see examples of types that are iterable but not sequences.)

Both kinds of loop can be used to iterate over a sequence. Which one is most appropriate to implement some function depends on what the function needs to do with the sequence. The for loop is simpler to use, but only allows you to look at one element at a time. The while loop is more complex to use (you must initialise and update an index variable, and specify the loop condition correctly) but allows you greater flexibility; for example, you can skip elements in the sequence (increment the index by more than one) or look at elements in more than one position in each iteration.

In the lectures so far, we have only used while loops, and they are sufficient to solve all the problems in this lab. We will introduce the for loop next week. However, if you want to try using for loops, their syntax and execution is described in the text books (Downey: Section “Traversal with a for loop” in Chapter 8; Punch & Enbody: Sections 2.1.4 and 2.2.13).

Exercise 3#

The following function takes one argument, which should be sequence of numbers, and computes the average of the numbers in the list:

def average(numbers):
    total = 0
    index = 0
    while index < len(numbers):
        total = total + numbers[index]
        index = index + 1
    return total / len(numbers)

Test the function with some example inputs. For example

In [1]: average([1, 2, 3, 4, 5])
Out [1]: ...

In [2]: average([3, 4, 3, 1])
Out [2]: ...

Note that the value returned is always a floating point number (type float) even if the average happens to be an integer.

(a) python has a few built-in functions that work on sequences (of any type): min(seq), max(seq) and sum(seq) all do what you would expect them to. (Note, however, that sum only works on sequences that contain only numbers.) The function sorted returns a list with the elements of the argument sequence in sorted order.

Write a new version of the averaging function that uses sum.

(b) Write a function most_average(numbers) which finds and returns the number in the input that is closest to the average of the numbers. (You can assume that the argument is a sequence of numbers.) By closest, we mean the one that has the smallest absolute difference from the average. You can use the built-in function abs to find the absolute value of a difference. For example, most_average([1, 2, 3, 4, 5]) should return 3 (the average of the numbers in the list is 3.0, and 3 is clearly closest to this). most_average([3, 4, 3, 1]) should also return 3 (the average is 2.75, and 3 is closer to 2.75 than is any other number in the list).

You can use the function above, or the one you just wrote, to compute the average.

Exercise 3(c)#

Write a function count_negative(numbers) that takes as argument a sequence of numbers and returns the number of negative numbers in the sequence. For example, count_negative([-2, -1, 0, 1, 2]) and count_negative([2, -2, 1, -1, 0]) should both return 2, since there are two negative numbers in each argument list. count_negative([5, 0, 9]) should return 0, as there are no negative numbers in the input list.

Also test your function on an empty list (that is, a list with no elements). An empty list can be created with the expression [] or list(). Does your function work?

Exercise 4#

A sequence of numbers is said to be (non-strictly) increasing if each element is less than or equal to the next element in the sequence. For example [1, 5, 9] and [3, 3, 4] are both increasing, but [3, 4, 2] is not.

Here are two function that are meant to return True if the argument sequence is increasing, and False otherwise. However, both functions have bugs.

Function 1#

def is_increasing(seq):
    i = 0
    while i < len(seq):
       if seq[i + 1] < seq[i]:
           return False
       i = i + 1
    return True

Function 2#

def is_increasing(seq):
    i = len(seq) - 1
    while i >= 0:
       if seq[i] < seq[i - 1]:
           return False
       i = i - 1
    return True

(a) For each of the two functions, find, if possible, an example of an argument sequence that cause a runtime error due to a list index being out of range. (This may or may not be possible for both functions.)

(b) For each of the two functions, find, if possible, examples of arguments that do not cause a runtime error, but makes the function return the wrong value. (This may or may not be possible for both functions.)

(c) Write a correct function that determines if an argument sequence is increasing.

List operations#

Python allows two operators to applied to lists: + and *. + applied to lists means concatenation. That is, if A and B are lists, then A + B is a list whose elements are all those in A followed by all those in B. For example,

In [1]: [1, 2, 3] + [7, 8]
Out [1]: [1, 2, 3, 7, 8]

* applied to a list and an integer creates repetition of the elements of the list that many times. For example:

In [2]: [1, 2, 3] * 2
Out [2]: [1, 2, 3, 1, 2, 3]

Programming problems#

Note: These are more substantial programming problems. We do not expect that everyone will finish all of them during the lab time. If you do not have time to finish them during the lab, you can continue working on them later (at home, in the CSIT labs after teaching hours, or on one of the computers available in the university libraries or other teaching spaces), or come back to them later in the course.

Closest matches#

(a) Write two functions, smallest_greater(seq, value) and greatest_smaller(seq, value), that take as argument a sequence and a value, and find the smallest element in the sequence that is greater than or equal to the given value, and the greatest element in the sequence that is smaller than or equal to the given value, respectively.

For example, if the sequence is `[3, 1, 13, 5, 9] and the target value is ‘6’, the smallest greater element is 9 and the greatest smaller element is 5.

  • You can assume that all elements in the sequence are of the same type as the target value (that is, if the sequence is a list of numbers, then the target value is a number).
  • You should not assume that the elements of the sequence are in any particular order.
  • You should only use operations on the sequence that are valid for all sequence types.
  • What happens in your functions if the target value is equal to one of the elements in the sequence?
  • What happens in your functions if the target value is smaller or greater than all elements in the sequence?

(b) Same as above, but assume the elements in the sequence are sorted in increasing order; can you find an algorithm that is more efficient in this case?

Counting duplicates#

If the same value appears more than once in a sequence, we say that all copies of it except the first are duplicates. For example, in [-1, 2, 4, 2, 0, 4], the second 2 and second 4 are duplicates.

Write a function count_duplicates(seq) that takes as argument a sequence and returns the number of duplicate elements (for example, it should return 2 for the sequence above). Your function should work on any sequence type (for example, both lists and strings), so use only operations that are common to all sequence types (such as indexing and checking the length). For the purpose of deciding if an element is a duplicate, use standard equality, that is, the == operator.

Putting stuff in bins#

A histogram is way of summarising (1-dimensional) data that is often used in descriptive statistics. Given a sequence of values, the range of values (from smallest to greatest) is divided into a number of sections (called “bins”) and the number of values that fall into each bin is counted. For example, if the sequence is [2.09, 0.5, 3.48, 1.44, 5.2, 2.86, 2.62, 6.31], and we make three bins by placing the dividing lines at 2 and 4, the resulting counts (that is, the histogram) will be the sequence 2, 4, 2, because there are 2 elements less than 2, 4 elements between 2 and 4, and 2 elements > 4.

(a) Write a function count_in_bin(values, lower, upper) that takes as argument a sequence of numbers and two values that define the lower and upper sides of a bin, and counts the number of elements in the sequence that fall into this bin. You should treat the bin interval as open on the lower end and closed on the upper end; that is, use a strict comparison lower < element for the lower end and a non-strict comparison element <= upper for the upper end.

(b) Write a function histogram(values, dividers) that takes as argument a sequence of values and a sequence of bin dividers, and returns the histogram as a sequence of a suitable type (say, a list) with the counts in each bin. The number of bins is the number of dividers + 1; the first bin has no lower limit and the last bin has no upper limit. As in (a), elements that are equal to one of the dividers are counted in the bin below.

For example, suppose the sequence of values is the numbers 1,..,10 and the bin dividers are [2, 5, 7]; the histogram should be [2, 3, 2, 3] (because 1, 2 are in the first bin, 3, 4, 5 in the second, 6, 7 in the third and 8, 9, 10 in the last bin).

To implement this function, you may need to grow the size of the list that represents the histogram. You can do this with the list concatenation operator: if mylist is a list (which can be empty) and x is a value (number) that you want to add to it, then the assignment mylist = mylist + [x] adds x to the end of mylist.

To test your function, you can create arrays of random values using NumPy’s random module:

In [1]: import numpy.random as rnd
In [2]: values = rnd.normal(0, 1, 50)

This creates an array of 50 numbers drawn according to the normal distribution with mean 0 and standard deviation 1. The following creates 10 evenly sized bins covering the range of values:

In [1]: import numpy as np
In [2]: histogram_range = np.max(values) - np.min(values)
In [3]: dividers = (np.arange(1, 10) * (histogram_range / 10)) + np.min(values)

As you increase the size of the value array, you should find that the histogram becomes more symmetrical and more even.

You can also test your function by comparing it with the histogram function provided by NumPy (see help(numpy.histogram)).

bars search times arrow-up