This is the fourth homework assignment. Your goal in this assignment is to write a function that performs a calculation on string data and returns a value. For this homework, like the previous one, we provide you with a testing framework, which will run tests of your function. It is important that you learn to use the testing program effectively, since we will be using this kind of automated testing for the exam.

Your solution to this homework will also be marked on code quality. This means some part of the marks will be given for good functional decomposition, variable/function naming, and commenting. The marks for code quality are distinct from those for functionality; to gain full marks, your submission must be both functional and readable.

Practical information

The assignment is due 9.00am on Monday 26th April. To submit your solution, you will upload a single python file via wattle. Here is the assignment submission link.

In addition to submitting your solution, you must attend the following lab (in week 8). In the lab, your tutor will ask you some questions about your solution, and give you feedback if there is anything you need to improve. This discussion with the tutor is also part of the assessment.

If you fail to show up for the discussion with the tutor, you will receive zero marks for this assignment. If you do not submit a solution, you may still get partial marks for the discussion with the tutor.

The homework is individual. You must write your own solution, and you are expected to be able to explain every aspect of it.

As usual, you should have followed last week’s lectures and worked through the exercises in lab 4 and lab 5 before starting on the assignment. The assignment should not take more than one or two hours to complete.

The problem

The relative frequency of a letter in a string is the number of times that the letter appears in the string divided by the total number of letters in the string. Note that this is always a number between 0.0 and 1.0.

Write a function max_relative_frequency which takes a string as its argument and returns the highest relative frequency of any letter in the string. This value is unique: several letters may occur with an equal (highest) frequency, but in that case their relative frequencies are the same.

  • Consider only letters of the English alphabet (that is, 'a' through 'z').
  • For the purpose of counting occurrences, consider letters that differ only by case to be the same letter. For example, the letter 'n' occurs three times in the string "Non-even" (once in upper case, twice in lower case).
  • For the purpose of counting the total number of letters in the string, count only letters. For example, the string '0 << c++ << 9' contains only one letter, 'c', so the relative frequency of 'c' in this string is 1.0.
  • Similarly, do not count letters with accents or other ornamentation such as 'ë', 'ø' and 'å'.
  • If the string contains no letters, the relative frequency is undefined and your function should return 0.0.
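The counting rules above can be illustrated with a small sketch. The helper english_letters is hypothetical (it is not part of the skeleton file) and only shows which characters count as letters; it is not a solution to the assignment. Note that python's built-in str.isalpha() also accepts accented letters, so it does not match the rules on its own.

```python
def english_letters(text):
    """Return the English letters in text, lower-cased, in order.

    Hypothetical helper, used only to illustrate the counting rules.
    """
    letters = []
    for char in text:
        # Compare against the ranges 'a'-'z' and 'A'-'Z' directly:
        # str.isalpha() would also accept accented letters such as 'ë'.
        if 'a' <= char <= 'z' or 'A' <= char <= 'Z':
            letters.append(char.lower())
    return letters


print(english_letters("Non-even"))       # ['n', 'o', 'n', 'e', 'v', 'e', 'n']
print(english_letters("0 << c++ << 9"))  # ['c']
print(english_letters("ëøå"))            # []
print("ë".isalpha())                     # True
```

Filtering before lower-casing keeps the check simple: only characters that are already English letters are converted.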

Examples:

  • The string 'sufficit' has 8 letters, and the most frequent letters are f and i, which both occur twice. Thus, the highest relative frequency is 2/8.

  • The string 'Non-even' has 7 letters (- is not a letter), and the most frequent letter is n which occurs three times (one N and two n). Thus, the highest relative frequency is 3/7.

Assumptions and restrictions:

  • You can assume that the argument is a string.

  • Your function must return a float between 0.0 and 1.0 (inclusive).
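As a minimal sketch of the return-type requirement: in python 3, dividing two integers with / always produces a float, but dividing by zero raises ZeroDivisionError, so the no-letters case needs its own branch. The helper name ratio_or_zero and its parameters are assumptions made for this illustration only.

```python
def ratio_or_zero(count, total):
    """Return count / total as a float, or 0.0 when total is 0.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if total == 0:
        # Avoid ZeroDivisionError when there is nothing to divide by.
        return 0.0
    return count / total


print(ratio_or_zero(2, 8))        # 0.25
print(ratio_or_zero(0, 0))        # 0.0
print(type(ratio_or_zero(7, 7)))  # <class 'float'>
```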

As a starting point, we provide you with a skeleton code file: max_relative_frequency.py. Download this file and write in it your implementation of the function.

Using the testing program

To use the testing program, you must first download the file homework_four_tests.py.

Save it in the same directory as the file max_relative_frequency.py. To run the testing program, you just need to run homework_four_tests.py. The testing program will read the file max_relative_frequency.py and test the function max_relative_frequency defined in that file, and print out results of the tests. If any of your functions fails any of the tests, the program will print a detailed error message and stop.

Remember that the testing program will test the file named max_relative_frequency.py which is located in the same directory. If you change the name of the file with your implementation, the tests won’t work.
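Separately from the provided testing program (whose internals are not shown here), you may find it useful to run a few quick checks of your own while developing. The sketch below is one possible way to do that; run_checks and always_zero are hypothetical names, and the expected values are taken from the examples in the problem description. Floats are compared with a small tolerance rather than with ==.

```python
def run_checks(candidate):
    """Call candidate on a few strings and report mismatches.

    Returns the number of failed checks.
    """
    cases = [
        ("sufficit", 2 / 8),
        ("Non-even", 3 / 7),
        ("0 << c++ << 9", 1.0),
        ("", 0.0),
    ]
    failures = 0
    for text, expected in cases:
        actual = candidate(text)
        # A result fails if it is not a float or is not close to expected.
        if not isinstance(actual, float) or abs(actual - expected) > 1e-9:
            print("FAILED:", repr(text), "expected", expected, "got", actual)
            failures += 1
    print(failures, "of", len(cases), "checks failed")
    return failures


def always_zero(text):
    """A deliberately incomplete stub, to show what a failure looks like."""
    return 0.0


run_checks(always_zero)
```

In your own experiments you would pass your max_relative_frequency function to run_checks instead of the stub. Keep such checks in a separate file, since the submitted file must contain only function definitions and comments.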

Marking

What to submit

You should edit the skeleton file max_relative_frequency.py, then upload only this file, containing your implementation of the function, using the assignment submission link on wattle. Do not rename this file. Do not edit (and do not try to upload) the testing program.

The file that you submit must meet the following requirements:

  • It must be syntactically correct python code.
  • It must be named max_relative_frequency.py
  • It must contain only function definitions and comments (module and function docstrings are accepted).
  • It may not import other modules.
  • Your function definitions should contain docstrings (as shown in the week 2 lecture).
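As a rough illustration of these requirements, a conforming file might be laid out as follows. This is only a sketch: the parameter name text and the docstring wording are assumptions, the provided skeleton file may differ, and the body is a placeholder rather than an implementation.

```python
"""Homework 4: highest relative letter frequency (illustrative layout)."""

# No import statements, and no top-level code other than
# function definitions and comments.


def max_relative_frequency(text):
    """Return the highest relative frequency of any letter in text.

    Only the English letters 'a' through 'z' are counted, ignoring case.
    Returns 0.0 if text contains no letters.
    """
    # Placeholder only: replace this with your own implementation.
    return 0.0
```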

As mentioned above, you must also attend the lab in week 8 and answer your tutor’s questions about your solution. This discussion is part of the assessment. You should be prepared to answer the following questions, or to demonstrate the answer where appropriate:

  • Can you download the file that you submitted from wattle?
  • Can you run that file in the python interpreter (using an IDE of your choice) on the CSIT lab computer?
  • If the file has syntax errors, can you use the error messages from the interpreter or IDE to identify where the syntax errors are?
  • Does your submitted file meet the requirements stated above? Does it contain anything that is not a function definition or a comment? If so, can you point it out?
  • Can you download and run the testing program?
  • Does your implementation pass all the tests run by the testing program?
  • What is the type of the value returned by your function?
  • In order to find the relative frequency of each letter in the input you needed to iterate over the characters in the string (or you used some function or method that does). How many times does your solution iterate through the entire string? What is the smallest number of complete iterations over the string that is necessary to find the highest relative frequency?
  • Did you use any of python’s built-in string methods? If so, can you explain how you would write code to do what those methods do, if you had to produce a solution without using them?
  • What values do you store (in variables) while computing the answer? What data structure, if any, do you use to store them? Is there any other data structure that could be used instead?
  • Did you implement your solution with just one function, or divide it into several functions? Does the functional abstraction make your code easier to read and understand?
  • Are the functions in your submitted file documented (with docstrings and/or comments)? Does the function documentation adequately describe what the function does and what its assumptions and limitations are? Are the names of variables, parameters and auxiliary functions descriptive of their purpose?

In marking this assignment we will consider the following:

  • Does your submitted file satisfy the requirements specified above? Submissions that do not meet these requirements will not be marked.
  • Does your implementation compute the correct value for all input strings?
  • The quality of your submitted python code, including its organisation, naming and documentation.
  • Your ability to use the tools (e.g., the IDE or python interpreter), your understanding of python’s error messages, and your understanding of the solution, as demonstrated in your discussion with the tutor.

The assignment is worth 4% of your final mark. Of these 4 marks, 3 are based on the functionality of your submission and 1 is based on the readability and quality of your code.
