This is the fifth and final homework assignment. Your goal here is to write two functions, one which reads data from a file and transforms it into a form convenient for further analysis, and another which uses these results and performs simple analysis on the original data.

Your solution will also be marked on code quality. This means that some portion of the marks will be given for good code organisation, variable/function naming, and commenting. The marks for code quality are distinct from those for functionality; to gain full marks, your submission must be both functionally correct and readable.

Practical information

The assignment is due Saturday the 20th of May at 11.55pm (5 min before midnight, Canberra time). This is Saturday in Semester week 11, that is, the second last week of the semester. To submit your solution, you will upload a single Python file via Wattle. Here is the assignment submission link.

The homework is individual. You must write your own solution, and you are expected to be able to explain every aspect of it.

As usual, you should have followed recent weeks’ lectures and worked through the exercises in [lab 7], [lab 8], and [lab 9] before starting on the assignment. The assignment should not take more than a couple of hours to complete. Use of the NumPy package is encouraged, but it is not required.

The problem

This last homework should remind you of the first one. Here, we are again dealing with ASCII drawings. But instead of creating simple images, we will read map data presented in text form using ASCII symbols. Here is an example of a world map created with a command-line utility called asciiworld.

              ______________                 .              
         JLoooooo|""7ooooo|    '"""  "7, \___oooo___.__,    
L.|oooooooooooooLoo| oor".__   _oooooooooooooooooooooooooooo
 '`"L  7ooooooL_oooL,'"     .L'ooJoooooooooooooooooooo .J`  
       ''oooooooooooo,      "|ooooo7or"oooooooooooooo7,     
         7ooooooor          'rLoo""7oo oooooooooooro.r      
          |ooo""o,         .ooooooLooooooooooooooo||`       
   _,       7o_Jo|_,       oooooooooJoor 'oo"7ooo`|         
               "LLoL       7oooooooooo,   7|  oo`.o,        
                .ooooL      " '7ooooor`    " '7|_o/|        
                7ooooooo,      'ooooo          oooor/oLo_   
,                "ooooor       .oooor_|          "oJoJ/ '\ .
                  oooo"`        ooor o          |ooooooL " '
                 .oor"          'oo`             """"oor  .,
                 |or                                 "o`  J|
                 oo.,                    o               '` 
                 '"                                         
            ._   _JL        __________ooJ__JooooooooooL____ 
  ., Joooooooooooor.,_  _oooooooooooooooooooooooooooooooo`  
oooooJoooooooooooooooooooooooooooooooooooooooooooooooooooooo

As you can see, it is a very crude map: its low resolution causes many important map features to be obscure, marred, or even wrong. Different symbols (r, o, L, _, ", J amongst others) are used to make up for a gross lack of resolution. Yet we shall use such ASCII maps as input, with the aim of performing data analysis that has a simple geographical meaning.

The same asciiworld utility can create a map in a different resolution. Below is the same map (rendered in the PNG format) with 4-times higher resolution:

The ASCII World at resolution 240x80.

The program that you will complete should work with an ASCII map of any resolution (though not one so low that any resemblance to the actual map of the world is lost).

Your task in this homework is to write two functions read_map(ascii_map) and get_spot(world_map, latitude, longitude). The read_map function will take one argument:

  1. ascii_map — the name of a text file, which contains a crude depiction of the world using the ASCII symbols; the file contains M rows of equal length N. All visible symbols represent the dry land area and only the whitespace (the ASCII code 32) represents the area of water (sea, ocean or big lake). The function must return a 2-tuple of values:

    • the map data in the form of a nested sequence, or a 2-dimensional NumPy array, in which every element must be either 1 or 0. The element with the index (i, j) (i-th row, j-th column) is assigned the value 1 if the map file contains a non-whitespace symbol in row i and column j. If the map file contains whitespace in that position, the corresponding element is assigned 0.
    • a float which represents a fraction of the total map area occupied by dry land. This should be the area calculated in the projection coordinates, not the actual land area (which is different because of metric distortion that every projection contains).
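
The conversion described above can be illustrated on a tiny in-memory example. This is only a minimal sketch, not the expected solution: the helper name rows_to_grid and the sample rows are hypothetical, and a real read_map would first read the rows from the file, removing only the trailing newline of each line so that trailing spaces (water) are preserved.

```python
def rows_to_grid(rows):
    """Return (grid, land_fraction) for a list of equal-length strings.

    Hypothetical helper for illustration: a space is water (0),
    any other character is land (1).
    """
    grid = [[0 if symbol == " " else 1 for symbol in row] for row in rows]
    total_cells = sum(len(row) for row in grid)
    land_cells = sum(sum(row) for row in grid)
    return grid, land_cells / total_cells


# A made-up 3 x 6 "map", not real map data.
tiny_map = [
    "  oo  ",
    " oooo ",
    "      ",
]
grid, land_fraction = rows_to_grid(tiny_map)
print(grid[0])        # [0, 0, 1, 1, 0, 0]
print(land_fraction)  # 6 land cells out of 18
```

The fraction here is a plain ratio of cell counts, which matches the "projection coordinates" area mentioned above, since every cell has the same size on the flat map.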

The get_spot function will take three arguments:

  1. world_map — an object which is returned by the first function; the second function should be defined in a way that would work correctly regardless of whether world_map is a nested list of lists (sequence of sequences), or a Numpy 2-dimensional array. The length of all inner sequences (rows in the case of Numpy array) must be equal (this condition is satisfied automatically if the argument world_map is a Numpy array);
  2. latitude — a float in the range (+90.0,-90.0), representing the latitude in degrees like in standard geographical maps;
  3. longitude — a float in the range (-180.0,+180.0), representing the longitude in degrees like in standard geographical maps. For example, the ACT coordinates are (-33.7375,150.8468). The longitude of Greenwich (the home of Royal Observatory in London) is 0.0.
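
One possible way to turn a (latitude, longitude) pair into row and column indices is sketched below. It rests on an assumption that the assignment text does not confirm: that rows are evenly spaced in latitude from +90 (top row) to -90 (bottom row), and columns are evenly spaced in longitude from -180 (leftmost) to +180 (rightmost). The function name to_indices is hypothetical.

```python
def to_indices(latitude, longitude, n_rows, n_cols):
    """Map (latitude, longitude) in degrees to a (row, column) pair.

    Assumption (not stated in the assignment): both coordinates scale
    linearly across the map, with +90 latitude at row 0 and -180
    longitude at column 0.
    """
    row = int((90.0 - latitude) / 180.0 * n_rows)
    col = int((longitude + 180.0) / 360.0 * n_cols)
    # latitude == -90.0 would give row == n_rows; keep it on the last row.
    row = min(row, n_rows - 1)
    # longitude == +180.0 is the same meridian as -180.0, so wrap around.
    col = col % n_cols
    return row, col


print(to_indices(0.0, 0.0, 20, 60))       # (10, 30)
print(to_indices(90.0, -180.0, 20, 60))   # (0, 0)
print(to_indices(-90.0, 180.0, 20, 60))   # (19, 0)
```

If the map rows are not evenly spaced in latitude, the row formula would need to be replaced by the projection's own latitude formula; the column calculation and the boundary handling would stay the same.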

The second function get_spot will return one of the following strings:

  1. 'land', if the value of world_map in the location (latitude,longitude) is 1 (“on land”), and all elements in its immediate neighbourhood are also 1;
  2. 'water', if the value of world_map in the location (latitude,longitude) is 0 (“on water”), and all elements in its immediate neighbourhood are also 0;
  3. 'coast', if the value of world_map in the location (latitude,longitude) is 1, but at least one value in its immediate neighbourhood is 0;
  4. 'seashore', if the value of world_map in the location (latitude,longitude) is 0, but at least one value in its immediate neighbourhood is 1.

The immediate neighbourhood of a cell (i,j) consists of all cells with coordinates (i1, j1) for which max(abs(i1-i), abs(j1-j)) == 1 — all bordering cells, including the diagonals.
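
The neighbourhood definition and the four rules can be sketched for a cell that is not on the edge of the map. This is an illustration under that restriction only; the names NEIGHBOUR_OFFSETS and classify_interior_cell are hypothetical, and edge cells need the extra care discussed in the hints further down.

```python
# All (di, dj) offsets with max(abs(di), abs(dj)) == 1: the 8 bordering cells.
NEIGHBOUR_OFFSETS = [
    (di, dj)
    for di in (-1, 0, 1)
    for dj in (-1, 0, 1)
    if max(abs(di), abs(dj)) == 1
]


def classify_interior_cell(world_map, i, j):
    """Classify cell (i, j), assumed NOT to lie on the border of the map."""
    centre = world_map[i][j]
    neighbours = [world_map[i + di][j + dj] for di, dj in NEIGHBOUR_OFFSETS]
    if centre == 1:
        return "land" if all(value == 1 for value in neighbours) else "coast"
    return "water" if all(value == 0 for value in neighbours) else "seashore"


# Made-up test grid: a 3 x 3 block of land surrounded by water.
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(classify_interior_cell(grid, 2, 4))  # land
print(classify_interior_cell(grid, 2, 3))  # coast
print(classify_interior_cell(grid, 2, 2))  # seashore
print(classify_interior_cell(grid, 2, 1))  # water
```

Indexing with world_map[i][j] is used because it works both for nested lists and for 2-dimensional NumPy arrays.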

Note 1: the above rules should be used in the function implementation, but their order need not be the same as listed.

Note 2: the world ASCII maps which we use in this study are approximate, and even at higher resolution they may not give you an accurate result. For example, according to the online service Marineregions.org, the coordinates 42 deg. N, 50 deg. E lie in the Caspian Sea, the world's largest lake, while the map built by the function read_map from the data in the file world_mercator.txt may say that this point is on land. The source of this error can be two-fold:

  1. the approximate nature of the map, and
  2. the approximate calculation of the indices (i,j) from the continuous latitude and longitude.

The correctness of your algorithms can be tested by using specially designed test data (map-like, but not real maps). For real world maps, we will use relaxed correctness criteria, which will effectively test that a region around a chosen point has the required property ('land', 'water' etc).

Assumptions, restrictions and hints:

  • the file whose name is passed for the parameter ascii_map for the function read_map is a text file all lines of which have equal length; the characters in this file are visible ASCII characters and the whitespace (the ASCII value 32);
  • the first parameter world_map of the function get_spot is a sequence of sequences of equal length, or a 2-dim Numpy array, the 2nd parameter is a float in the range (90.0, -90.0), and the 3rd parameter is a float in the range (-180.0, 180.0);
  • the function get_spot returns a string with one of the four values listed above;
  • because the map is created in the so called Mercator projection, in which Earth’s surface is drawn on a cylinder, some care is needed when dealing with the “end of the world” locations — the coordinates with latitude=90.0 and latitude=-90.0, which represent North and South poles, and the coordinates with longitude=180.0 and longitude=-180, which represent the same meridian.
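
One way of handling the “end of the world” locations is sketched below. It reflects a design choice that the assignment does not prescribe: columns wrap around across the 180 degree meridian, while rows beyond the poles are simply skipped, so cells in the top and bottom rows have fewer neighbours. The function name neighbour_values is hypothetical.

```python
def neighbour_values(world_map, i, j):
    """Return the values of the cells bordering (i, j), edges included.

    Assumptions: columns wrap around (the left and right edges of the map
    are the same meridian); there are no rows beyond the poles.
    """
    n_rows = len(world_map)
    n_cols = len(world_map[0])
    values = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # the cell itself is not its own neighbour
            ni = i + di
            if ni < 0 or ni >= n_rows:
                continue  # nothing above the top row or below the bottom row
            nj = (j + dj) % n_cols  # wrap across the 180 degree meridian
            values.append(world_map[ni][nj])
    return values


# Made-up 2 x 4 grid: the corner cell (0, 0) "sees" the last column.
grid = [
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
print(neighbour_values(grid, 0, 0))  # [0, 0, 1, 0, 0]
```

Using the modulo operator avoids relying on negative indices, which would wrap on the left edge but raise an IndexError on the right edge.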

Template and data files

As a starting point, we provide you with a skeleton code file: map_reader.py. Download this file and complete the implementation of the two functions. You will also need to download the ASCII map file world_mercator.txt.

Testing

The skeleton file has testing functions: test_read_map, test_read_map_2 and test_get_spot. They provide partial correctness testing of the functions read_map and get_spot, respectively. The test inputs in these functions are included in the function code. You should not change the tests (assert statements) and the test data in these functions.

Remember that testing only checks a small number of predefined cases; it can never prove that your function works correctly for all valid arguments. You can examine the test cases that are provided, and think about whether there are any important ones that can be added to make the testing more reliable. This is optional: testing in this homework is more advanced, and you are not required to understand its details.

Note that you can define additional functions, if you think it helps you decompose the problem or write a better solution. Your additional function definitions should contain docstrings, but it would not be right to use triple-quoted strings as comments anywhere other than on the first line inside a function, or at the beginning of the file. New test functions (if you decide to add them) need not contain docstrings.
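
The difference between a docstring and an ordinary comment can be seen in a small example. The helper count_land_cells is hypothetical and only serves to show where each kind of text belongs.

```python
def count_land_cells(world_map):
    """Return the number of cells equal to 1 in a 2-dimensional map."""
    # Ordinary comments start with '#'; a triple-quoted string placed
    # here would be a stray expression, not a comment.
    return sum(int(value) for row in world_map for value in row)


# The docstring is stored on the function and shown by help().
print(count_land_cells.__doc__)
print(count_land_cells([[1, 0], [1, 1]]))  # 3
```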

Marking

Code quality

In this homework (like in the previous ones) we will also be marking your submission for its code quality. This includes aspects such as:

  • Using good function, parameter and variable names.

    The names of some functions in the homework are fixed, but if you define additional functions (to decompose the problem) then they should be given descriptive names.

  • Appropriate use of comments and docstrings.

    This means not too few comments, but also not gratuitous, useless, incorrect or misleading commenting. Comments should be accurate, relevant, and readable. A docstring should appear as the first statement in every function definition.

  • Good code organisation.

    This includes appropriate use of functions to decompose a problem and avoid code repetition (“don’t repeat yourself”, or DRY, principle).

  • Choice of appropriate data structures.

    This includes the use of suitable representations of the data which your program is operating on. In homework 5, the problem statement already defines a data structure for you to use. In more complex situations, when you decide what functions your program will involve and what data they will use, the choice of data structures is part of the program design. The implementation, too, involves the use of data structures that can make the program simpler, shorter and more readable.

To remind you again, do not import modules that you do not use.

What to submit

You should edit the skeleton file map_reader.py. Upload only this file with your implementation of the functions using the assignment submission link on Wattle.

Remember that you must upload a single Python code file. Do NOT zip or convert it to another format.

The file that you submit must meet the following requirements:

  • It must be syntactically correct Python code.
  • Your additions to the file you downloaded should contain only function definitions and, optionally, import statements. However, you should import only those modules which you actually use. You may use any package or module included in the Anaconda distribution. You may NOT use special geolocation Python packages which are not included in Anaconda to solve the problem. Anything that is not a function definition or import statement will be ignored when we test your submission. Note: The template file contains global constants (dummy_map_file and the_module) which are needed for unit tests. You should keep them, otherwise the tests will be broken. There are other assignment statements which define test data (they precede the test function definitions); you should also leave them unchanged.

You will attend the lab following the due date, where you can discuss this problem with your tutor. This discussion is NOT part of the assessment. Questions which can be discussed are the following:

  • If the file has syntax errors, can you use the error messages from the interpreter or IDE to identify where the syntax errors are?
  • Does your submitted file meet the requirements stated above? Does it contain anything that is not a function definition? If so, can you point it out?
  • Does your implementation pass all the tests run by the unmodified testing function?
  • Is your implementation of the function correct for any valid argument?
  • Do your functions always return a value of the correct type?
  • Does the returning function also print during execution? Why are such print calls included (except that they were used for debugging during the development)? Is it right to define a function which both prints and returns? What is the difference between the print function call and the return statement?
  • Did you think of any other test cases that should be used to test your function, in addition to those provided?
  • What does it mean that a function is insensitive to the exact type of an input argument, e.g., it can work equally well with a list, tuple or 1-dim array, or with a nested list of lists, or 2-dim array? What principles are used for creating code like this?
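
As an illustration of the last question, a function that relies only on len() and on indexing does not need to know the exact type of its argument. The sketch below uses a hypothetical helper map_shape with a nested list and a nested tuple; a 2-dimensional NumPy array supports the same two operations, so it could be passed in as well.

```python
def map_shape(world_map):
    """Return (rows, columns) using only len() and indexing."""
    return len(world_map), len(world_map[0])


as_lists = [[0, 1, 1], [1, 0, 0]]
as_tuples = ((0, 1, 1), (1, 0, 0))
print(map_shape(as_lists))   # (2, 3)
print(map_shape(as_tuples))  # (2, 3)
```

By contrast, code that calls list-only methods or uses array-only attributes such as .shape ties the function to one particular type.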

In marking this assignment we will consider the following:

  • Does your submitted file satisfy the requirements specified above?
  • Does your implementation compute the correct value for all valid arguments?
  • Does the quality of your submitted Python code, including its organisation, naming and documentation (with docstrings and comments), meet the code quality criteria?

The assignment is worth 4% of your final mark. 2 marks are given for the functional correctness, and 2 marks for the quality and readability of your code.
