Assignment P3#

Applies to: COMP6710
Released by: Monday, 31 March 2025, 09:00
Base assignment deadline: Friday, 11 April 2025, 15:00
Code Walk registration deadline: Friday, 11 April 2025, 18:00
Code Walk registration link: CWAC
Code Walks:
  • Thursday, 17 April 2025
  • Tuesday, 22 April 2025
  • Wednesday, 23 April 2025
GitLab template repository: (link)
GitLab CI file: .gitlab-ci.yml
Minimum Library Version: 2025S1-7
Last Updated: Friday, 11 April 2025, 07:30

This assignment uses regular Java 23, and JUnit Jupiter 5.9.0. You may optionally import the Functional Java Standard Libraries. Do not import other libraries into your project. Your code may not:

  • Explicitly throw exceptions.
  • Catch exceptions outside of the allowed pattern for reading/writing files (see Slides of Workshop 6B).
  • Use concurrency (if you do not know what that is, you will not be using it so long as you write the code on your own).
  • Use reflection (if you do not know what that is, you will not be using it so long as you write the code on your own).
  • Interact with the environment outside of the file system operations described on the Slides of Workshop 6B.
  • Write to or read from files other than described in the assignment.

In everything you do, you must follow the Design Recipe. The minor adjustments to the Design Recipe for class definitions are available in the Slides of Workshop 6B. To see roughly how this assignment will be marked, check out the Skeleton Rubric.

You still need to follow the Design Recipe and write appropriate documentation and tests. In particular, the (minor) adjustments to the Design Recipe for class definitions were covered in workshop 6B. Do not write code that is overly complicated.

Formalities#

This assignment is submitted via GitLab.

Access to Your Repository#

In order for us to be able to collect and mark it, you need to satisfy the following conditions:

  • Your repository’s name must be comp6710-2025s1-p3 (exactly), placed directly in your user namespace (the default). That is, if your uid is u1234567, the address of your GitLab project should read as https://gitlab.cecs.anu.edu.au/u1234567/comp6710-2025s1-p3.
  • Your repository must have our marker bot (comp1110-2025-s1-marker) added as a Maintainer.

You can achieve both these points by creating your project through forking our template repository and not changing the name, or by just using the correct name when creating the project and then manually adding the bot.

Notebooks and Push Requirement#

You need to keep a notebook file as explained on the Git and Notebooks page. You need to commit and push the current state of your files:

  • at the end of every session where you work on the assignment
  • at least once per hour (our checks implement this by requiring that no two pushes during sessions recorded in the notebook are more than 70 minutes apart)
  • at least once per major part of the assignment (i.e. for this assignment, at least four times). You are absolutely free to commit and/or push more often than that.

You need to write informative commit messages, and informative notes for each session in your notebook, which describe what it is that you did for this particular commit or session, respectively.

Statement of Originality#

Your submission (and therefore your git repository) needs to include a signed statement of originality. This must be a text file named soo.txt. That file must contain the following text, with the placeholders [your name here] and [your UID here] replaced with your name and UID (remove the brackets!), respectively:

I declare that this work upholds the principles of academic
integrity, as defined in the University Academic Misconduct
Rule; is entirely my own work; is produced for the purposes
of this assessment task and has not been submitted for
assessment in any other context, except where authorised in
writing by the course convener; gives appropriate
acknowledgement of the ideas, scholarship and intellectual
property of others insofar as these have been used; in no part
involves copying, cheating, collusion, fabrication, plagiarism
or recycling.

I declare that I have not used any form of generative AI in
preparing this work.

I declare that I have the rights to use and submit any assets
included in this work.

[your name here]
[your UID here]

Files Summary#

Your repository must be accessible to us and must have the following structure:

  • notebook.yml - your Notebook
  • soo.txt - your signed Statement of Originality
  • src - a folder containing your source files:
    • Lister.java - containing the Lister class for part 1
    • a .java file for your Ancestor data type for part 1
    • TextFile.java - containing your work for part 2
    • a .java file for your Word count entries for part 2
    • Finder.java - containing your work for part 3
    • Matcher.java - containing your work for part 4
    • a .java file for your Match data type for part 4
  • tests - a folder containing your test files:
    • ListerTests.java - containing your tests for part 1
    • TextFileTests.java - containing your tests for part 2
    • FinderTests.java - containing your tests for part 3
    • MatcherTests.java - containing your tests for part 4

Note: tests related to your extra data types for the different parts can just go with the other tests for the main class of that part.

Project Configuration and Running Programs#

We recommend that you use IntelliJ IDEA for this assignment, but you do not have to. If you do, the standard settings that result from following the IntelliJ Project Setup Guide should make your project work with our testing environment.

If you do not want to (or cannot) use IntelliJ IDEA, or if you want to make sure you can run your program the way our automated testing framework does, then you can follow the guide available here. This will allow you to trigger your tests directly from the console, similar to what we did in the first half of the course with the functional Java library support for tests.

Code-Walk Registration#

Please use CWAC to register for a code walk. You can also use it to give time preferences for scheduling your code walk.

The deadline for registering to participate in code walks is Friday, 11 April 2025, 18:00. You need to do this in order for your assignment submission to get marked!

Info: Arrays of Arrays (a.k.a. jagged arrays)#

Remember: an array type is written as any type plus [] immediately after it.

This includes array types themselves. So int[][] is an array of arrays of integers (a.k.a. jagged array of integers).

This lets you write code like this:

int[][] twoDInt = {{1,2,3,4,5},{6,7,8},{9,10}};
int[] row = twoDInt[0];
int i = row[4];
boolean thisIsTrue = i == twoDInt[0][4];

In the last line, you can see how you can access a particular element of an array of arrays directly, by just using the array indices twice one after another - first into the array of arrays (index 0), then into one of those arrays (index 4).

There are mainly two possible ways to create such arrays. First, you can write, e.g.,

int[][] twoDInt = new int[5][10];

This creates a rectangular array of arrays: five integer arrays, each of length ten.

However, not all the arrays within an array of arrays have to have the same length (as shown in the example above), or have to be created from the start. In particular, one can also write:

int[][] twoDInt = new int[5][];

This creates a new array of five arrays of integers, but all five components are initialized to null. You now have to set each index of the array to a concrete array of integers, and each of those arrays may have a different size. For example:

twoDInt[0] = new int[3];
twoDInt[1] = new int[4];
...
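
The two creation styles above can be combined into a small complete program. The following is only an illustrative sketch: the class name JaggedArrayDemo and its methods buildTriangle and totalLength are made up for this example and are not part of the assignment.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the assignment.
final class JaggedArrayDemo {

    /** Builds a jagged array whose row i has length i + 1. */
    static int[][] buildTriangle(int rows) {
        int[][] triangle = new int[rows][]; // every row starts out as null
        for (int i = 0; i < rows; i++) {
            triangle[i] = new int[i + 1];   // rows may differ in length
            for (int j = 0; j <= i; j++) {
                triangle[i][j] = i * j;
            }
        }
        return triangle;
    }

    /** Counts the elements across all rows. */
    static int totalLength(int[][] arrays) {
        int total = 0;
        for (int[] row : arrays) {
            total += row.length;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] triangle = buildTriangle(3);
        System.out.println(Arrays.deepToString(triangle)); // prints [[0], [0, 1], [0, 2, 4]]
        System.out.println(totalLength(triangle));         // prints 6
    }
}
```

Note that each row carries its own length field, so row.length must be read per row rather than assumed from the first row.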

Info: Java Standard Libraries#

Part of this assignment asks you to do things for which there are operations in the Java Standard Libraries. Consider particularly looking at the following pages:

Tasks#

For the purposes of this assignment, you can assume that you are always working with file systems where every file you can find is either a directory or a regular file, that is, exactly one of File::isDirectory or File::isFile will return true.
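
As a rough illustration of that assumption, the sketch below classifies a java.io.File using the two methods named above. The class FileKindDemo and its describe method are hypothetical helpers invented for this example; the third branch only covers paths that do not exist, which is outside what the assumption is about.

```java
import java.io.File;

// Hypothetical demo class, not part of the assignment.
final class FileKindDemo {

    /** Describes which kind of file system entry the given File refers to. */
    static String describe(File f) {
        if (f.isDirectory()) {
            return "directory";
        } else if (f.isFile()) {
            return "regular file";
        } else {
            // Reached for paths that do not exist; the assignment lets you
            // assume that every file you can find is one of the two kinds above.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // "." is the directory the program is executed from.
        System.out.println(describe(new File(".")));
    }
}
```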

Pre-Assignment Exercise#

This exercise is purely optional, but doing it may come in handy later in the assignment, especially if you feel more comfortable with the Functional Java Standard Libraries.

In a file called src/Util.java

Create a class called Util.

In it, design the following static method.

/**
 * Returns an array that has the same length as the given list,
 * and contains the elements of the list in reverse order.
 * Examples:
 * - Given: Nil()
 *   Expect: a String array of length 0
 * - Given: Cons("hello", Cons("world!", Nil()))
 *   Expect: a String array arr of length 2,
 *           arr[0] = "world!", arr[1] = "hello"
 */
static String[] consListToReverseArray(ConsList<String> list)

Follow the Design Recipe!

You may also consider writing a version of this method that does not reverse the order of elements, or versions that work on other kinds of lists or arrays, as might be required for the rest of the assignment tasks. For reasons that are somewhat unique to Java and Java-based languages, it is not easy to write a generic version of this method. Unless you already know how to do this, don’t try to do it!

Note: it is easy to transform an array into a ConsList - MakeList does that!

Part 1#

In a file called src/Lister.java

Part 1 is a regular part of the assignment and will be included in marking it. However, it is designed as an on-ramp. Tutors may help you with the code here if you ask for help - though you should still try to do as much as possible yourself.

Create a class called Lister.

You must design at least the following members, as these are the ones that we will subject to automatic testing:

// To distinguish from array syntax, we will write
// |A| for a data type that you should design here, instead of the
// usual [A]
// Here, File refers to java.io.File.
// Make sure to use the right import!

/**
 * Assuming that the given File object is a directory,
 * and that it exists,
 * returns an array of Strings that contains the names of
 * all (non-directory) files that are contained within the
 * subtree rooted at the given directory. That is, it will
 * contain the names of the files within the given directory,
 * the names of the files contained in all directories
 * contained within the given directory, and so on. The order
 * of the names in the output array of Strings is undefined
 * (i.e., it does not matter).
 * Example:
 * - Given:  a File object associated to the root of the
 *           project directory.
 *   Expect: an array containing "notebook.yml", "soo.txt",
 *           "Lister.java", "ListerTests.java", and all the
 *           other (non-directory) files in the subtree rooted
 *           at the root of the project directory.
 */
static String[] getFiles(File directory)

/**
 * Assuming that the given File object is associated to a directory,
 * and that it exists,
 * returns an array of objects of type |A| containing all ancestor
 * directories of the given directory. The entries in the array
 * must be ordered in increasing order by the distance to the
 * root directory (i.e., "/").
 * Example:
 * - Given: File("/home/users/u1234567")
 *   Expect: array arr of size 2
 *           arr[0] = an |A| representing "/home"
 *           arr[1] = an |A| representing "/home/users"
 * In this example, "home" is at distance 1 to the root directory,
 * while "users" is at distance 2.
 */
static |A|[] getAncestors(File directory)

|A| is a data type representing an ancestor of a given directory, in its own file |A|.java (do not put the | character in a file name or class name!). It should have at least the following members (provided that they do not contain any non-trivial code, you do not need to write tests for these methods):

/**
 * The name of the ancestor.
 * For example, if the ancestor is "/home", this returns "home".
 * If the ancestor is "/home/users", this returns "users".
 */
String name()
/**
 * The distance of this ancestor to the root directory
 * For example, if the ancestor is "/home", this returns 1.
 * If the ancestor is "/home/users", this returns 2.
 */
int distance()

Follow the Design Recipe!

For this part of the assignment, you may decide to ignore all files and directories whose name starts with “.”, or you may include all of them in your results. Be consistent in your choice!
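
To make the notions of "name" and "parent" used in this part concrete, here is a small sketch of two java.io.File methods, getName and getParentFile. The class PathPartsDemo and its two helpers are hypothetical and only illustrate what these library calls return; this is not a suggested design for Lister.

```java
import java.io.File;

// Hypothetical demo class, not part of the assignment.
final class PathPartsDemo {

    /** Last component of the path, e.g. "users" for "/home/users". */
    static String lastName(File f) {
        return f.getName();
    }

    /** Name of the directory directly containing f, or "" if the path has no parent. */
    static String parentName(File f) {
        File parent = f.getParentFile(); // null when the path has no parent
        return parent == null ? "" : parent.getName();
    }

    public static void main(String[] args) {
        File dir = new File("/home/users/u1234567");
        System.out.println(lastName(dir));   // prints u1234567
        System.out.println(parentName(dir)); // prints users
    }
}
```

Both methods work purely on the path string stored in the File object, so a File created from a relative path such as "src" has no parent as far as getParentFile is concerned.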

Part 2#

In a file called src/TextFile.java

Create a class called TextFile.

You must design at least the following members, as these are the ones that we will subject to automatic testing:

// To distinguish from array syntax, we will write
// |W| for a data type that you should design here, instead of the
// usual [W]

// Here, File refers to java.io.File.
// Make sure to use the right import!

// For the purposes of this class, lines are strings
// of characters separated by newline characters
// (i.e., '\n'). A line can be further split into words,
// which are sequences of characters of a line separated
// by one or more space characters. For the purpose of
// this assignment, words are considered to be case-sensitive,
// e.g., "Hello", "hello" and "HELLO" are considered to be
// different words, and they may contain characters other
// than letters, e.g., numbers or punctuation signs. For
// example, a line such as "Hello, world!", has two words,
// i.e., "Hello," and "world!".

/**
 * Given a reference to a File object associated to a
 * text file that is assumed to exist in the file system,
 * loads the file's contents and makes them accessible via
 * further method calls. That is, the constructor reads the file
 * contents, and stores them in one or more fields of suitable types
 * in this class. You should decide how many fields, and the types
 * of these fields.
 */
TextFile(File f)

/**
 * Returns the number of lines in the file.
 * If the file is empty, return 0.
 */
int getLineCount()

/**
 * Returns the number of words in a given line
 * represented by its index within the file.
 * Line indices are 0-based. Assume a valid index.
 */
int getNumWordsInLine(int line)

/**
 * Returns the word at the given index in the given line.
 * Line indices are 0-based. Word indices are 0-based, and
 * here, relative to the line, i.e. index 0 is the first word
 * in the given line. Assume valid indices.
 */
String getWordAt(int line, int wordIndex)

/**
 * Returns the word at the given index in the file.
 * Word indices are 0-based, and here, relative
 * to the file overall.
 * Assume a valid index.
 */
String getWord(int wordIndex)

/**
 * Returns how often a given word occurs in the file.
 * If the word does not occur in the file, it must return 0.
 */
int getWordCount(String word)

/**
 * Returns the file argument with which the TextFile
 * was created.
 */
File getFile()

/**
 * Returns an array of objects of type |W| containing
 * information about the k most common words in the file,
 * with k being an integer greater than or equal to one.
 * The objects in the array
 * must be in decreasing order by word count. Thus the first
 * element of the array contains the most common word, the
 * second element the second most common word, etc., up to the
 * k-th element, which contains the k-th most common word. If two or
 * more words have the same word count, then their relative order in the
 * output array does not matter. If the file has fewer than k
 * words, then the length of the resulting array is the number of words
 * in the file; otherwise the length of the array is k. If the file
 * is empty, then the array is empty.
 */
|W|[] mostKthCommonWords(int k)

|W| is a data type representing word-count entries that you should design, in its own file |W|.java (do not put the | character in a file name or class name!). It should have at least the following members (provided that they do not contain any non-trivial code, you do not need to write tests for these methods):

/**
 * The word the word-count entry represents.
 */
String getWord()
/**
 * The count of this word in the text file that created this entry.
 */
int getCount()

Follow the Design Recipe!
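
The definition of words used in this part (sequences of characters separated by one or more spaces, with punctuation kept) can be illustrated with String's split method. This is only a sketch: the class WordSplitDemo and the method wordsOf are invented for this example, and treating an empty line or leading spaces as contributing no words is an assumption of the sketch, not something the specification above spells out.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the assignment.
final class WordSplitDemo {

    /** Splits a single line into words separated by one or more spaces. */
    static String[] wordsOf(String line) {
        return Arrays.stream(line.split(" +"))
                // split can yield an empty piece (empty line, leading spaces);
                // assumption of this sketch: such pieces are not words.
                .filter(word -> !word.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(wordsOf("Hello,   world!"))); // prints [Hello,, world!]
        System.out.println(wordsOf("").length);                          // prints 0
    }
}
```

The doubled comma in the first output comes from Arrays.toString joining the two words "Hello," and "world!" with ", ".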

Part 3#

In a file called src/Finder.java

Create a class called Finder.

You must design at least the following members, as these are the ones that we will subject to automatic testing:

/**
 * Returns a list of TextFile objects corresponding to all text
 * files in the sub-tree starting at the given path that contain
 * the given word.
 * You can assume that "word" contains no spaces or newlines.
 * "path" may be a file, in which case it is the only file
 * that is checked.
 * You can assume that the name of a text file always ends with
 * ".txt", and that all such files in our tests can be
 * suitably read by the method demonstrated in the workshops.
 * However, the directory structure may also contain other files,
 * which cannot be read in this way. Those files, which must be
 * skipped, can be distinguished from text files by having a name
 * that ends differently than with ".txt".
 */
static TextFile[] getFilesContaining(File path, String word)

/**
 * If given an argument, that argument represents a word.
 * The program should search the current directory in which
 * the program is executed for text files (name ending with ".txt")
 * that contain that word, and print a list of the absolute
 * paths of those files (see the File class's "getAbsolutePath"
 * method), one per line, in alphabetic order (i.e. use String's
 * compareTo method to compare the absolute paths).
 */
public static void main(String[] args)

Follow the Design Recipe!
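
The ordering asked for in main is the one defined by String's compareTo method, which is also the natural ordering that Arrays.sort applies to a String array. The sketch below shows that ordering on some made-up paths; the class PathSortDemo, the method sortedCopy, and the example paths are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the assignment.
final class PathSortDemo {

    /** Returns a sorted copy of the given paths, leaving the input untouched. */
    static String[] sortedCopy(String[] paths) {
        String[] copy = Arrays.copyOf(paths, paths.length);
        Arrays.sort(copy); // natural ordering of String, i.e. compareTo
        return copy;
    }

    public static void main(String[] args) {
        String[] paths = {"/tmp/b.txt", "/tmp/a/z.txt", "/tmp/a.txt"};
        // Prints /tmp/a.txt, then /tmp/a/z.txt, then /tmp/b.txt,
        // because '.' sorts before '/' in this ordering.
        for (String path : sortedCopy(paths)) {
            System.out.println(path);
        }
    }
}
```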

Part 4 (Distinction-Level Part)#

In a file called src/Matcher.java

Create a class called Matcher.

You must design at least the following members, as these are the ones that we will subject to automatic testing:

// To distinguish from array syntax, we will write
// |M| for a data type that you should design here, instead of the
// usual [M]

// Here, File refers to java.io.File.
// Make sure to use the right import!

/**
 * Given a sentence (that is, a String containing some number of
 * words separated by spaces and newlines), and a path, represented
 * by a File object, return an array of matches (i.e., objects of
 * type |M|) of that sentence in the contents of any text files
 * in the sub-tree rooted at the given path. "path" may be a file,
 * in which case it is the only file that is checked.
 * You can assume that the name of a text file always ends with
 * ".txt", and that all such files in our tests can
 * be suitably read by the method demonstrated in the workshops.
 * However, the directory structure may also contain other files,
 * which cannot be read in this way. Those files, which must be
 * skipped, can be distinguished from text files by having a
 * name that ends differently than with ".txt". For an example
 * description, see further below.
 */
static |M|[] getMatches(File path, String sentence)

/**
 * Given a single command-line argument representing
 * a sentence as defined for getMatches above,
 * prints on screen the matches as given by getMatches
 * for the directory from which the program is executed,
 * in the following format:
 * In file [absolute-file-path], at line [start-line]:
 * [matching lines]...
 * [one empty line]
 *
 * The matches should be printed in alphabetic order of
 * absolute paths (i.e. use String's compareTo method to compare
 * absolute paths), and for multiple matches in the same file,
 * by the starting line index of each match (lower starting indices
 * come first).
 */
public static void main(String[] args)

|M| is a data type representing search matches, in its own file |M|.java (do not put the | character in a file name or class name!). It should have at least the following members (provided that they do not contain any non-trivial code, you do not need to write tests for these methods):

/**
 * The file in which the match was found.
 */
File getFile()
/**
 * The 0-based index of the first line contained in the match, out
 * of all lines in the file.
 */
int getStartLine()
/**
 * The full lines that contain the match.
 * The first entry of the array must contain a line with
 * some non-zero number of words from the start of the searched-for
 * sentence. The last entry of the array must contain a line with
 * some non-zero number of words from the end of the searched-for
 * sentence.
 * Any interior lines must fully match corresponding middle
 * parts of the searched-for sentence.
 */
String[] getMatchingLines()

Follow the Design Recipe!

As an example, consider the following file:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis
porttitor ultrices metus, vitae ultrices tortor iaculis et.
Vestibulum porta tortor id tincidunt accumsan. Morbi vulputate
placerat nunc, ac dignissim ante convallis ac. Phasellus commodo
quam vitae dignissim quis tempor erat maximus. Mauris sit
amet nisi ligula. Duis hendrerit vulputate ullamcorper. Cras
condimentum rutrum mauris vitae tristique. Vivamus viverra,
nisi sed placerat sagittis, libero enim convallis tellus, quam
vitae dignissim libero sem at nunc. Vivamus tincidunt mauris
nec ante euismod, vitae molestie quam eleifend.

Searching for “Duis vulputate” should not return a match, because while both words exist in the file, they are not next to each other in that order. Searching for “quam vitae dignissim” should return two matches, containing the following lines:

  • (line index 4):
    quam vitae dignissim quis tempor erat maximus. Mauris sit

  • (line index 7):
    nisi sed placerat sagittis, libero enim convallis tellus, quam
    vitae dignissim libero sem at nunc. Vivamus tincidunt mauris

For the first match, the output of getMatchingLines() would be ["quam vitae dignissim quis tempor erat maximus. Mauris sit"], while for the second match, it would be ["nisi sed placerat sagittis, libero enim convallis tellus, quam", "vitae dignissim libero sem at nunc. Vivamus tincidunt mauris"].

Searching for “Vivamus viverra, nisi sed placerat sagittis, libero enim convallis tellus, quam vitae dignissim libero sem at nunc.” should return one match:

  • (line index 6):
    condimentum rutrum mauris vitae tristique. Vivamus viverra,
    nisi sed placerat sagittis, libero enim convallis tellus, quam
    vitae dignissim libero sem at nunc. Vivamus tincidunt mauris

For this match, the output of getMatchingLines() would be ["condimentum rutrum mauris vitae tristique. Vivamus viverra,","nisi sed placerat sagittis, libero enim convallis tellus, quam","vitae dignissim libero sem at nunc. Vivamus tincidunt mauris"].
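
The output format described for the main method of Matcher can be sketched as a small string-building helper. This is an illustration under assumptions, not a reference implementation: the class MatchFormatDemo, the method format, and the path /home/u1234567/lorem.txt are made up, and printing the start line as the 0-based index is an assumption of this sketch.

```java
// Hypothetical demo class, not part of the assignment.
final class MatchFormatDemo {

    /** Renders one match in the layout described for Matcher's main method. */
    static String format(String absolutePath, int startLine, String[] matchingLines) {
        StringBuilder out = new StringBuilder();
        out.append("In file ").append(absolutePath)
           .append(", at line ").append(startLine).append(":\n");
        for (String line : matchingLines) {
            out.append(line).append("\n");
        }
        out.append("\n"); // the empty line that ends each match
        return out.toString();
    }

    public static void main(String[] args) {
        // First match for "quam vitae dignissim" in the example file above.
        String[] lines = {"quam vitae dignissim quis tempor erat maximus. Mauris sit"};
        System.out.print(format("/home/u1234567/lorem.txt", 4, lines));
    }
}
```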

Updates#

Friday, 04 April 2025, 10:50: Clarification on the path for the main method in Part 4, added link to the javadoc for the File class, and clarified that everything is just either a directory or regular file.
Friday, 04 April 2025, 19:30: Added option to ignore all files and directories whose name starts with '.' in Part 1
Tuesday, 08 April 2025, 10:10: Clarified the expected files in your solution
Friday, 11 April 2025, 07:30: Added CI file