CS 314 - Specification 8 - Sets
Programming Assignment 8: Pair Assignment. You may work with one other person on this assignment using the pair programming technique. If you work with a partner, you must work with someone in the same section as you. You must have the same TA. You can complete the assignment on your own if you wish. If you use slip days, each partner must have the required number of slip days or the assignment is a 0.
You and your partner may not acquire from any source (e.g. another student, an internet site, a large language model or generative AI or coding assistant such as chatGPT) a partial or complete solution to a problem or project that has been assigned. You and your partner may not show other students your solution to an assignment. You may not have another person (current student other than your partner, former student, tutor, friend, anyone) “walk you through” how to solve an assignment. You may get help from the instructional staff. You may discuss general ideas and approaches with other students. Review the class policy on collaboration from the syllabus. If you took CS314 in a previous semester and worked with a partner on this assignment then, you must now start from scratch on the assignment. Likewise, if you took CS314 in a previous semester (regardless of if you worked by yourself or with a partner) and work with a partner this semester, you must start from scratch.
The purposes of this assignment include working with ArrayLists and Iterators.
Summary: Implement three classes: AbstractSet, UnsortedSet, and SortedSet. Recall that the elements of a set, typically, have no definite order and duplicate items are not allowed.
See the Wikipedia entry on Sets.
For more details on how to complete the assignment see the description of classes and files below as well as the tips section.
You may adapt the code for fast sorting (quicksort, mergesort) from the slides for use in your SortedSet class. Likewise you may adapt the code for binary search from the slides.
Restrictions and requirements:
Use ArrayList as your internal storage container in the UnsortedSet and SortedSet classes.
Do not use the Arrays and Collections classes or the public void sort(Comparator<? super E> c) method from the ArrayList class.
You may use the built in linear search methods such as the ArrayList contains and indexOf methods.
For every method in UnsortedSet and SortedSet, state the Big O of the method in a comment at the top of the method. To answer some of these questions you have to determine the Big O of methods from the ArrayList class. View the ArrayList documentation and think about how we implemented methods in our simple array-based list class (GenericList) to determine the Big O of methods from ArrayList. Use Piazza to discuss what you think the Big O of methods from ArrayList are.
Files
File | Description | Responsibility |
Source Code | ISet.java. Interface for the classes you are developing. Do not alter. | Provided by me |
Source Code | AbstractSet.java. Complete as many methods as possible in this class without any instance variables. | Provided by me and you. (Granted, mostly you.) |
Source Code | UnsortedSet.java. A set with elements in no particular order. | Provided by me and you. (Granted, mostly you.) |
Source Code | SortedSet.java. A set with elements in sorted ascending order. | Provided by me and you. (Granted, mostly you.) |
Testing | SetTester.java. A class with various tests. Remove the provided tests and add your own tests as specified below. | Provided by me and you |
Utility class | Stopwatch.java. A class for calculating elapsed time when running other code. | Provided by me. |
Documentation | ISet.html. Javadoc page for the ISet interface. AbstractSet.html. Brief Javadoc page for the skeleton AbstractSet class. UnsortedSet.html. Brief Javadoc page for the skeleton UnsortedSet class. SortedSet.html. Brief Javadoc page for the skeleton SortedSet class. | Provided by me |
Submission | Submit your version of these 4 files (AbstractSet.java, UnsortedSet.java, SortedSet.java, and SetTester.java) to GradeScope assignment 8. | Provided by you |
Description of Classes and Files:
ISet: The interface for set classes. Do not alter this file.
AbstractSet: Complete a skeletal implementation of the ISet interface in the AbstractSet class.
AbstractSet shall not make any explicit references to SortedSet, UnsortedSet, or ArrayList. (You will be making implicit references to call some methods such as iterator. If the implicit calls weren't allowed you couldn't get anything done in AbstractSet.)
AbstractSet can and should use the iterator method and Iterator objects. Assume the Iterator class implements the remove method.
Methods in AbstractSet may also call other ISet methods besides iterator. For example containsAll could make use of contains.
Complete as many methods as possible in AbstractSet. Part of this assignment is figuring out which methods can be implemented in AbstractSet and which ones can't. See the example below.
Any of intersection, union, or difference that you implement in AbstractSet must be written without adding any new methods to any classes or making reference to UnsortedSet or SortedSet. You may not use the Java Class class, either. Explain in the comment at the top of SetTester why it would be unwise to implement all three of intersection, union, and difference in the AbstractSet class.
Two ISets are equal if they contain the same elements. The status of sorted or unsorted does not matter when checking for equality in AbstractSet. This means it is possible that an UnsortedSet will equal a SortedSet. This convention is used in much of the Java standard library. (Try it with TreeSet, a kind of sorted set, and HashSet, a kind of unsorted set.) You shall implement the equals method in the AbstractSet class and will override it in SortedSet.
The toString method is provided as another example of how to use an iterator. You can use this method to print out your set objects during testing.
UnsortedSet: The provided source file
UnsortedSet.java is a skeleton file. This class extends
AbstractSet
. The elements in this set are not maintained in any
particular order from the client's point of view. ("A client" is any
other class that uses UnsortedSet
.) Complete this class.
Use ArrayList<E> as the internal storage container. This class maintains the elements of the set it represents in unsorted order.
Override methods from AbstractSet if you can implement them more efficiently. See the section below on target Big O's for methods.
Implement the methods that could not be implemented in AbstractSet.
Methods that return an ISet, such as union, intersection, and difference, return an instance of UnsortedSet if the calling object is an UnsortedSet. (Which it must be to reach the code in UnsortedSet.)
SortedSet: The provided source file
SortedSet.java is a skeleton file. This class extends
AbstractSet
. The elements in this set are maintained in sorted order. To do
this the generic data type E (data type parameter) must be a class that
implements the Comparable
interface. The class header for
SortedSet
modifies E so that we know all variables of type E in
SortedSet
will be data types that implement Comparable
.
Use ArrayList<E> as the internal storage container.
Override methods from AbstractSet if you can implement them more efficiently. See the section below for the target Big O's of methods.
Implement the methods that could not be implemented in AbstractSet.
Provide a constructor that creates a SortedSet out of an ISet<E>.
Methods that return an ISet, such as union, intersection, and difference, return an instance of SortedSet.
SetTester: The provided source file
SetTester.java contains some tests for the set classes. Delete the provided tests in the version of SetTester you turn in. Add at least 1 test per method per class
that implements that method to this file. (In other words if there is a
method in AbstractSet
and you do not override it in UnsortedSet
or
SortedSet
,
you just need to write one test for that method. On the other hand if there
is a method in AbstractSet that you do override in both UnsortedSet and
SortedSet you have to write code for both versions.) Use the class
discussion group to share new test cases.
When writing code that performs sorts and searches you may use code from
the class slides as long as you document the source with
a comment. For example:
// code for binary search from class slides
... binary search code
You will not be able to use the sorts and searches from the slides "as is". You will have to adapt them to your classes. Recall your internal storage containers are ArrayLists, not arrays.
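To illustrate what adapting an array-based sort to an ArrayList can look like, here is a minimal merge sort sketch that uses get and set in place of array indexing. The class name MergeSortSketch and its helper methods are made up for this illustration. This is not the code from the class slides, and it would still need to be adapted to your own class and documented with a comment.

```java
import java.util.ArrayList;

// Hypothetical helper class, named only for this sketch.
class MergeSortSketch {

    // Sorts data into ascending order using merge sort.
    static <E extends Comparable<? super E>> void sort(ArrayList<E> data) {
        if (data.size() < 2) {
            return; // 0 or 1 elements: already sorted
        }
        // scratch list of the same size; its initial contents get overwritten
        ArrayList<E> temp = new ArrayList<>(data);
        sort(data, temp, 0, data.size() - 1);
    }

    // Recursively sorts data[low..high], inclusive.
    private static <E extends Comparable<? super E>> void sort(
            ArrayList<E> data, ArrayList<E> temp, int low, int high) {
        if (low >= high) {
            return;
        }
        int mid = low + (high - low) / 2;
        sort(data, temp, low, mid);
        sort(data, temp, mid + 1, high);
        merge(data, temp, low, mid, high);
    }

    // Merges the sorted runs data[low..mid] and data[mid+1..high].
    private static <E extends Comparable<? super E>> void merge(
            ArrayList<E> data, ArrayList<E> temp, int low, int mid, int high) {
        int left = low;
        int right = mid + 1;
        int dest = low;
        while (left <= mid && right <= high) {
            // data.get(i) replaces data[i] from the array version
            if (data.get(left).compareTo(data.get(right)) <= 0) {
                temp.set(dest++, data.get(left++));
            } else {
                temp.set(dest++, data.get(right++));
            }
        }
        while (left <= mid) {
            temp.set(dest++, data.get(left++));
        }
        while (right <= high) {
            temp.set(dest++, data.get(right++));
        }
        // copy the merged run back; data.set(i, x) replaces data[i] = x
        for (int i = low; i <= high; i++) {
            data.set(i, temp.get(i));
        }
    }
}
```

The only changes from an array version are the element accesses and the bounded type parameter, which lets compareTo be called on the elements.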
Experiments:
When your classes are completed, run the method largeTest in the SetTester class. This method has you pick a file and then adds all of the words in the file to various Sets. The method uses the SortedSet, the UnsortedSet, the Java HashSet, and the Java TreeSet. The time each operation takes to execute is displayed. Test this method with a small file at first to ensure it works. Then try files of various sizes (again, Project Gutenberg is a good source). Use at least four different files. Report your results in a comment at the top of the SetTester class. Include the change in file size, number of words, number of distinct words, and time. Express these as a factor (for example 1.5x, 3.7x) compared to the previous file. (Follow the format from the slides on Maps.) Also answer the questions listed after the example table in that comment.
Example showing first 3 rows for one of the four data structures. Include a table for each of the four data structures.
Unsorted Set
File | Size (kb) | Total Words | Inc. Prev. Row | Unique Words | Inc. Prev. Row | Actual Time | Inc. Prev. Row |
Foo.txt | 67 | 1218 | - | 271 | - | 0.052 sec. | - |
Bar.txt | 151 | 2109 | 1.7x | 493 | 1.8x | 0.24 sec. | 4.6x |
Baz.txt | 517 | 7927 | 3.8x | 1702 | 3.5x | 1.16 sec. | 4.8x |
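The factors in the example rows are each value divided by the value in the previous row. A small sketch of that arithmetic follows. The class name GrowthFactorSketch and the method factor are hypothetical and are not part of the provided files.

```java
import java.util.Locale;

// Hypothetical helper, named only for this sketch.
class GrowthFactorSketch {

    // Ratio of the current measurement to the previous one,
    // formatted with one decimal place and a trailing "x".
    static String factor(double previous, double current) {
        return String.format(Locale.US, "%.1fx", current / previous);
    }
}
```

For the example rows, factor(1218, 2109) gives "1.7x" for total words and factor(0.052, 0.24) gives "4.6x" for time.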
What do you think the Big O of the processText methods are for each kind of Set? Assume N = total number of words in a file and M = number of distinct words in the file. M = the size of the set when finished.
What is the Big O of your add methods? What do you think the Big O of the HashSet and TreeSet add methods are?
What differences do you see between HashSet and TreeSet when printing out the contents of the Set?
Checklist: Did you remember to:
Avoid adding instance variables to AbstractSet? (Also no methods not declared in the ISet interface and no references to UnsortedSet, SortedSet, or ArrayLists)
Implement as many methods as possible in AbstractSet regardless of efficiency?
Override methods from AbstractSet in SortedSet and UnsortedSet if you can create a more efficient version?
State the Big O of each method in UnsortedSet and SortedSet?
These are the Big O's you should be able to achieve for the
methods in the UnsortedSet
class. If a method is
implemented in AbstractSet
and it has this Big O then there is
no need to override it in UnsortedSet
. Assume there are already
N elements in this UnsortedSet
. For methods with another set
assume there are N elements in that set as well.
... ArrayList and ArrayList already has a method to get an Iterator.
These are the Big O's you should be able to achieve for the
methods in the SortedSet class. If a method is implemented in AbstractSet and it has this Big O then there is no need to override it in SortedSet. Assume there are already N elements in this SortedSet. For methods that involve two sets, the calling object and another ISet, these target Big Os apply only if the other set is also a SortedSet. If it is not, rely on the Big O targets from the UnsortedSet class.
... SortedSet, but the Big O of that statement is O(N).
... SortedSet and assuming the sets are equal.
... ArrayList and ArrayList already has a method to get an Iterator.
Tips:
So how can you implement methods in AbstractSet when there isn't an internal storage container or other instance variables? By using other methods in the class!
Here is an example. The ISet
interface has a
method named contains
that determines if a given element is in the set. There is
also a method named iterator
which provides an Iterator
object
to look at all the items in the set. So in AbstractSet
we could
do the following:
    public boolean contains(E item) {
        boolean found = false;
        Iterator<E> it = this.iterator(); // get an iterator for this set.
        // do a linear search using the iterator object
        while (!found && it.hasNext()) {
            E temp = it.next();
            found = temp.equals(item); // which equals method is getting called?
        }
        return found;
    }
You won't be able to implement the iterator method in the AbstractSet class. You need an "actual-factual" internal storage container to do that. So you will be calling a method that will be implemented later. Note, the use of a temporary object in the above code is actually unnecessary and the code could be streamlined to this form.
    public boolean contains(E item) {
        Iterator<E> it = this.iterator(); // get an iterator for this set.
        // do a linear search using the iterator object
        while (it.hasNext()) {
            if (it.next().equals(item)) {
                return true; // found it. Done!
            }
        }
        return false; // never found item
    }
And there is an even simpler option. ISet extends the Java Iterable interface. This means it has a method that returns an Iterator object. It also means ISets can be used as the set-expression in enhanced for loops. So the following works as well:
    public boolean contains(E obj) {
        for (E val : this) {
            if (val.equals(obj)) {
                return true;
            }
        }
        return false;
    }
Use whatever form you are most comfortable with.
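The same idea extends to other methods. The sketch below is a stand-alone illustration, not the assignment's AbstractSet: the classes MiniAbstractSet and MiniListSet and their method signatures are hypothetical and much smaller than ISet. It shows one method built on contains and one that relies on the Iterator remove method, with no instance variables in the abstract class.

```java
import java.util.ArrayList;
import java.util.Iterator;

// Hypothetical stand-in for an abstract set. It has no instance variables;
// everything is built on iterator(), which a subclass supplies later.
abstract class MiniAbstractSet<E> implements Iterable<E> {

    // Linear search using the enhanced for loop.
    public boolean contains(E item) {
        for (E val : this) {
            if (val.equals(item)) {
                return true;
            }
        }
        return false;
    }

    // Built entirely on contains: true if every element of other is in this set.
    public boolean containsAll(Iterable<E> other) {
        for (E val : other) {
            if (!contains(val)) {
                return false;
            }
        }
        return true;
    }

    // Relies on Iterator.remove, assuming the subclass's iterator supports it.
    public boolean remove(E item) {
        Iterator<E> it = iterator();
        while (it.hasNext()) {
            if (it.next().equals(item)) {
                it.remove();
                return true;
            }
        }
        return false;
    }
}

// Hypothetical concrete subclass, present only so the sketch can run.
class MiniListSet<E> extends MiniAbstractSet<E> {
    private final ArrayList<E> items = new ArrayList<>();

    public boolean add(E item) {
        if (contains(item)) {
            return false; // duplicates are not allowed in a set
        }
        items.add(item);
        return true;
    }

    @Override
    public Iterator<E> iterator() {
        return items.iterator(); // ArrayList's iterator implements remove
    }
}
```

The abstract class never mentions ArrayList or the subclass. It only calls methods that a concrete class will provide.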
SortedSet tips:
In the SortedSet
class when methods can be faster due to the fact that
the data is sorted, you should make them faster. For example consider the
contains
method:
public boolean contains(E item)
This could be implemented in the AbstractSet
class using an
Iterator
object from the iterator
method. The expected run time would be
O(N). However, in the SortedSet
class this method should be overridden,
because it can be accomplished in O(logN) time through a binary search.
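As a rough illustration of that O(logN) search, here is a minimal binary search sketch over a sorted ArrayList. The class name BinarySearchSketch and the method indexOf are hypothetical. This is not the code from the class slides, and in SortedSet the search would operate on your internal storage container instead of a parameter.

```java
import java.util.ArrayList;

// Hypothetical helper class, named only for this sketch.
class BinarySearchSketch {

    // Returns the index of target in sorted (ascending order), or -1 if absent.
    static <E extends Comparable<? super E>> int indexOf(ArrayList<E> sorted, E target) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids overflow of low + high
            int cmp = target.compareTo(sorted.get(mid));
            if (cmp == 0) {
                return mid;
            } else if (cmp < 0) {
                high = mid - 1; // target can only be in the lower half
            } else {
                low = mid + 1; // target can only be in the upper half
            }
        }
        return -1; // the search range is empty, so target is not present
    }
}
```

Each pass halves the range still being searched, which is where the O(logN) bound comes from.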
There are many methods that have an explicit parameter of type ISet. If
that ISet is a SortedSet we can perform some actions more efficiently.
For example, the intersection
method in SortedSet
will look something like this:
    public ISet<E> intersection(ISet<E> otherSet) {
        // we know E is a Comparable, but we don't know if
        // otherSet is a SortedSet.
        // local var of correct type so we only have to cast once
        SortedSet<E> otherSortedSet;
        // do we need to create a new SortedSet based on otherSet or
        // is otherSet really a SortedSet?
        if (!(otherSet instanceof SortedSet<?>)) {
            otherSortedSet = new SortedSet<>(otherSet); // should be O(NlogN)
        } else {
            otherSortedSet = (SortedSet<E>) otherSet;
        }
        SortedSet<E> result = new SortedSet<E>();
        // fill result with modified merge algorithm, should be O(N)
        return result;
    }
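One way to picture the "modified merge" step is sketched below on two plain sorted ArrayLists. The class name SortedIntersectionSketch and the method intersect are hypothetical. The sketch assumes both lists are in ascending order with no duplicates, and inside SortedSet the same loop would walk the two internal storage containers instead.

```java
import java.util.ArrayList;

// Hypothetical helper class, named only for this sketch.
class SortedIntersectionSketch {

    // Returns the elements present in both a and b.
    // Assumes a and b are each sorted ascending and contain no duplicates.
    static <E extends Comparable<? super E>> ArrayList<E> intersect(
            ArrayList<E> a, ArrayList<E> b) {
        ArrayList<E> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).compareTo(b.get(j));
            if (cmp == 0) {
                result.add(a.get(i)); // in both lists, keep it
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // a's element is smaller, so it cannot be in b
            } else {
                j++; // b's element is smaller, so it cannot be in a
            }
        }
        return result;
    }
}
```

Each index only moves forward, so the loop does at most a.size() + b.size() comparisons, and the result comes out already sorted because elements are appended in ascending order.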