Laboratory Assignment 1
CS 340d
Unique Number: 50960
Spring, 2019

Given: February 18, 2019
Due: March 11, 2019

This laboratory concerns performing set operations on one- and two-dimensional data sets. This will require students to think carefully about maintaining invariants for the representation of the sets that will assure unique representation.

General Comment

Before we describe this laboratory assignment, we describe our philosophy for all of our laboratory assignments and for many of our homework assignments. We expect our programs to implement their requirements with mathematical precision, but programs are generally specified with natural language. To this point in your education, most programming assignments have included some description of what program you should write, and you have been expected to interpret that description and produce a result. It requires tremendous care and precision to write a precise description of any computation in a natural language -- it is certainly beyond our ability to write completely precise, natural-language specifications.

We would like to write mathematical specifications, but that would require us to learn mathematics for most of the semester. For the community of software developers, this approach is extremely valuable where it can be deployed, but it is not yet a mature discipline. Even so, we will sometimes refer to programs that can be specified formally. And, time permitting, we may redo this laboratory with better technology later this semester.

Laboratory Requirements

This laboratory involves implementing set negation, intersection, and union on the one-dimensional number line and the two-dimensional plane. Our one-dimensional number line will be limited to the values that can be represented by a signed 32-bit integer. Our two-dimensional plane will be limited to pairs of signed 32-bit integers. Consider the example number line below.

          -2^31 ... -1 0 1 2 3 4 5 6 7 8 9  ... 2^31-1
             .  ...  . . . . . . . . . . .  ...  .
Set A:               o-----o   o-o   o---o
Set B:                   o-----o
             .  ...  . . . . . . . . . . .  ...  .
          -2^31 ... -1 0 1 2 3 4 5 6 7 8 9  ... 2^31-1

Set A contains elements: {-1, 0, 1, 2, 4, 5, 7, 8, 9}. Set B contains elements: {1, 2, 3, 4}.

We want a representation for such sets. Given one such set, we also want to be able to compute its negation ("~"). Given two sets, we want to compute their intersection ("^") and union ("v"). Given Set A and Set B, we show the coverage of the number line under the three basic operations.

          -2^31 ... -1 0 1 2 3 4 5 6 7 8 9  ... 2^31-1
             .  ...  . . . . . . . . . . .  ...  .
Set A:               o-----o   o-o   o---o
Set B:                   o-----o
~ B:         o--...----o         o----------...--o
A ^ B:                   o-o   o
A v B:               o-----------o   o---o
             .  ...  . . . . . . . . . . .  ...  .
          -2^31 ... -1 0 1 2 3 4 5 6 7 8 9  ... 2^31-1

How can we represent these sets? We could enumerate them, but the Set ~B would require nearly 2^32 entries. We want a more compact form to represent these sets. We could specify a set as a list of non-overlapping, non-abutting number ranges with inclusive endpoints. That is, we could represent Set A as {(-1 2) (4 5) (7 9)}, Set B as {(1 4)}, and Set ~B as {(-2147483648 0) (5 2147483647)}. For some sets, this kind of representation may be more compact than just listing the elements.

Part A: Define a predicate that identifies a well-formed, single-dimensional set. The representation of such a set should be a linked list of pairs of numbers; that is, each member of the list will contain three entries: a left range endpoint, a right range endpoint, and a pointer to the next such structured object. The final entry in the list will be NULL.

Be careful with your representation. Make sure that your specification (predicate) of a range is perfectly clear, and define a C-language predicate that inspects a range to test whether the range data is well formed.

Hint: your well-formed, data-set predicate should insist that your set is ordered, so that every set has exactly one representation.
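As one illustration of what such a predicate might look like, the following is a minimal sketch, not a reference solution. The names struct range, range_ok, set_ok, and check_examples are hypothetical, and the sketch assumes inclusive endpoints, an empty set represented by a NULL list, and ranges stored in ascending order. Your own specification may make different choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: one inclusive range [lo, hi] plus a link to the next. */
struct range {
    int32_t lo;
    int32_t hi;
    struct range *next;
};

/* Assumed rule for one range: it must be non-empty, so lo <= hi. */
bool range_ok(const struct range *r)
{
    return r != NULL && r->lo <= r->hi;
}

/* Assumed rule for a set: every range is well formed, and each range starts
 * at least two above the end of its predecessor.  That single comparison
 * enforces ascending order, no overlap, and no abutment.  The sum is done
 * in 64 bits because hi + 1 overflows int32_t when hi is 2147483647. */
bool set_ok(const struct range *s)
{
    for (; s != NULL; s = s->next) {
        if (!range_ok(s))
            return false;
        if (s->next != NULL && (int64_t)s->next->lo <= (int64_t)s->hi + 1)
            return false;
    }
    return true;
}

/* Set A from the text, {(-1 2) (4 5) (7 9)}, and two malformed lists. */
void check_examples(void)
{
    struct range a3 = { 7, 9, NULL };
    struct range a2 = { 4, 5, &a3 };
    struct range a1 = { -1, 2, &a2 };
    assert(set_ok(&a1));
    assert(set_ok(NULL));              /* assumed: NULL is the empty set */

    struct range b2 = { 3, 4, NULL };
    struct range b1 = { 1, 2, &b2 };   /* abuts: should be written (1 4) */
    assert(!set_ok(&b1));

    struct range c1 = { 5, 4, NULL };  /* reversed endpoints */
    assert(!set_ok(&c1));
}
```

Rejecting abutting ranges such as {(1 2) (3 4)} is what makes the representation unique under these assumptions: the same elements can then only be written as {(1 4)}.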
Define the negation, intersection, and union operations. Each operation should take well-formed sets and produce a well-formed set as a result. In essence, these functions are performing number-line operations, and they will produce results that can be used to "color" the points on a number line that extends from -2147483648 to 2147483647, inclusive.

Part B: Define a predicate that identifies a well-formed, two-dimensional set. The representation of such a set should be a linked list of two pairs of numbers; that is, each member of the list will contain five entries: a lower-left-hand corner (composed of two integers), an upper-right-hand corner (composed of two integers), and a pointer to the next such structured object. The final entry in the list will be NULL.

Be very careful with your representation. Make sure that your specification (predicate) provides a unique representation for all possible areas that might be described. Also, be sure to define a C-language predicate that inspects a set of rectangles to test whether the given boxes define a well-formed set.

Hint: be sure that your predicate recognizes only a single representation for each possible set. A well-formed set is a set of boxes in which each box describes a non-empty area and no two boxes overlap. An obvious way to do this is to represent the area described as a collection of 1x1 boxes, but this is likely to be inefficient. Careful thought will be needed to create a predicate that recognizes a well-formed set of rectangles that do not overlap. Another key issue is how one might create a canonical representation for two-dimensional sets.

Part C: For our two-dimensional set representation, define the AND (intersection) operation.

Part D: For our two-dimensional set representation, define the NOT (negation) operation.

Laboratory Documentation

Finally, for the writing component, you need to include in your solution program a 120-line to 150-line description of your solutions to Parts A and B.
This description should be included as a C-language comment that begins with a line containing only "/*" and ends with a line containing only " */", and it should be written in the (approximate) format of a typical Linux manual entry. Remember, this class carries a writing flag, and this kind of summary will be required for all of the class laboratory assignments.

Grading

Your laboratory will be graded with the following weights:

  30% - Correctly functioning Part A
  40% - Correctly functioning Part B
  30% - Written description of your solutions

Extra Credit

  15% - Correctly functioning Part C
  15% - Correctly functioning Part D  (Note, this may be hard.)

Be careful with what you write. We will grade the functioning of your program on sets with hundreds or even thousands of members. We will also read your documentation carefully, looking for problems (grammar, spelling, run-on sentences, tense agreement, manual entry formatting, etc.). Errors will lower your grade.