Laboratory Assignment 0

CS 340d
Unique Number: 50960
Spring, 2019

Given: January 28, 2019
Due: February 11, 2019

This laboratory concerns duplicating much of the functionality of the Unix "wc" command. In addition, this laboratory concerns adding functionality beyond that of the standard "wc" command.

General Comment

Before we describe this laboratory assignment, we describe our philosophy for all of our laboratory assignments and for many of our homework assignments. We expect our programs to implement their requirements with mathematical precision, but programs are generally specified with natural language. To this point in your education, most programming assignments have included some description of the program you should write, and you have then been expected to interpret that description and produce a result. It requires tremendous care and precision to write a precise description of any computation in a natural language -- it is certainly beyond our ability to write completely precise, natural-language specifications.

We would like to write mathematical specifications, but that would require us to study mathematics for most of the semester. For the community of software developers, formal specification would be extremely valuable where it can be deployed, but it is not yet a mature discipline. Even so, we will sometimes refer to programs that can be specified formally.

Laboratory Requirements

This laboratory involves duplicating some of the functionality of the "wc" command. Your solution needs to work only on byte-oriented input streams or files. Note that your program should also work on binary files.

You should name your "wc" command "mywc". Your "mywc" command should accept the command-line arguments "-c", "-l", and "-w" as specified for the "wc" command, and also two extra flags: the "-C" (uppercase "C") and the "-N" (uppercase "N") options. The "-C" option should eliminate single-line, C-language comment strings from the input.
Single-line, C-language comments start with the string "//" (two "/" characters) and extend to the end of the line; however, the end-of-line characters should not be elided. So, really, the specification for "-C" is:

    sed 's://.*$::g' | wc

Your "mywc" program must also respond to the "-N" flag by printing the number of decimal numbers in the input. What is a decimal number? It is any string of adjacent digits (characters "0", "1", ... "9").

Your "mywc" program only needs to accept input from STDIN and print to STDOUT. Thus, parsing filename (command-line) arguments is not required.

Here, we include some typical text that might help. The number of characters returned should be equal to the length of the file or input. The number of lines should be equal to the number of line feed characters contained in the file. The number of words should be equal to the number of groups of characters separated by spaces, tabs, line feeds, and carriage returns. But, instead of reading this, you should read the "wc" manual entry; try "man wc" at the command-line prompt. But, the real specification is what the Linux version of "wc" does on the departmental Linux computers.

Extra credit may be awarded if you find a discrepancy of some kind in the Linux, FreeBSD, or MacOS commands. What is a discrepancy? It is absolutely any input file or stream that causes your "mywc" version to produce a correct result that differs from the result produced on the UTCS Linux computers. Now, if you can argue to the class that your result is correct even though your implementation is inconsistent with the UTCS Linux result, then you may have found a Linux bug! Bugs of this kind are always worth extra credit.

Be careful. Think about how the various flags might interact. Should the "-C" flag take precedence over the "-N" flag? Or should it be the other way? You will need to make a decision -- defend your decision.

For more "sed" details, see: http://www.grymoire.com/Unix/Sed.html and see: http://sed.sourceforge.net/sed1line.txt .
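As an illustration only, the counting rules and the "-C" specification described above might be sketched in C as follows. This is a minimal sketch under stated assumptions, not a reference solution. The names "struct counts", "is_separator", "count_buffer", and "strip_line_comments" are hypothetical. The sketch assumes that the whole input is already in memory, that words are separated only by the four characters named above, and that a decimal number is a maximal run of digits. The installed "wc" and "sed" may behave differently on some inputs, particularly binary ones, and the sketch does not settle how "-C" and "-N" should interact.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result record; the field names are illustrative only. */
struct counts {
    size_t bytes;    /* "-c": total number of bytes */
    size_t lines;    /* "-l": number of line feed bytes */
    size_t words;    /* "-w": runs of non-separator bytes */
    size_t numbers;  /* "-N": maximal runs of '0'..'9' (assumption) */
};

/* Assumption: only the four separators named in the handout end a word. */
int is_separator(unsigned char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
}

/* Scan buf[0..len) once and return all four counts. */
struct counts count_buffer(const unsigned char *buf, size_t len)
{
    struct counts c = {0, 0, 0, 0};
    int in_word = 0;    /* nonzero while inside a word */
    int in_number = 0;  /* nonzero while inside a run of digits */

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = buf[i];

        c.bytes++;
        if (ch == '\n')
            c.lines++;

        if (is_separator(ch)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            c.words++;      /* first byte of a new word */
        }

        if (ch >= '0' && ch <= '9') {
            if (!in_number) {
                in_number = 1;
                c.numbers++;  /* first digit of a new number */
            }
        } else {
            in_number = 0;
        }
    }
    return c;
}

/*
 * Copy in[0..len) to out, dropping every byte from the first "//" on a
 * line up to, but not including, the next line feed. The line feed is
 * kept. out must have room for len bytes and may be the same array as
 * in. Returns the number of bytes written.
 */
size_t strip_line_comments(const unsigned char *in, size_t len,
                           unsigned char *out)
{
    size_t n = 0;
    int in_comment = 0;  /* nonzero between "//" and the end of the line */

    assert((in != NULL && out != NULL) || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\n')
            in_comment = 0;
        else if (!in_comment && in[i] == '/' &&
                 i + 1 < len && in[i + 1] == '/')
            in_comment = 1;

        if (!in_comment)
            out[n++] = in[i];
    }
    return n;
}
```

For the two-line input "int x = 42; // set x to 42" followed by "foo 7 bar9", each line ending with a line feed, "count_buffer" reports 38 bytes, 2 lines, 12 words, and 4 numbers. If "strip_line_comments" is applied first, which is only one of the possible answers to the precedence question above, the result is 24 bytes, 2 lines, 7 words, and 3 numbers. A program that reads STDIN in fixed-size blocks would also need to carry the "in_word", "in_number", and "in_comment" state from one block to the next, including the case where the two "/" characters fall in different blocks.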
There are many more documents describing the regular-expression language.

Laboratory Documentation

Finally, for the writing component, you need to include in your solution program a 60-line to 90-line description of your "mywc" command. This description should be included as a C-language comment that begins with a line containing only "/*" and ends with a line containing only " */", and it should be written in the (approximate) format of a typical Linux manual entry. Remember, this class carries a writing flag, and this kind of summary will be required for all of the class laboratory assignments.

Grading

Your laboratory will be graded with the following weights:

  70% - Functioning of your "wc" implementation as specified above
  30% - Written description of your "wc" command

Be careful with what you write. We will be grading the functioning of your program on several hundred files. And, we will carefully read your documentation, looking for problems (grammar, spelling, run-on sentences, tense agreement, etc.) -- errors will lower your grade.

Turn-in

Prior to the due date, we will post submission instructions.