CS303E Homework 2

Instructor: Dr. Bill Young
Due Date: Tuesday, January 28, 2025 at 11:59pm

Copyright © William D. Young. All rights reserved.

Prefixes for Large Quantities

In science and computing, several common prefixes are used when referring to large quantities. For example, the amount of information stored in a file may be measured in megabytes, using the prefix "mega" to mean a million bytes. But the term "megabyte" has historically been used with two distinct meanings. In decimal contexts it refers to 10**6 bytes (1,000,000). In binary usage, it refers to 2**20 bytes (1,048,576). Those numbers aren't the same, which has sometimes been a source of confusion.

Starting around 1998, the International Electrotechnical Commission (IEC) and several other standards and trade organizations addressed the ambiguity by publishing standards and recommendations for a set of binary prefixes that refer exclusively to powers of 2. This resolves the ambiguity: you can use "X megabytes" to refer to (X * 10**6) bytes, and "X mebibytes" to refer to (X * 2**20) bytes.
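
As a quick check of the two readings, here is a small sketch comparing X megabytes with X mebibytes. The variable names and the sample value X = 500 are only illustrative; they are not part of the assignment.

```python
# Hypothetical example value; any quantity X works the same way.
x = 500

megabytes_in_bytes = x * 10**6   # decimal reading: X * 10**6 bytes
mebibytes_in_bytes = x * 2**20   # binary reading:  X * 2**20 bytes

print(megabytes_in_bytes)                        # 500000000
print(mebibytes_in_bytes)                        # 524288000
print(mebibytes_in_bytes - megabytes_in_bytes)   # 24288000
```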

Decimal prefixes:

kilo    10**3   k
mega    10**6   M
giga    10**9   G
tera    10**12  T
peta    10**15  P
exa     10**18  E
zetta   10**21  Z
yotta   10**24  Y

Binary prefixes:

kibi    2**10   Ki
mebi    2**20   Mi
gibi    2**30   Gi
tebi    2**40   Ti
pebi    2**50   Pi
exbi    2**60   Ei
zebi    2**70   Zi
yobi    2**80   Yi

However, these naming rules are usually honored in the breach; when you buy a computer with k "megabytes" of RAM, that typically refers to a binary measurement rather than a decimal one. Luckily, in many cases the two don't differ by enough to be confusing. It turns out that the binary member of each pair is always bigger than the decimal member.

Your Assignment

Your task is to investigate how much the decimal and binary prefixes vary and print the results in a nice tabular format. First print an empty line. Then print a table with a header and 8 lines of data as shown below. For each prefix pair, print the pair of prefixes and the percentage by which the binary value exceeds the decimal value. Your program must compute these percentages, not enter them as literal values. Below are sample lines of the table:

> python Prefixes.py 

Dec/Bin                 diff%
------------------------------
kilo/kibi               2.40%
mega/mebi               4.86%
    <5 missing lines>
yotta/yobi             20.89%
>
For the first entry, we get 2.40% since kilo = 1000 and kibi = 1024. The difference is 24, which is 2.4% of 1000, computed by (difference / kilo) * 100. Print the pair of prefixes in a string field of width 20 and the percentage difference in a float field of width 5 with two decimal places of precision, followed by a percent sign. The percent signs must line up. The line is 26 dashes.
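
The arithmetic for the first entry can be sketched as follows. The variable names are only illustrative; the formula is the one given above.

```python
kilo = 10**3                # 1000
kibi = 2**10                # 1024

difference = kibi - kilo    # 24
percent = (difference / kilo) * 100

print(difference)               # 24
print(format(percent, ".2f"))   # 2.40
```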

Update: The line of dashes is actually 30 dashes, not 26. Also, I used fields of 20 and 5, but you can adjust those to get the spacing right. We'll be a bit forgiving if the spacing doesn't exactly match the model. I'd also suggest printing the percent sign separately, rather than using "%" as part of your format specifier. Use sep = "" to suppress the spaces that print normally inserts between its arguments, so that there's no space before the percent sign. Note that this also removes the spaces between the other fields, so you may need to adjust their widths.
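
Here is a rough sketch of how the field widths and sep = "" interact, using the first row only. The names name_field and number_field are illustrative, and the widths 20 and 5 are simply the ones mentioned above; they are not necessarily the widths that reproduce the sample table.

```python
pair = "kilo/kibi"
percent = (2**10 - 10**3) / 10**3 * 100

name_field = format(pair, "20s")         # left-justified in 20 characters
number_field = format(percent, "5.2f")   # right-justified in 5 characters, 2 decimals

print(name_field, number_field, "%")           # default: one space between items
print(name_field, number_field, "%", sep="")   # no spaces added between items
```

With sep = "", the second line is exactly 20 + 5 + 1 = 26 characters wide, which is why the widths may need adjusting to line up with a 30-dash line.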

BTW: This assignment would be difficult in many programming languages. Computing a number as big as 2**80 in C, for example, would not give the "correct" answer, at least not in standard C integer arithmetic, because the value doesn't fit in a 64-bit integer. By default, Python uses "big number" (arbitrary-precision) arithmetic for integers, so even computing 2**10000 won't cause Python to choke. Try it for yourself.
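
A small sketch of what "try it for yourself" might look like. The comparison against 2**64 - 1 (the largest value an unsigned 64-bit integer can hold) is added here only for illustration.

```python
big = 2**80

print(big)                   # 1208925819614629174706176
print(big > 2**64 - 1)       # True: too large for a 64-bit integer

huge = 2**10000
print(len(str(huge)))        # 3011 (the number of decimal digits)
```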

Note: This assignment really only requires assignments, print, simple arithmetic, and format statements. If you're tempted to use loops, lists, or anything we haven't covered, don't! Remember in every assignment that you should only use constructs we've covered to that point in the semester.

Turning in the Assignment:

The program should be in a file named Prefixes.py. Submit the file via Canvas before the deadline shown at the top of this page. Submit it to the assignment hw2 under the Assignments section by uploading your Python file.

Your file must compile and run before submission. It must also contain a header with the following format:

# Assignment: HW2
# File: Prefixes.py
# Student: 
# UT EID:
# Course Name: CS303E
# 
# Date:
# Description of Program: 

If you submit multiple times to Canvas, it will rename your file to something like Prefixes-1.py, Prefixes-2.py, etc. Don't worry about that; we'll grade the latest version.

Programming Tips:

Many assignments this semester will include this section called Programming Tips. In general, these are not hard requirements about this specific assignment; but they are useful suggestions to help you with the assignment or to become better programmers. So read and apply them!

Output format: As explained in slide set 1, there are multiple ways to run your program. If you run in interactive mode (in the Python loop), the system will automatically display the result of every command (unless the result is None). If that result is a string, it will display with string quotes ('answer'). If you're running the program in batch mode (from the command line, as e.g. python myProgram.py) or explicitly print a string, it won't appear with string quotes. If you run your program in batch mode the only things you'll see displayed are the things you explicitly print; it won't display the results of individual commands. The following is in interactive mode:

>>> string = "my string"
>>> string
'my string'
>>> print(string)
my string
>>> 
If your string output doesn't match what's shown in the assignment sample output because of the presence or absence of string quotes, that may be the reason. Usually, it's not an issue to worry about.

But remember, you must turn in a file that contains the code to do this computation. So it's not adequate to just run the steps in interactive mode. You can do that while you're debugging the steps; but you need a complete program stored in a file for your submission.

Format vs. round: A floating point number is stored in memory with a certain precision (usually 32 bits or 64 bits). For example, math.pi is 3.141592653589793. Since pi is irrational, this is still an approximation; there is no finite decimal expansion that represents pi exactly. So when you want to print pi (or any other float), you need to decide how much of the representation you want to retain. If you don't specify, you'll get a decimal approximation to as many significant digits as are stored. Trailing zeros are not printed unless you specifically ask for them.

The Python function format() is how you tell Python how you'd like a number printed. It generates a string representation suitable for printing. Notice that it doesn't change anything in memory or store a new number. If you want to see pi printed to 4 decimal places, you might do:

>>> import math
>>> math.pi
3.141592653589793
>>> format(math.pi, ".4f")
'3.1416'
>>> math.pi                 # note formatting didn't change pi
3.141592653589793

The Python function round() generates a new number that you could then store in memory. If you then print it, Python will not display trailing zeros. So round() is not a good way to get a certain precision printed.

>>> round(math.pi, 4)
3.1416
>>> math.pi                 # you didn't change pi
3.141592653589793
>>> x = 2.5002
>>> format(x, ".2f")
'2.50'
>>> y = round(x, 2)
>>> x
2.5002
>>> y                      # the rounded value you stored
2.5
Bottom line: for printing use format(). Only use round() if you really need a new number that is an approximation of the original, e.g., if you're doing limited-precision arithmetic.
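
To make the difference concrete, here is a short batch-mode sketch (the variable names are illustrative). It shows that format() hands back a string meant for display, while round() hands back a float you can keep computing with.

```python
x = 2.5002

as_text = format(x, ".2f")   # a string, trailing zero kept
as_number = round(x, 2)      # a new float, trailing zero not displayed

print(as_text)               # 2.50
print(as_number)             # 2.5
print(type(as_text))         # <class 'str'>
print(type(as_number))       # <class 'float'>
print(as_number + 1)         # 3.5
```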