CS 314 Specification 3 - Using Multiple Classes - Baby Names

Programming Assignment 3: Individual Assignment. You must complete this assignment on your own. You may not acquire from any source (e.g. another student, an internet site, generative AI / chatbot / LLM such as chatGPT) a partial or complete solution to a problem or project that has been assigned. You may not show another student your solution to the assignment. You may not have another person (current student, former student, tutor, friend, anyone) “walk you through” how to solve an assignment. You may get help from the instructional staff. You may discuss general ideas and approaches with other students but you may not develop or debug code together. Review the class policy on collaboration from the syllabus.

The purposes of this assignment are

  1. to practice using Java ArrayLists
  2. to implement a program that uses multiple classes. (object based programming, encapsulation)
  3. to practice implementing individual classes based on a given specification

CAUTION: Take care to ensure your IDE does not add any non standard Java imports for other Java classes called Name or Names. In the past students submitted code with bad imports and lost all correctness points due to compile errors. Compile your .java files on the lab machines from the command line if you want to be sure you do not have any non standard Java imports. 

Files:

  Type | File | Responsibility
  Source code | NameSurfer.java | Provided by me and you.
  Source code | NameRecord.java | Provided by you.
  Source code | Names.java | Provided by me and you.
  Data files | names.txt (the primary data file), names2.txt (another data file with a different starting decade and a different number of ranks per name), names4.txt (with the ranks from the decade of 2010 added) | Provided by me.
  Sample run log | nameSurferLog.txt (your output should roughly match this example; we won't be grading on the output of NameSurfer) | Provided by me.
  Submission | NameSurfer.java, NameRecord.java, and Names.java to Assignment 3 on Gradescope | Provided by you.

Note, the data files we use will always have a single blank line at the end. Meaning the last line with data shall end with a newline, just like all the other lines. In the past students copied and pasted the files from webpages, which sometimes removes the final blank line. They created a solution that worked on these incorrect data files, but failed all tests when grading with the proper data files. If you open the file in a text editor the last line of names.txt should look like this image.

Description: This assignment is based on an assignment created by Nick Parlante and Stuart Reges' version of that assignment.

For this assignment you may not share your tests, but you CAN share your results when using different data files such as names2.txt and / or names4.txt.

Complete a program that allows a user to query a database of the 1000 most popular baby names in the United States per decade for a given number of decades under the constraints of the General Assignment Requirements and as described below.

As always, you may add helper methods and should do so to provide structure to the program and reduce redundancy.

One additional constraint: You must use the ArrayList class in your solution as discussed below.

Your program processes a file with data obtained from the Social Security Administration. The SSA has a web site showing the distribution of names chosen for children over the last 100 years in the US (https://www.ssa.gov/OACT/babynames/decades/).

Nick Parlante got the idea for this assignment from an article by Peggy Orenstein of the New York Times, Where have all the Lisas Gone?

Recently there was an article on predicting someone's age based on their name.

And some names that surge in popularity in a given decade turn out to be very old names. For instance, Tiffany. (Link to a CGP Grey video on YouTube that covers the history of the name Tiffany.)

The first example data file you are given is based on the 1000 most popular male and female names for kids born in the US going back to 1900.

The first two lines of the file are the base decade and the number of decades. The base decade indicates the decade for the first rank. The second line is an integer that indicates the number of decades each name is ranked. This value shall be greater than or equal to 2. The given file starts

        1900
        11


The rest of the data file consists of names and their ranks. There is no indication in the data I give you whether a name is a female or male name. On each line there is a name, followed by the rank of that name in the decades 1900, 1910, 1920, ..., 2000 (11 numbers), for the example file. Your program must handle different starting decades and number of ranks correctly.

Of course the starting decade and number of decades can vary. Look at the other example data files, names2.txt and names4.txt.
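As a rough illustration of the header format described above, here is a minimal sketch of reading the two header lines and computing the decade each rank position refers to. The class name HeaderSketch and the method readDecades are hypothetical names used only for this illustration. They are not part of the provided files, and this is not the required design.

```java
import java.util.ArrayList;
import java.util.Scanner;

// Hypothetical helper, for illustration only.
class HeaderSketch {

    // Reads the base decade and the number of decades from the first two
    // lines and returns the decade that each rank position corresponds to.
    static ArrayList<Integer> readDecades(Scanner in) {
        int baseDecade = Integer.parseInt(in.nextLine().trim());
        int numDecades = Integer.parseInt(in.nextLine().trim());
        ArrayList<Integer> decades = new ArrayList<>();
        for (int i = 0; i < numDecades; i++) {
            decades.add(baseDecade + 10 * i); // decades are 10 years apart
        }
        return decades;
    }

    public static void main(String[] args) {
        // A Scanner over a String stands in for the data file here.
        Scanner in = new Scanner("1900\n11\n");
        System.out.println(readDecades(in));
    }
}
```

Nothing in the sketch is tied to 1900 or 11. A header of 1920 and 5 would give the decades 1920 through 1960.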

If a line does not have the correct number of ranks based on the integer at the top of the file, your program shall ignore that line. (In other words do NOT include a NameRecord for this data in the Names object for lines in the data file without the correct number of ranks.)

Likewise, if a name has all zeros for ranks then do NOT include a NameRecord object for that line in the Names object.

You may assume a given String never appears more than once as a name in a file. You may assume each line starts with a token representing a name and any tokens that follow this can be read as non negative ints in the range [0, 1000].
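The two filtering rules above (wrong number of ranks, or all ranks zero) can be sketched roughly as follows. LineFilterSketch and parseRanks are hypothetical names, and returning null for a skipped line is just one possible convention. Treating a completely blank line as a line to skip is an extra assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Scanner;

// Hypothetical helper, for illustration only.
class LineFilterSketch {

    // Returns the ranks on the line, or null if the line should be ignored
    // (wrong number of ranks, or every rank is 0).
    static ArrayList<Integer> parseRanks(String line, int numDecades) {
        Scanner tokens = new Scanner(line);
        if (!tokens.hasNext()) {
            return null; // assumption: blank lines are simply skipped
        }
        tokens.next(); // the first token is the name itself
        ArrayList<Integer> ranks = new ArrayList<>();
        boolean anyRanked = false;
        while (tokens.hasNextInt()) {
            int rank = tokens.nextInt();
            ranks.add(rank);
            if (rank != 0) {
                anyRanked = true;
            }
        }
        boolean keep = ranks.size() == numDecades && anyRanked;
        return keep ? ranks : null;
    }
}
```

With numDecades equal to 11, the line "Samir 0 0 0 0 0 0 0 0 920 0 798" is kept, while "Sam 1 2 3" and a line of eleven zeros are both ignored.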

A rank of 1 indicates the name was the most popular name that decade, while a rank of 997 was not very popular. A 0 in the data file means the name did not appear in the top 1000 that decade.

...
Sam 58 69 99 131 168 236 278 380 467 408 466
Samantha 0 0 0 0 0 0 272 107 26 5 7
Samara 0 0 0 0 0 0 0 0 0 0 886
Samir 0 0 0 0 0 0 0 0 920 0 798
Sammie 537 545 351 325 333 396 565 772 930 0 0
Sammy 0 887 544 299 202 262 321 395 575 639 755
Samson 0 0 0 0 0 0 0 0 0 0 915
Samuel 31 41 46 60 61 71 83 61 52 35 28
Sandi 0 0 0 0 704 864 621 695 0 0 0
Sandra 0 942 606 50 6 12 11 39 94 168 257
...

Note, a 0 in the data file means the name was NOT RANKED in the top 1000 during the corresponding decade. It has some unknown rank greater than 1000. When you store a 0 from the data file in your NameRecord objects you may use something other than 0 if you think it will make your algorithms easier to implement. This is a suggestion, not a requirement. There are some trade-offs involved. (Recall altering the way data is stored to fit our needs is part of the power of encapsulation.)
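To make the trade-off concrete, here is a minimal sketch of one possible internal representation. The sentinel value 1001 and the names UnrankedSketch, UNRANKED, toStored, toFile, and morePopular are all assumptions chosen for this illustration. Nothing in the assignment requires them.

```java
// Hypothetical helper, for illustration only.
class UnrankedSketch {

    // Assumed sentinel: any value larger than the worst real rank (1000).
    static final int UNRANKED = 1001;

    // Converts a rank as written in the data file to the stored form.
    static int toStored(int fileRank) {
        return fileRank == 0 ? UNRANKED : fileRank;
    }

    // Converts a stored rank back to the form used in the data file.
    static int toFile(int storedRank) {
        return storedRank == UNRANKED ? 0 : storedRank;
    }

    // With the sentinel, "more popular" is a plain comparison and
    // an unranked decade needs no special case.
    static boolean morePopular(int storedA, int storedB) {
        return storedA < storedB;
    }
}
```

The benefit is simpler comparisons. The cost is that every place that displays or reports a rank has to convert back, so that a user still sees 0 for an unranked decade.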

Also note the data does not indicate if a name was a girl's name or a boy's name. If a name appeared on the list of boys' and girls' names for a given decade the rank closest to one was used.

We see that “Sam” was #58 in 1900 and is slowly moving down. “Samantha” popped on the scene in 1960 and is moving up strong to #7. “Samir” barely appears in 1980, but by 2000 is up to #798.

You are given one partial class, NameSurfer.java. This is the main driver class. After creating the database of names encapsulated in a class called Names.java the program displays a menu and allows the user to make various queries of the database.

Important. Do not hard code or assume every file will start with 1900 and / or have 11 ranks per name. Your program must read those values from the file and work correctly. I recommend you create your own results with names2.txt and names4.txt, then post the results to Ed discussion to compare with your classmates.


Suggested steps for implementing the program.

0. These suggested steps describe implementing the requirements. You may add more helper methods if you want. You may store data in a different form than the input file. For example, the file uses a 0 to indicate the name had a rank greater than 1000 for a given decade. You can store a value other than 0 if you want. (The value for "not ranked in a given decade" is on the other side of the wall of abstraction.)

1. Implement and test a class called NameRecord. Each NameRecord object stores the data for an individual name, including the name itself (a String), the base decade (decade of the first rank), and the rank of the name for each decade. The ranks for each decade must be stored in an ArrayList of Integers. (This is to give you practice using ArrayList.) The class must have the following properties:

After completing all these methods you shall thoroughly test the NameRecord class using individual lines from the names.txt file or with your own data. Include your testing code in a method in your NameSurfer class even though it will not be called when the program is run. Part of the assignment grade will be based on the tests you write for the NameRecord class. For this assignment you may not share tests.
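As a small illustration of working with ranks stored in an ArrayList of Integers together with a base decade, the sketch below maps a position in the list back to a decade. BestDecadeSketch and bestDecade are hypothetical names, not the required NameRecord interface. The sketch assumes unranked decades are stored as 0, exactly as in the data file, and that on a tie the earliest decade wins. Both are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class BestDecadeSketch {

    // Returns the decade with the best (smallest nonzero) rank,
    // or -1 if the name is unranked in every decade.
    static int bestDecade(ArrayList<Integer> ranks, int baseDecade) {
        int bestIndex = -1;
        for (int i = 0; i < ranks.size(); i++) {
            int rank = ranks.get(i);
            boolean ranked = rank != 0;
            if (ranked && (bestIndex == -1 || rank < ranks.get(bestIndex))) {
                bestIndex = i; // strictly better, so ties keep the earlier decade
            }
        }
        return bestIndex == -1 ? -1 : baseDecade + 10 * bestIndex;
    }

    public static void main(String[] args) {
        // Samantha's ranks from the sample data, base decade 1900.
        ArrayList<Integer> samantha = new ArrayList<>(
                Arrays.asList(0, 0, 0, 0, 0, 0, 272, 107, 26, 5, 7));
        System.out.println(bestDecade(samantha, 1900)); // prints 1990
    }
}
```

Individual lines from names.txt, such as the Samantha line used here, make convenient test inputs because the expected answer can be read off by hand.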

2. Implement and test the Names class. This class stores all of the NameRecord objects in an ArrayList. (private ArrayList<NameRecord> names) This class must have the following methods. All methods that return an ArrayList of NameRecords or an ArrayList of Strings must return it in sorted ascending order based on the names. Do not change the method headers provided in the Names.java file.
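The sorted-order requirement in step 2 can be sketched as follows for a list of Strings. SortedMatchesSketch and matches are hypothetical names and do not correspond to the method headers in Names.java. Matching on a case-insensitive substring and sorting by the natural String ordering are both assumptions of this sketch. Check the provided method comments for the actual rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;

// Hypothetical helper, for illustration only.
class SortedMatchesSketch {

    // Returns every name containing the given text, ignoring case,
    // in ascending order.
    static ArrayList<String> matches(ArrayList<String> names, String part) {
        ArrayList<String> result = new ArrayList<>();
        String target = part.toLowerCase();
        for (String name : names) {
            if (name.toLowerCase().contains(target)) {
                result.add(name);
            }
        }
        Collections.sort(result); // natural String ordering
        return result;
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>(
                Arrays.asList("Sandra", "Samantha", "Lisa", "Sam"));
        System.out.println(matches(names, "sam")); // prints [Sam, Samantha]
    }
}
```

If the names are already kept in sorted order inside the class, a result built by scanning them front to back comes out sorted without a separate sort. That is a design choice this sketch does not make.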

3. Complete the methods in the NameSurfer class and add your method for the missing menu option, your interesting search or processing.

The menu choices must be:

1 to search for names.
2 to display data for one name.
3 to display all names that appear in only one decade.
4 to display all names that appear in all decades.
5 to display all names that are more popular in every decade.
6 to display all names that are less popular in every decade.
7 to perform the method of your own design from your Names class
8 to quit

For expected program behavior review the sample run log. Your output should match that of the sample run log for the various operations, except that you will not trim the output for operations that result in a large number of names and your output shall include the description of the search you developed. Also, we will not grade based on the output of NameSurfer, rather we will call methods from your Names class and check those results. So there can be minor differences in the output of your NameSurfer and what is shown in the log.

4. Neat searches. This is an interesting application because when your program is finished you can investigate various trends in naming children. In a comment at the top of the file discuss one interesting trend you found and back it up with data / results. Here are some examples. (You may not use these as your interesting trend. Do not share your interesting trend on the discussion group.)


Submission: Fill in the header for NameSurfer.java and copy it into NameRecord.java and Names.java. Replace <NAME> with your name. Note, you are stating, on your honor, that you did the assignment on your own. Submit NameSurfer.java, NameRecord.java, and Names.java to assignment 3 on Gradescope.


Checklist: Did you remember to:

  1. review and follow the general assignment requirements including the program hygiene requirements?
  2. ensure your classes are part of the default package? (in other words, no package statements in your source code)
  3. ensure you do not have any extraneous imports for Name or Names classes?
  4. work on the assignment individually? This includes making your own tests (this assignment only)
  5. fill in the header in NameSurfer.java and Names.java and copy it to your NameRecord.java file?
  6. complete the NameRecord class?
  7. include your tests of the NameRecord class in the NameSurfer class? Do not share your tests on this assignment. You can share the results of calls to the Names method with data files other than names.txt on Piazza.
  8. complete the Names class with the required methods?
  9. complete the NameSurfer class?
  10. add your own menu option for an interesting search and document it at the top of the NameSurfer class?
  11. use an ArrayList of Integers to store the ranks in the NameRecord class and an ArrayList of NameRecords in the Names class to store all the NameRecords?
  12. comment on an interesting trend or pattern you found using your completed program at the top of the NameSurfer class?
  13. turn in your files (NameSurfer.java, Names.java, and NameRecord.java) to Assignment 3 on Gradescope by Thursday, September 19?