Storing and Reading Data

In the last assignment, you created a simple trajectory of a particle undergoing 1D diffusion. Sometimes we want to store the results for later analysis. For example, we may want to calculate several different statistics (mean distance from zero, mean squared distance from zero, how far the particle travels in a certain period, etc.) without knowing in advance which ones we will need.

To do this, we usually write the data to a file. If we know how the data was written out, we can then read it back into Python for further analysis later.

Assignment

Modify your program from homework 3a to record the positions of the particle as it moves. Do this inside diffusion_trace by writing each position to a file. Write the positions as floating point numbers in a text format, one per line. For example, the first few lines of a file might look like this:

0.0000000000000000
0.3469847920605084
1.2621253254425107
1.716050985244122
1.0761938730404437
-0.2808828182713501

which corresponds to a system where the particle moves right for three steps, then left for two. Remember: you are recording the particle's position, not its movement, so the first entry in the file should always be 0.00000.

Of course, this means you will need to pass a filename to diffusion_trace by adding an argument to the function.
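As a rough illustration of the file format only, here is a minimal sketch of writing positions one per line. The name write_positions, its parameters, and the uniform random step are all hypothetical placeholders; the step rule is not the one from homework 3a, and your diffusion_trace will have its own arguments.

```python
import random


def write_positions(filename, num_steps, step_size=1.0, seed=None):
    """Write a random-walk trace to filename, one position per line.

    Hypothetical stand-in for diffusion_trace: the step rule below is a
    placeholder, not the rule from homework 3a.
    """
    rng = random.Random(seed)
    position = 0.0
    with open(filename, "w") as trace_file:
        # Record the starting position first, so the first line is always 0.0.
        trace_file.write(f"{position!r}\n")
        for _ in range(num_steps):
            position += rng.uniform(-step_size, step_size)
            # repr() keeps every digit needed to read the float back exactly.
            trace_file.write(f"{position!r}\n")
```

Note that a run of N steps produces N + 1 lines, because the starting position is recorded as well.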

Write another function, read_trace_and_get_mean, which takes a filename. It should open the file named by that argument and read in the data while computing the mean. Just like in the Minimax homework, you should not use lists for this! Have this function return the mean of the numbers in the file.
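One possible shape for a list-free mean is to keep a running total and a count while reading the file line by line. This is a sketch, not the required solution; skipping blank lines and returning None for an empty file are assumptions that the assignment does not specify.

```python
def read_trace_and_get_mean(filename):
    """Return the mean of the numbers in filename, one number per line."""
    total = 0.0
    count = 0
    with open(filename) as trace_file:
        # Iterating over the file yields one line at a time, so the whole
        # trace never has to be held in memory.
        for line in trace_file:
            line = line.strip()
            if not line:
                continue  # assumption: ignore blank lines
            total += float(line)
            count += 1
    if count == 0:
        return None  # assumption: an empty file has no mean
    return total / count
```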

Finally, modify your main() function. It should now ask "Are we generating or analyzing data?". If the answer is "generating", ask for the force, diffusion speed, and number of steps as before. Also ask for a filename to store the results in. Generate the data as before and write it to the file.

If the answer is "analyzing", ask for the filename. Read the trace from the file and display the mean to the user. Remember that the file specified by the user may not exist--you should handle this case gracefully. (Hint: os.path.isfile() may prove useful here--it is available after import os).
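The hint above can be sketched as a small check that runs before the file is opened. The helper name missing_file_message is hypothetical; the message text is taken from the example session below.

```python
import os


def missing_file_message(filename):
    """Return an error message if filename is not an existing file, else None."""
    # os.path.isfile() is False for missing paths and also for directories.
    if not os.path.isfile(filename):
        return "The file does not exist."
    return None
```

A caller could print the message and stop when the result is not None, and only otherwise go on to read the trace.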

Examples

$ python Diffusion1D.py
Are we generating or analyzing data? generating
Please enter a force: -2.0
Please enter a diffusion speed: 0.5
Please enter a number of steps: 10
Please enter a filename to store the results in: data.txt
Simulation results written to data.txt
$ python Diffusion1D.py
Are we generating or analyzing data? analyzing
Please enter a filename: data.txt
The mean position is 0.05
$ python Diffusion1D.py
Are we generating or analyzing data? analyzing
Please enter a filename: non_existent_file.txt
The file does not exist.

Insight

Generate a few files with 1,000, 10,000, and 100,000 steps (any value of force and diffusion speed will do). Then, analyze the files and get the file size for each (you don't have to do this in Python--you can use Explorer on Windows/Finder on macOS if this is easier).

On average, how many bytes do you need to store a step? Remembering that a float is 8 bytes in memory, do you think we could do better?
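If you would rather measure this in Python, a minimal sketch follows. The helper name bytes_per_line is hypothetical; it divides the file size by the number of lines, and a trace of N steps has N + 1 lines because the starting position is stored too. The constant shows the size of one double-precision float in a packed binary format, for comparison.

```python
import os
import struct


def bytes_per_line(filename):
    """Return the average number of bytes used per line of filename."""
    size_in_bytes = os.path.getsize(filename)
    # Open in binary mode so newline characters are counted as stored on disk.
    with open(filename, "rb") as trace_file:
        line_count = sum(1 for _ in trace_file)
    if line_count == 0:
        return 0.0  # assumption: report 0 for an empty file
    return size_in_bytes / line_count


# Size of one 64-bit float when packed as binary rather than written as text.
BINARY_FLOAT_SIZE = struct.calcsize("<d")
```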

Submission

Submit a single file named Diffusion1D.py on Canvas. This is the same name as the last assignment--that's fine! Your file needs to run without errors. It should also have a header with the following information (this goes in your source file, not in the program output):

# File: Diffusion1D.py
# Student: 
# Course: Intro to Programming
# 
# Date:
# Description of Program:
# 
# How many bytes (on average) do you need to store a step?
# Could we do better? Why or why not?

The description should be a short (1-3 sentence) description of what the program does. Do not describe how it's written!

Remember to handle nonsense inputs.
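One common way to cope with nonsense numeric input is to attempt the conversion and catch the failure. The helper below is a hypothetical sketch; what your program does when it gets None back (ask again, or print a message and stop) is up to you.

```python
def parse_number(text, kind=float):
    """Convert text to a number with kind (float or int); return None on failure."""
    try:
        return kind(text.strip())
    except ValueError:
        # Raised for inputs such as "abc", or "2.5" when kind is int.
        return None
```

For example, parse_number("0.5") gives 0.5, while parse_number("ten", int) gives None.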