In the previous two assignments, you made extensive use of a
file system without actually worrying about how it was implemented
underneath. For this last assignment, you will improve the
implementation of the file system. You will be working primarily in
the filesys directory.
You need to build project 4 on top of project 2.
Here are some files that are probably new to you. These are in the
filesys directory except where indicated:
fsutil.c
filesys.h
filesys.c
directory.h
directory.c
inode.h
inode.c
file.h
file.c
lib/kernel/bitmap.h
lib/kernel/bitmap.c
Our file system has a Unix-like interface, so you may also wish to
read the Unix man pages for creat, open, close, read, write, lseek,
and unlink. Our file system has calls that are similar, but not
identical, to these. The file system translates these calls into disk
operations.
All the basic functionality is there in the code above, so that the file system is usable from the start, as you've seen in the previous two projects. However, it has severe limitations which you will remove.
While most of your work will be in filesys, you should be
prepared for interactions with all previous parts.
By now, you should be familiar with the basic process of running the Pintos tests. See section 1.2.1 Testing, for review, if necessary.
Until now, each test invoked Pintos just once. However, an important purpose of a file system is to ensure that data remains accessible from one boot to another. Thus, the tests that are part of the file system project invoke Pintos a second time. The second run combines all the files and directories in the file system into a single file, then copies that file out of the Pintos file system into the host (Unix) file system.
The grading scripts check the file system's correctness based on the
contents of the file copied out in the second run. This means
that your project will not pass any of the extended file system tests
until the file system is implemented well enough to support
tar, the Pintos user program that produces the file that is
copied out. The tar program is fairly demanding (it requires
both extensible file and subdirectory support), so this will take some
work. Until then, you can ignore errors from make check
regarding the extracted file system.
Incidentally, as you may have surmised, the file format used for copying
out the file system contents is the standard Unix "tar" format. You
can use the Unix tar program to examine these files. The tar file
for test t is named t.tar.
Before you begin this portion of the project, test your knowledge of the code by answering the Code Reading Questions. Being able to answer those questions before you begin coding will significantly help you.
To make your job easier, we suggest implementing the parts of this project in the following order:
You can implement extensible files and subdirectories in parallel if you temporarily make the number of entries in new directories fixed.
You should think about synchronization throughout.
In addition to submitting your source code, your group is
responsible for answering the questions in the project 4 design
document template and submitting the completed file through Canvas
to the Project 4 Design and Documentation assignment.
The purpose of the design document is to explain and defend your design to us. Its grade will reflect both your answers to the questions and the correctness and completeness of the implementation of your design. It is possible to receive partial credit for speculating on the design of portions you do not implement, but your grade will be reduced due to the lack of implementation.
We recommend that you read the design document template before you start working on the project. See section D. Project Documentation, for a sample design document that goes along with a fictitious project.
The basic file system allocates files as a single extent, making it vulnerable to external fragmentation; that is, it may be impossible to allocate an n-block file even though n blocks are free. Eliminate this problem by modifying the on-disk inode structure. In practice, this probably means using an index structure with direct, indirect, and doubly indirect blocks. You are welcome to choose a different scheme as long as you explain the rationale for it in your design documentation, and as long as it does not suffer from external fragmentation (as does the extent-based file system we provide).
You can assume that the file system partition will not be larger than 8 MB. You must support files as large as the partition (minus metadata). Each inode is stored in one disk sector, limiting the number of block pointers that it can contain.
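As a rough capacity check for an indexed inode, the sketch below computes how large a file one layout can address and which index level a byte offset falls into. It assumes 4-byte sector numbers, so one 512-byte index block holds 128 pointers. The count of 12 direct pointers, the enum, and both function names are hypothetical choices for illustration, not part of the provided code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u                    /* Same value as BLOCK_SECTOR_SIZE. */
#define PTRS_PER_SECTOR (SECTOR_SIZE / 4u)  /* Assumes 4-byte sector numbers: 128. */
#define DIRECT_CNT 12u                      /* Hypothetical number of direct pointers. */

enum index_level
  {
    LEVEL_DIRECT,
    LEVEL_INDIRECT,
    LEVEL_DOUBLY_INDIRECT,
    LEVEL_TOO_BIG
  };

/* Largest file, in bytes, addressable with DIRECT_CNT direct pointers,
   one indirect pointer, and one doubly indirect pointer. */
static uint64_t
max_file_bytes (void)
{
  uint64_t sectors = DIRECT_CNT + PTRS_PER_SECTOR
                     + (uint64_t) PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return sectors * SECTOR_SIZE;
}

/* Returns the index level that holds the data sector containing OFFSET. */
static enum index_level
level_for_offset (uint64_t offset)
{
  uint64_t sector = offset / SECTOR_SIZE;
  if (sector < DIRECT_CNT)
    return LEVEL_DIRECT;
  sector -= DIRECT_CNT;
  if (sector < PTRS_PER_SECTOR)
    return LEVEL_INDIRECT;
  sector -= PTRS_PER_SECTOR;
  if (sector < (uint64_t) PTRS_PER_SECTOR * PTRS_PER_SECTOR)
    return LEVEL_DOUBLY_INDIRECT;
  return LEVEL_TOO_BIG;
}
```

Under these assumptions the layout addresses 12 + 128 + 16384 sectors, which is slightly more than 8 MB, so a single doubly indirect block would be enough for the largest partition. A different number of direct pointers changes the boundaries but not the approach.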
An extent-based file can only grow if it is followed by empty space, but indexed inodes make file growth possible whenever free space is available. Implement file growth. In the basic file system, the file size is specified when the file is created. In most modern file systems, a file is initially created with size 0 and is then expanded every time a write is made off the end of the file. Your file system must allow this, but still allow files to be created with an initial size.
There should be no predetermined limit on the size of a file, except that a file cannot exceed the size of the file system (minus metadata). This also applies to the root directory file, which should now be allowed to expand beyond its initial limit of 16 files.
User programs are allowed to seek beyond the current end-of-file (EOF). The seek itself does not extend the file. Writing at a position past EOF extends the file to the position being written, and any gap between the previous EOF and the start of the write must be filled with zeros. A read starting from a position past EOF returns no bytes.
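The seek, write, and read rules above can be modeled in memory. The following is a minimal sketch, not file system code: struct model_file, its fixed capacity, and the two functions are hypothetical stand-ins for an inode and its data blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_CAPACITY 4096  /* Hypothetical capacity standing in for free disk space. */

struct model_file
  {
    unsigned char data[MODEL_CAPACITY];
    size_t length;           /* Current end-of-file. */
  };

/* Writes SIZE bytes from BUF at OFFSET.  If OFFSET is past EOF, the gap
   between the old EOF and OFFSET is filled with zeros first.
   Returns the number of bytes written, or 0 if the write does not fit. */
static size_t
model_write_at (struct model_file *f, const void *buf, size_t size,
                size_t offset)
{
  if (offset > MODEL_CAPACITY || size > MODEL_CAPACITY - offset)
    return 0;
  if (size == 0)
    return 0;                /* In this model a zero-length write changes nothing. */
  if (offset > f->length)
    memset (f->data + f->length, 0, offset - f->length);
  memcpy (f->data + offset, buf, size);
  if (offset + size > f->length)
    f->length = offset + size;
  return size;
}

/* Reads up to SIZE bytes at OFFSET into BUF.  A read that starts at or
   past EOF returns 0 bytes. */
static size_t
model_read_at (const struct model_file *f, void *buf, size_t size,
               size_t offset)
{
  if (offset >= f->length)
    return 0;
  if (size > f->length - offset)
    size = f->length - offset;
  memcpy (buf, f->data + offset, size);
  return size;
}
```

Note that only the write moves EOF; a seek has no counterpart here because it changes nothing in the file itself.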
Writing far beyond EOF can cause many blocks to be entirely zero. Some file systems allocate and write real data blocks for these implicitly zeroed blocks. Other file systems do not allocate these blocks at all until they are explicitly written. The latter file systems are said to support "sparse files." You may adopt either allocation strategy in your file system.
Implement a hierarchical name space. In the basic file system, all files live in a single directory. Modify this to allow directory entries to point to files or to other directories.
Make sure that directories can expand beyond their original size just as any other file can.
The basic file system has a 14-character limit on file names. You may retain this limit for individual file name components, or may extend it, at your option. You must allow full path names to be much longer than 14 characters.
Maintain a separate current directory for each process. At
startup, set the root as the initial process's current directory.
When one process starts another with the exec system call, the
child process inherits its parent's current directory. After that, the
two processes' current directories are independent, so that either
process can change its own current directory without affecting the other.
(This is why, under Unix, the cd command is a shell built-in,
not an external program.)
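The inheritance rule can be pictured with a small model in which each process records its current directory as an inode sector number. Everything here is hypothetical: struct model_process, MODEL_ROOT_SECTOR, and the three functions exist only for this sketch. A real implementation would more likely keep an open directory in the thread structure so that the directory stays valid while it is in use.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t model_sector_t;  /* Stand-in for a disk sector number. */
#define MODEL_ROOT_SECTOR 1u      /* Hypothetical sector of the root directory. */

struct model_process
  {
    model_sector_t cwd;           /* Sector of this process's current directory. */
  };

/* The initial process starts in the root directory. */
static void
process_init_first (struct model_process *p)
{
  p->cwd = MODEL_ROOT_SECTOR;
}

/* A child created by exec starts with a copy of its parent's current
   directory. */
static void
process_exec (const struct model_process *parent,
              struct model_process *child)
{
  child->cwd = parent->cwd;
}

/* Changing directory affects only the calling process. */
static void
process_chdir (struct model_process *p, model_sector_t dir)
{
  p->cwd = dir;
}
```

Because the child receives a copy rather than a shared reference, later changes on either side stay private to that process.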
Update the existing system calls so that, anywhere a file name is
provided by the caller, an absolute or relative path name may be used.
The directory separator character is forward slash (/).
You must also support special file names . and .., which
have the same meanings as they do in Unix.
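One possible starting point for path handling is a helper that pulls one component at a time out of a path name. The sketch below is an assumption about how you might structure this, not provided code: next_component, path_is_absolute, and NAME_MAX_MODEL are hypothetical names, and the 14-character component limit is the one from the basic file system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_MODEL 14  /* Component length limit of the basic file system. */

/* Copies the next component of *PATHP into COMPONENT and advances *PATHP
   past it.  Returns 1 if a component was stored, 0 if the path is
   exhausted, or -1 if the component is longer than NAME_MAX_MODEL
   (in which case *PATHP is left unchanged). */
static int
next_component (const char **pathp, char component[NAME_MAX_MODEL + 1])
{
  const char *p = *pathp;
  size_t len = 0;

  while (*p == '/')        /* Skip any run of slashes. */
    p++;
  if (*p == '\0')
    {
      *pathp = p;
      return 0;
    }
  while (*p != '/' && *p != '\0')
    {
      if (len == NAME_MAX_MODEL)
        return -1;
      component[len++] = *p++;
    }
  component[len] = '\0';
  *pathp = p;
  return 1;
}

/* A path that begins with a slash is resolved from the root; any other
   path is resolved from the process's current directory. */
static bool
path_is_absolute (const char *path)
{
  return path[0] == '/';
}
```

A lookup routine could call next_component in a loop, opening each component in the directory found so far. The components . and .. come back like any other name, so they can be handled either as real directory entries or as special cases in the loop.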
Update the open system call so that it can also open directories.
Of the existing system calls, only close needs to accept a file
descriptor for a directory.
Update the remove system call so that it can delete empty
directories (other than the root) in addition to regular files.
Directories may only be deleted if they do not contain any files or
subdirectories (other than . and ..). You may decide
whether to allow deletion of a directory that is open by a process or in
use as a process's current working directory. If it is allowed, then
attempts to open files (including . and ..) or create new
files in a deleted directory must be disallowed.
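The emptiness test for remove can be sketched over an in-memory array of directory entries. The struct below is a simplified, hypothetical model with only a name and an in-use flag, and it assumes that . and .. are stored as ordinary entries, which is one design option among several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_MODEL 14  /* Component length limit of the basic file system. */

/* Simplified, hypothetical directory entry. */
struct model_entry
  {
    char name[NAME_MAX_MODEL + 1];
    bool in_use;
  };

/* Returns true if no entry other than "." and ".." is in use. */
static bool
dir_is_empty (const struct model_entry *entries, size_t count)
{
  for (size_t i = 0; i < count; i++)
    {
      if (!entries[i].in_use)
        continue;
      if (strcmp (entries[i].name, ".") == 0
          || strcmp (entries[i].name, "..") == 0)
        continue;
      return false;
    }
  return true;
}
```

In a concurrent file system this check and the removal itself would need to happen under the same directory lock, so that no entry can be added in between.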
Implement the following new system calls:
mkdir("/a/b/c") succeeds only if /a/b already exists and /a/b/c does not.
READDIR_MAX_LEN + 1 bytes, and returns true. If no entries are left in the directory, returns false.
. and .. should not be returned by readdir.
If the directory changes while it is open, then it is acceptable for some entries not to be read at all or to be read multiple times. Otherwise, each directory entry should be read once, in any order.
READDIR_MAX_LEN is defined in lib/user/syscall.h. If your
file system supports longer file names than the basic file system, you
should increase this value from the default of 14.
An inode number persistently identifies a file or directory. It is unique during the file's existence. In Pintos, the sector number of the inode is suitable for use as an inode number.
We have provided ls and mkdir user programs, which
are straightforward once the above syscalls are implemented.
We have also provided pwd, which is not so straightforward.
The shell program implements cd internally. The
programs are provided in the src/examples directory.
The pintos extract and append commands should now
accept full path names, assuming that the directories used in the
paths have already been created. This should not require any significant
extra effort on your part.
The provided file system requires external synchronization, that is, callers must ensure that only one thread can be running in the file system code at once. Your submission must adopt a finer-grained synchronization strategy that does not require external synchronization. To the extent possible, operations on independent entities should be independent, so that they do not need to wait on each other.
Multiple processes must be able to access a single file at once.
Multiple reads of a single file must be able to complete without
waiting for one another. When writing to a file does not extend the
file, multiple processes should also be able to write a single file at
once. A read of a file by one process when the file is being written by
another process is allowed to show that none, all, or part of the write
has completed. (However, after the write system call returns to
its caller, all subsequent readers must see the change.) Similarly,
when two processes simultaneously write to the same part of a file,
their data may be interleaved.
On the other hand, extending a file and writing data into the new section must be atomic. Suppose processes A and B both have a given file open and both are positioned at end-of-file. If A reads and B writes the file at the same time, A may read all, part, or none of what B writes. However, A may not read data other than what B writes, e.g. if B's data is all nonzero bytes, A is not allowed to see any zeros.
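One way to get this guarantee is to fill in the new region first and only afterwards publish the larger length to readers. The sketch below models that ordering with C11 atomics in ordinary user-level C. It is an illustration under stated assumptions, not kernel code: struct growing_file, append_bytes, and read_bytes are hypothetical, the caller is assumed to hold a per-file lock so that only one thread extends at a time, and in Pintos you would probably build the same ordering from locks rather than from stdatomic.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define GROW_CAPACITY 1024  /* Hypothetical capacity standing in for free disk space. */

struct growing_file
  {
    unsigned char data[GROW_CAPACITY];
    atomic_size_t length;   /* Number of bytes visible to readers. */
  };

/* Appends SIZE bytes from BUF.  Assumes the caller holds a per-file
   extension lock (hypothetical), so only one thread appends at a time.
   Returns the number of bytes appended, or 0 if they do not fit. */
static size_t
append_bytes (struct growing_file *f, const void *buf, size_t size)
{
  size_t old_len = atomic_load_explicit (&f->length, memory_order_relaxed);
  if (size > GROW_CAPACITY - old_len)
    return 0;
  /* Step 1: write the data while readers still see the old length. */
  memcpy (f->data + old_len, buf, size);
  /* Step 2: publish the new length only after the data is in place. */
  atomic_store_explicit (&f->length, old_len + size, memory_order_release);
  return size;
}

/* Reads up to SIZE bytes at OFFSET, never past the published length. */
static size_t
read_bytes (struct growing_file *f, void *buf, size_t size, size_t offset)
{
  size_t len = atomic_load_explicit (&f->length, memory_order_acquire);
  if (offset >= len)
    return 0;
  if (size > len - offset)
    size = len - offset;
  memcpy (buf, f->data + offset, size);
  return size;
}
```

With this ordering a reader positioned at end-of-file sees either none or all of an append, which is stricter than required, and it never sees the zeros or stale bytes that occupied the region before the write.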
Operations on different directories should take place concurrently. Operations on the same directory may wait for one another.
Keep in mind that only data shared by multiple threads needs to be
synchronized. In the base file system, struct file and struct dir
are accessed only by a single thread.
Once you have selected a partner, exchange first and last names, EIDs, and CS logins. Also, fill out the README.filesys distributed with the project and register your group as an FS Group in Canvas by the date listed in the schedule. You must follow the pair programming guidelines set forth for this class. Use the provided programming_log.filesys as your pair programming log.
Please see the Grading Criteria to understand how failure to follow the pair programming guidelines OR fill out the README.filesys will affect your grade.
After you finish your code, please use make turnin_fs (in the
src directory) to create a tarball for submission. The filename format
will be filesys_turnin.tar.gz. Then, upload the .tar.gz file to the Project 4
Test Cases assignment in Canvas. Only one member of the group
should perform the upload. Make sure you have included the necessary information
in the README.filesys. Failure to do so will result in a loss of half of the
correctness grade for all group members.
Once your group has completed the design document, please submit it to the Project 4 Design and Documentation assignment in Canvas. Make sure you have included your names, UT EIDs, and other requested information in the design document.
Yes. You must work with 1 to 3 other people. Register your group as a Filesys Group in Canvas by the date listed in the schedule.
See section 5.3.6 Turnin Instructions.
Here's a summary of our reference solution, produced by the
diffstat program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
This summary is relative to the Pintos base code, but the reference solution for project 4 is based on the reference solution to project 3. Thus, the reference solution runs with virtual memory enabled. See section 4.4 FAQ, for the summary of project 3.
The reference solution represents just one possible solution, includes a large number of comments, and also includes the code from previous projects. Many other solutions are also possible and many of those differ greatly from the reference solution. Some excellent solutions may not modify all the files modified by the reference solution, and some may modify files not modified by the reference solution.
 Makefile.build       |    5
 devices/timer.c      |   42 ++
 filesys/Make.vars    |    6
 filesys/directory.c  |   99 ++++-
 filesys/directory.h  |    3
 filesys/file.c       |    4
 filesys/filesys.c    |  194 +++++++++-
 filesys/filesys.h    |    5
 filesys/free-map.c   |   45 +-
 filesys/free-map.h   |    4
 filesys/fsutil.c     |    8
 filesys/inode.c      |  444 ++++++++++++++++++-----
 filesys/inode.h      |   11
 threads/init.c       |    5
 threads/interrupt.c  |    2
 threads/thread.c     |   32 +
 threads/thread.h     |   38 +-
 userprog/exception.c |   12
 userprog/pagedir.c   |   10
 userprog/process.c   |  332 +++++++++++++----
 userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 userprog/syscall.h   |    1
 vm/frame.c           |  161 ++++++++
 vm/frame.h           |   23 +
 vm/page.c            |  297 +++++++++++++++
 vm/page.h            |   50 ++
 vm/swap.c            |   85 ++++
 vm/swap.h            |   11
 28 files changed, 2228 insertions(+), 286 deletions(-)
Can BLOCK_SECTOR_SIZE change?
No, BLOCK_SECTOR_SIZE is fixed at 512. For IDE disks, this
value is a fixed property of the hardware. Other disks do not
necessarily have a 512-byte sector, but for simplicity Pintos only
supports those that do.
The file system partition we create will be 8 MB or smaller. However, individual files will have to be smaller than the partition to accommodate the metadata. You'll need to consider this when deciding your inode organization.
a//b be interpreted?
Multiple consecutive slashes are equivalent to a single slash, so this
file name is the same as a/b.
/../x?
The root directory is its own parent, so it is equivalent to /x.
/ be treated?
Most Unix systems allow a slash at the end of the name for a directory, and reject other names that end in slashes. We will allow this behavior, as well as simply rejecting a name that ends in a slash.
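The three path name rules in these answers can be checked with a purely lexical sketch that rewrites an absolute path into its shortest equivalent form. The function canonicalize is hypothetical and is meant only to make the rules concrete. A real lookup would walk directories on disk instead, and this sketch takes the permissive option of accepting a trailing slash on any name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes the shortest equivalent form of the absolute path PATH into OUT,
   which has room for OUT_SIZE bytes.  Returns 0 on success, or -1 if PATH
   is not absolute or OUT is too small. */
static int
canonicalize (const char *path, char *out, size_t out_size)
{
  const char *p = path;
  size_t len = 0;            /* Length of OUT so far; 0 means the root. */

  if (path[0] != '/' || out_size < 2)
    return -1;
  while (*p != '\0')
    {
      const char *start;
      size_t n;

      while (*p == '/')      /* Consecutive slashes act as one. */
        p++;
      start = p;
      while (*p != '/' && *p != '\0')
        p++;
      n = (size_t) (p - start);

      if (n == 0 || (n == 1 && start[0] == '.'))
        continue;            /* Trailing slash or "." adds nothing. */
      if (n == 2 && start[0] == '.' && start[1] == '.')
        {
          /* Drop the last component.  At the root there is nothing to
             drop, because the root is its own parent. */
          while (len > 0 && out[--len] != '/')
            continue;
          continue;
        }
      if (len + 1 + n + 1 > out_size)
        return -1;
      out[len++] = '/';
      memcpy (out + len, start, n);
      len += n;
    }
  if (len == 0)
    out[len++] = '/';
  out[len] = '\0';
  return 0;
}
```

For example, under these rules /a//b becomes /a/b and /../x becomes /x.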