Discussion:
Seeking way past end of file
Daniel Ramsbrock
2005-11-23 22:00:02 UTC
What should happen in the following four scenarios?

1. An empty file is opened, and a seek to position 20,000 occurs,
followed by a WRITE request for 50 bytes.
2. An empty file is opened, and a seek to position 20,000 occurs,
followed by a READ request for 50 bytes.
3. A file of length 400 is opened, and a seek to position 20,400 occurs,
followed by a WRITE request for 50 bytes.
4. A file of length 400 is opened, and a seek to position 20,400 occurs,
followed by a READ request for 50 bytes.

I'm thinking that 1 and 3 really boil down to the same principle
(accessing past EOF, regardless of file size), as do 2 and 4, but I
wanted to make sure there is no distinction.

I'm guessing that in 1 and 3, we should at least allocate the block
where the write is happening. What about the blocks before it? Do we
leave them deallocated or do we allocate them and zero them out? If we
leave them deallocated, what happens later when there is a read request
for one of those blocks? What about a write request?

My guess is that for 2 and 4, we should return an error (since trying to
read information past the end of the file seems like an illegal operation).

What about the case where the read/write request is longer than 4096
bytes (one GOSFS block)?

Thanks, and happy Thanksgiving to everyone,

Daniel
Iulian Neamtiu
2005-11-24 02:47:40 UTC
Post by Daniel Ramsbrock
What should happen in the following four scenarios?
1. An empty file is opened, and a seek to position 20,000 occurs,
followed by a WRITE request for 50 bytes.
The Write() succeeds and returns 50.
Post by Daniel Ramsbrock
2. An empty file is opened, and a seek to position 20,000 occurs,
followed by a READ request for 50 bytes.
The Read() succeeds, return 0. Note that 0 is not an error. It means
you're at end of file.
Post by Daniel Ramsbrock
3. A file of length 400 is opened, and a seek to position 20,400 occurs,
followed by a WRITE request for 50 bytes.
4. A file of length 400 is opened, and a seek to position 20,400 occurs,
followed by a READ request for 50 bytes.
3 is like 1 and 4 is like 2 above.
Post by Daniel Ramsbrock
I'm thinking that 1 and 3 really boil down to the same principle
(accessing past EOF, regardless of file size), as do 2 and 4, but I
wanted to make sure there is no distinction.
I'm guessing that in 1 and 3, we should at least allocate the block
where the write is happening. What about the blocks before it? Do we
leave them deallocated or do we allocate them and zero them out? If we
leave them deallocated, what happens later when there is a read request
for one of those blocks? What about a write request?
You only allocate blocks when you need them. In case 1 above,
you allocate block #4 (20000 / 4096, rounded down) and write 50 bytes at
offsets 3616...3665 within it (the start offset is 20000 % 4096 = 3616).

For case 3, you allocate block 4 and write at offsets 4016...4065
(20400 % 4096 = 4016).

If there's a write to unallocated blocks, you allocate as needed.
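
As a rough illustration of allocate-on-write, here is a minimal in-memory
sketch. It is not GOSFS code: struct toy_file, toy_write, MAX_BLOCKS, the
use of calloc (so fresh blocks start zeroed) and the rule that the file size
grows to the end of the write are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 4096u  /* one GOSFS block, per the thread */
#define MAX_BLOCKS 8u     /* hypothetical limit for this toy model */

/* Hypothetical in-memory file: a NULL entry means "block not allocated". */
struct toy_file {
    unsigned long size;
    unsigned char *blocks[MAX_BLOCKS];
};

/* Write size bytes at pos, allocating only the blocks the write touches.
 * Returns the number of bytes written, or -1 on failure. */
static long toy_write(struct toy_file *f, unsigned long pos,
                      const unsigned char *buf, unsigned long size)
{
    unsigned long done = 0;

    if (size == 0)
        return 0;
    if ((pos + size - 1) / BLOCK_SIZE >= MAX_BLOCKS)
        return -1;  /* beyond what this toy model can hold */

    while (done < size) {
        unsigned long block = (pos + done) / BLOCK_SIZE;
        unsigned long offset = (pos + done) % BLOCK_SIZE;
        unsigned long room = BLOCK_SIZE - offset;
        unsigned long length = (size - done < room) ? size - done : room;

        if (f->blocks[block] == NULL) {
            /* Assumption: a freshly allocated block starts out zeroed. */
            f->blocks[block] = calloc(1, BLOCK_SIZE);
            if (f->blocks[block] == NULL)
                return -1;
        }
        memcpy(f->blocks[block] + offset, buf + done, length);
        done += length;
    }

    /* Assumption: writing past EOF extends the file to the end of the write. */
    if (pos + size > f->size)
        f->size = pos + size;
    return (long)done;
}
```

With an empty file, a 50-byte write at position 20,000 allocates only block 4
in this model; blocks 0 through 3 stay unallocated.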

Note that in case 3 above, if the user requested a write(100), you would
have to allocate block #5 as well, and split the write into
#4 4016..4095 (80 bytes)
#5 0..19 (20 bytes)
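
The block/offset arithmetic for a 100-byte write at position 20,400 can be
sketched as a small helper. The names struct chunk and split_request are made
up for this example and are not part of GOSFS; only the 4096-byte block size
comes from the thread.

```c
#define BLOCK_SIZE 4096u  /* one GOSFS block, per the thread */

/* One piece of a request that lies entirely inside a single block. */
struct chunk {
    unsigned block;   /* block index within the file */
    unsigned offset;  /* first byte inside that block */
    unsigned length;  /* number of bytes in this piece */
};

/* Split the byte range [pos, pos + size) into per-block pieces.
 * Stores at most max_chunks entries in out and returns how many it stored. */
static unsigned split_request(unsigned long pos, unsigned size,
                              struct chunk *out, unsigned max_chunks)
{
    unsigned n = 0;

    while (size > 0 && n < max_chunks) {
        unsigned offset = (unsigned)(pos % BLOCK_SIZE);
        unsigned room = BLOCK_SIZE - offset;       /* bytes left in block */
        unsigned length = size < room ? size : room;

        out[n].block = (unsigned)(pos / BLOCK_SIZE);
        out[n].offset = offset;
        out[n].length = length;
        n++;

        pos += length;
        size -= length;
    }
    return n;
}
```

Calling split_request(20400, 100, out, 4) yields two pieces: block 4,
offset 4016, 80 bytes, and block 5, offset 0, 20 bytes.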

If there's a read from unallocated blocks, you must honor the request: if
the user requests X bytes and the whole range lies between 0 and EOF,
return X. For bytes that fall in unallocated blocks, leave the buffer as
is. However, if N out of those X bytes are from allocated blocks, copy
those N bytes from the right file offsets to the right buf positions.
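
One way to picture these read rules is the toy model below. It is a sketch
under stated assumptions, not GOSFS code: struct toy_file, toy_read and
MAX_BLOCKS are invented for the example, and shortening a read that runs past
EOF is an assumption the thread does not spell out.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4096u  /* one GOSFS block, per the thread */
#define MAX_BLOCKS 8u     /* hypothetical limit for this toy model */

/* Hypothetical in-memory file: a NULL entry means "block not allocated". */
struct toy_file {
    unsigned long size;
    unsigned char *blocks[MAX_BLOCKS];
};

/* Read up to size bytes at pos. Returns 0 at or past EOF (not an error). */
static long toy_read(const struct toy_file *f, unsigned long pos,
                     unsigned char *buf, unsigned long size)
{
    unsigned long done = 0;

    if (pos >= f->size)
        return 0;                 /* end of file */
    if (size > f->size - pos)
        size = f->size - pos;     /* assumption: short read at EOF */

    while (done < size) {
        unsigned long block = (pos + done) / BLOCK_SIZE;
        unsigned long offset = (pos + done) % BLOCK_SIZE;
        unsigned long room = BLOCK_SIZE - offset;
        unsigned long length = (size - done < room) ? size - done : room;

        if (block < MAX_BLOCKS && f->blocks[block] != NULL)
            memcpy(buf + done, f->blocks[block] + offset, length);
        /* Unallocated block: leave this part of buf as is, but still
         * count the bytes toward the return value. */
        done += length;
    }
    return (long)done;
}
```

In this model a read that straddles an unallocated block and an allocated one
returns the full count, and only the bytes backed by the allocated block are
copied into buf.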
Post by Daniel Ramsbrock
My guess is that for 2 and 4, we should return an error (since trying to
read information past the end of the file seems like an illegal operation).
What about the case where the read/write request is longer than 4096
bytes (one GOSFS block)?
Well, don't worry about that: we won't test reads/writes with a size of
more than 1000, so you can simply return an error if size is > 4096.
However, your reads/writes might cross block boundaries even for
sizes as small as 2, e.g. file pos = 4095 and you do a Read or Write with
size 2.
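
A quick way to check how many blocks a request touches, as a sketch only
(blocks_touched is a made-up helper name, and returning -1 for oversized
requests is just one way to signal the error mentioned above):

```c
#define BLOCK_SIZE 4096u  /* one GOSFS block, per the thread */

/* Number of blocks covered by the byte range [pos, pos + size),
 * or -1 if the request is larger than one block. */
static int blocks_touched(unsigned long pos, unsigned long size)
{
    unsigned long first, last;

    if (size > BLOCK_SIZE)
        return -1;                        /* rejected, as suggested above */
    if (size == 0)
        return 0;

    first = pos / BLOCK_SIZE;             /* block holding the first byte */
    last = (pos + size - 1) / BLOCK_SIZE; /* block holding the last byte */
    return (int)(last - first + 1);
}
```

For file pos = 4095 and size 2 this gives 2: byte 4095 is the last byte of
block 0 and byte 4096 is the first byte of block 1. With size capped at 4096,
a request never touches more than two blocks.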

Iulian
