Why not a simple solution:
1. Programs should call close() on stdout and report errors.
2. It's the job of whoever creates the open file description to fsync() it afterwards if desired.
3. If somebody runs a file system or hardware that ignores fsync() or hides close() errors, it's their own fault.
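A minimal Python sketch of point 1, assuming the program writes through an ordinary buffered stream object. `finish_output` is a hypothetical helper name used only for illustration, not an existing API:

```python
def finish_output(out, err):
    """Flush and close `out`; report a failure on `err`.

    Returns the exit status the program should use.
    """
    try:
        out.flush()   # push buffered data to the kernel; write errors surface here
        out.close()   # closes the underlying descriptor; may also raise OSError
    except OSError as exc:
        print(f"error writing output: {exc}", file=err)
        return 1
    return 0
```

A program following this convention would end with something like `sys.exit(finish_output(sys.stdout, sys.stderr))`, so that a failed flush or close turns into a non-zero exit status instead of being swallowed.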
If you `hello > out.txt`, then it's not `hello` that creates and opens `out.txt`; the calling shell does it. So if you use `>` redirection, you should fsync in the calling shell.
Is there a drawback to this approach?
> LLVM tools were made to close stdout [..] and it caused so many problems that this code was eventually reverted
It would be good to know what those problems were.
> Programs should call close() on stdout and report errors.
Programs have never called open() to obtain stdin, stdout and stderr. They are inherited from the shell. What would be a meaningful way to report errors if the basic output streams are unreliable? If close(stdout) fails, we would need to write to stderr. Then you will have exactly the same error handling issue with closing stderr.
It's a flaw in the design of Unix where polymorphic behaviour is achieved through file descriptors. Worse is better...
> It's a flaw in the design of Unix where polymorphic behaviour is achieved through file descriptors. Worse is better...
Looks to me like it's a flaw in the signature of `write`. There should be a way to recover the status without changing the descriptor status, and there should be a way to ensure you get the final status, blocking if necessary.
This can even be fixed in a backwards compatible way, by creating a new pair of functions.
> Is there a drawback to this approach?
You mean, apart from no existing code working like that? It is not possible for the process that creates a descriptor to fsync it, because in many very important cases that descriptor outlives the process.
What do you propose should "exec cat a.txt > b.txt" shell command do?
> no existing code working like that
That doesn't really matter for discussing how correct code _should_ be written.
Also, a good amount of existing code works like that. For example, if you `with open(..) as f:` a file in Python and pass it as an FD to a `subprocess` call, you can fsync and close it fine afterwards, and Python code bases that care about durability and correct error reporting do that.
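A rough sketch of that pattern, assuming a POSIX system. `run_with_durable_stdout` is a made-up name for illustration:

```python
import os
import subprocess


def run_with_durable_stdout(argv, path):
    """Run `argv` with stdout redirected to `path`, then fsync and close the file.

    The caller created the descriptor, so the caller is the one that syncs it.
    """
    with open(path, "wb") as f:
        # The child inherits the descriptor as its stdout.
        subprocess.run(argv, stdout=f, check=True)
        # The child has exited; ask the kernel to push its writes to storage.
        os.fsync(f.fileno())
    # Leaving the `with` block closes the file; a failing close raises OSError.
```

For example, `run_with_durable_stdout(["cat", "a.txt"], "b.txt")` plays the role of the shell in `cat a.txt > b.txt`, except that the parent fsyncs and checks the close itself. `check=True` additionally turns a non-zero exit status of the child into an exception.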
> What do you propose should "exec cat a.txt > b.txt" shell command do?
That code would be wrong according to my proposed approach of who should be responsible for what (which is what the blog post discusses).
If you create the `b.txt` FD and you want it fsync'ed, then you can't `exec`.
It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.
> you can fsync and close it fine afterwards
No you cannot. Once you pass a descriptor to another process, that process can pass it to yet another process, fork and detach, send it via SCM_RIGHTS, give a "/proc/PID/fd/N" path to something, etc.
Never assume descriptor cleanup will happen, unless you have complete control over everything.
I don't understand your point.
If you construct scenarios where the subprocess you call daemonizes and outlives your program, then of course there isn't any convention your code should follow because your code isn't in charge -- it could follow whatever logic and it wouldn't matter. So then there's no possibly correct solution /anyway/.
The question of the original post is "What convention should programmers use for fsyncing files in standard scenarios?", for example, "Should cat fsync?". As the post says: "Who should be responsible?"
I'm suggesting an answer to that.
I don't understand the point of "but what if `cat` double-forks". It doesn't, and surely, if you're calling a program that daemonizes, you know that it does, and that the rules about who needs to fsync file descriptors will necessarily change then.
> That doesn't really matter for discussing how correct code _should_ be written.
It absolutely does when you're talking about the semantics of virtually every program on earth.
> It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.
There are many cases where the one calling malloc cannot be the one calling free and must explicitly document to callers/callees who is responsible for memory deallocation. This is a good example of where no convention exists and it's contextual.
But open files aren't memory and one cannot rely on file descriptors being closed without errors in practice, so people don't, and you can't just repave decades of infrastructure for no benefit out of ideological purity.
> There are many cases where the one calling malloc cannot be the one calling free and must explicitly document
That's fine. Special cases, documented deviation from the default convention.
> one cannot rely on file descriptors being closed without errors in practice, so people don't
You mean "so people don't call close(), and the error gets swallowed" (like the article points out for `cat`)? How's that good? Why is improving that "no benefit"?
> you can't just repave decades of infrastructure
Of course you can.
There are also lots of projects that were written without checking the return value of malloc() and then crashing. People make PRs and those get fixed.
Similarly people can come to the conclusion that LLVM and cat should call close() to not swallow errors, and then it will be done.
I fully agree.
The blog post is essentially a long-winded way of saying that there isn't a compatible way to safely call `close` given all programs ever written. Yet, I think we already knew that.
> It would be good to know what those problems were.
Idk which problems LLVM had, but closing stdout (stderr) long before exiting may make the next open() return 1 (2), and voilà, some stray printf() now writes right into your database.
If you have to close std*, at least dup2() the null device into it; that was common advice.
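A small sketch of that advice, assuming a POSIX system and a single-threaded program (another thread could grab the number between the close and the open). `close_and_park` is a hypothetical helper name:

```python
import os


def close_and_park(fd):
    """Close `fd`, then point the same number at the null device."""
    try:
        os.close(fd)  # a failing close() raises OSError to the caller
    finally:
        devnull = os.open(os.devnull, os.O_RDWR)
        if devnull != fd:
            # A lower-numbered descriptor was free, so copy onto `fd` explicitly.
            os.dup2(devnull, fd)
            os.close(devnull)
        else:
            # open() already returned `fd` (the lowest free number). os.open()
            # creates non-inheritable descriptors, so make it inheritable again,
            # as the standard descriptors normally are.
            os.set_inheritable(fd, True)
```

After `close_and_park(1)`, a later `open()` can no longer be handed descriptor 1, and a stray write to it is discarded by the null device instead of landing in an unrelated file. The explicit `os.close()` comes first because `dup2()` on its own would close the old descriptor without reporting any error.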
Recent related thread about the interactions with finalizers:
It seems to me that the reference to NFS is a red herring. If I disconnect a disk after a program has terminated and before the OS has completed write back, the data is lost, no matter what close has returned. This was well known when removable writable media was more common. Still possible with hot pluggable disks.
If you want to make sure that the data is on stable storage you need fsync (and even then there is no guarantee that the disk will not just die or corrupt data).
If you are writing to a generic FS and don't know if what's on the other side is even a file, then guaranteeing persistence is not your responsibility. Some higher level component (for example the script invoking your program) will implement the transactional behaviour.
Remember that your program can always be kill-9'd: you can never guarantee that your output to a pipe is consistent.
Meta: I don't think "close" should be title-cased here, and indeed it isn't in the article. Also, I think it should be `close()`, with parens, that is the conventional way.
There are, I think, two separate but related issues here:
The first is that file I/O is essentially broken with respect to error reporting. A write is simultaneously treated as a synchronous operation that will wait until it succeeds or fails by most application code, and yet in the kernel, it's a request that may eventually fail. And those errors can only be reported in circumstances where doing anything about the error tends to be impossible. Worse yet, there's a tendency to even throw away these errors on the kernel side, so you have to be really, really diligent to make sure you catch them (this prompted the PostgreSQL folks to rage at the Linux kernel for a while over the braindead behavior of fsync).
At the end of the day, the issue is that most clients would probably be happy with the notion of I/O being asynchronous and eventually succeeding, so long as a) I/O happens-before edges could be specified and b) there were a delayed success/failure callback that was actually truthful, and not an endless game of people lying about it to win on benchmarks to cause you to go the next level of "no really, this is the real final callback mechanism".
The other issue is that there are just some cases where error handling just ... doesn't make sense. And errors for a basic print to stdout or stderr are quite frankly in that boat. If printing a message to stdout failed, what action are you going to be able to take? What action is even reasonable? (Note that this analysis isn't the same for programs that are effectively meant to be part of a Unix shell pipeline, processing stdin into stdout; this is more for bare printf, rather than writing to a stream that just happens to be stdout instead of a file.)
Another reason failing to check errors from "close" is in practice less of a big deal than some people seem to think is that if close fails, there's often nothing to be done about it. If it fails the odds that just retrying is going to work are pretty low. If it's not interactive there may not even be a human to notify. I've actually been trying to be better about this but so often the only practical difference is that hopefully a log line comes out and the reason the close failed won't also have trashed the log itself. And it becomes very hard to test because close doesn't actually fail all that often. If close failures are 0.000001% of my theoretical failure space I'm much more worried about the 5% cases.
I'm not saying to casually not care, exactly, I'm just saying that there are some solid reasons why very few programs "correctly" handle this, yet largely, life has gone on and compared to, say, failures to correctly encode output or failures to enforce permissions this is noise.
Once close() has been called on a file descriptor, that descriptor is gone unless close() failed with EBADF. So if what you've closed is not a "regular" file, you can't even re-open it to retry; and what writes would you even have to retry anyhow? Not to mention the position to seek to.
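A quick way to see that the number is dead after the first close(), sketched in Python. `close_once` is a made-up helper; the pipe is only there to have a descriptor to close:

```python
import errno
import os


def close_once(fd):
    """Close `fd` and return the errno from close(), or 0 on success. Never retries."""
    try:
        os.close(fd)
    except OSError as exc:
        return exc.errno
    return 0


r, w = os.pipe()
print(close_once(w))                 # 0
print(close_once(w) == errno.EBADF)  # True: the number no longer names anything
os.close(r)
```

In a multi-threaded program a second close() is worse than useless: if another thread has opened something in between, the number may already refer to an unrelated file.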
What happens if I pipe the output to a file? Any fsync being called then?
Up to the process doing the piping. The stdout file descriptor can be a normal file, network socket, device, or exotic matter (memfd, epoll, netlink, ...). The article only touched the "normal file" case (nfs, cifs, fuse, ...).
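For illustration, a process can ask what kind of object a descriptor refers to via fstat(). A minimal Python sketch, assuming a POSIX system, with `describe_fd` as a hypothetical helper name:

```python
import os
import stat


def describe_fd(fd):
    """Return a rough description of what kind of object `fd` refers to."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISFIFO(mode):
        return "pipe or FIFO"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"
```

`describe_fd(1)` describes the current stdout. Only some of these kinds support fsync() at all; on a pipe or socket it typically fails with EINVAL.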
I request another article from the author, about "Detecting writes to /dev/null". In this case, even close() won't return an error yet the data your application has produced will be gone! Forever (a very long time)! Oh the horrors. Should we, the program authors, detect it as well? Probably not, but let's ponder it for another 10k symbols.
Or how about our users using, not NFS, but one of those fishy Chinese flash drives that cost $0.50 but claim to have 128 TiB of storage? So many things that we can check for, as the application developers.
This is a very well-argued and well-presented article that agrees with your sentiment and ends with:
"What I can say at this point is that I personally am not going to embark on a quest to get application programmers to check for errors from close.
And so, my conclusion here is that, no, our hello world program does not have a bug. It's fine. It's just fine."
I think many programmers have felt the exasperation you so well expressed in your comment.