I have a program that takes an input and output file as arguments. I want to make sure that the user does not specify both as the same file, because this would result in trouble. However, a simple strcmp() would only catch the most trivial cases. For example, "foo.txt", "./foo.txt" and "/full/path/foo.txt" would still be the same file, even though the strings are obviously different. There are many more examples, of course...
So, I want to convert the file names to absolute paths before comparing them. On Windows, I could use GetFullPathName() for that purpose. Is there an equivalent in the POSIX API on Linux/Unix? Unfortunately, I cannot use realpath(), because it only works with existing files. The output file does not usually exist, though...
Note: GetFullPathName() on Windows does not require the file to exist!
Please also note that the program is written in "plain" C, so no C++ <filesystem> is available. Also, I really want to avoid using 3rd-party libraries, otherwise something like cwalk could probably do the job...
(The best solution that I have come up with, so far, is to manually test whether the path starts with a slash. If it doesn't, then prepend the current working directory, as returned by getcwd() function)
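For illustration, the getcwd() workaround described above might look roughly like the sketch below. The helper name make_absolute() is made up for this example, and the code assumes that PATH_MAX is defined in <limits.h>, which is true on Linux but not guaranteed on every system.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX declarations under strict -std=c99/c11 */
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes an absolute version of `path` into `out`.
 * Returns 0 on success, -1 on failure (getcwd() failed or `out` is too small).
 * It does NOT resolve ".", ".." or symlinks. */
int make_absolute(const char *path, char *out, size_t out_size)
{
    char cwd[PATH_MAX];
    int n;

    if (path[0] == '/') {
        /* Already absolute: copy as-is. */
        n = snprintf(out, out_size, "%s", path);
    } else {
        if (getcwd(cwd, sizeof cwd) == NULL)
            return -1;
        /* Avoid a double slash when the working directory is "/". */
        n = snprintf(out, out_size, "%s/%s",
                     strcmp(cwd, "/") == 0 ? "" : cwd, path);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Note that this is purely textual: "foo.txt" and "./foo.txt" still produce different strings, and symlinks are not followed, so comparing the results with strcmp() can still miss identical files.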
After thinking about this a little more, I believe that I can get along with realpath() 💡
If the given input file does not exist, then we are going to fail anyway, because fopen() is going to fail to open the non-existing input file for reading. So, the only relevant case is when the input file does exist. And, in that case, realpath() will be able to resolve the absolute (canonical) path of the input file.
If we already know that the given input file exists, the given output file may still exist or not. If the given output file does exist, realpath() will be able to resolve its absolute (canonical) path, so that we can compare it to the absolute path of the input file; if they are the same, we will detect it! Otherwise, if the given output file does not exist, then it obviously cannot be the same as the existing input file.
I assume there still could be some race conditions, because the realpath() invocation and the following fopen() invocation are not "atomic". But that's probably more a "theoretical" problem...
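A minimal sketch of that reasoning, assuming a realpath() that accepts NULL as its second argument and returns a malloc()'ed buffer (POSIX.1-2008 behaviour). The function name is_same_file() and its return convention are invented for this example.

```c
#define _XOPEN_SOURCE 700  /* expose realpath() under strict -std=c99/c11 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper.
 * Returns  1 if both paths resolve to the same canonical path,
 *          0 if they differ, or if the output does not exist yet,
 *         -1 if the input cannot be resolved or another error occurs. */
int is_same_file(const char *input, const char *output)
{
    char *in_real, *out_real;
    int result;

    in_real = realpath(input, NULL);
    if (in_real == NULL)
        return -1;              /* the later fopen() for reading would fail too */

    out_real = realpath(output, NULL);
    if (out_real == NULL) {
        /* A non-existing output cannot be the existing input. */
        result = (errno == ENOENT) ? 0 : -1;
        free(in_real);
        return result;
    }

    result = (strcmp(in_real, out_real) == 0);
    free(in_real);
    free(out_real);
    return result;
}
```

The caller would refuse to continue when the result is 1. Two hard links to the same file have different canonical paths, so this check would not detect that case.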
To obtain the full path of a file, we use the readlink command. readlink prints the absolute path of a symbolic link, but as a side effect, it also prints the absolute path for a relative path.
There is the readlink program, from GNU Coreutils, that does this. I use it in shell scripts. However, I'm talking about my own program, written in C, here. And, unfortunately, the readlink() syscall, which I could call from my C code, only works with actual symlinks! It fails, with EINVAL, if the given path is not a symlink.
I actually looked at the source code of the readlink program to see what they are doing in order to get an absolute path, but it's quite a complex routine. Certainly not as simple as a single syscall...
Is there really no function in the Linux/Unix API to resolve a given path, existing or non-existing, to an absolute path, in exactly the same way as open() would effectively do?
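The EINVAL behaviour is easy to see with a small wrapper around readlink(). This is only a sketch; the name read_link_target() and its return convention are made up for this example.

```c
#define _XOPEN_SOURCE 700  /* expose readlink() under strict -std=c99/c11 */
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical wrapper around readlink().
 * Returns  1 if `path` is a symlink (its target is stored in `buf`),
 *          0 if `path` exists but is not a symlink (errno == EINVAL),
 *         -1 on any other error, e.g. ENOENT.
 * `size` must be at least 1. */
int read_link_target(const char *path, char *buf, size_t size)
{
    /* readlink() does not NUL-terminate, so leave room for the terminator. */
    ssize_t n = readlink(path, buf, size - 1);

    if (n >= 0) {
        buf[n] = '\0';
        return 1;
    }
    return (errno == EINVAL) ? 0 : -1;
}
```

For a regular file or a directory the wrapper returns 0, which is why readlink() alone cannot be used to make an arbitrary path absolute.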
open() doesn't need to resolve the full path. It just follows the inodes.
I think even .st_ino is only unambiguous within a specific volume
True, but the combination of st_ino and st_dev uniquely identifies the file.
So if you can stat both files then compare the st_ino and st_dev fields.
otherwise, if stat(file1) returned ENOENT then temporarily create it and compare again.
otherwise if stat(file2) returned ENOENT then temporarily create it and compare again.
Since file2 might be a symlink to file1, the "compare again" part means you need to stat both files again.
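The comparison itself could be sketched as follows. The function name same_inode() is invented for this example, and the "temporarily create the missing file" step is deliberately left out; the sketch just reports -1 if either stat() call fails.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper.
 * Returns  1 if both paths refer to the same file (same st_dev and st_ino),
 *          0 if they refer to different files,
 *         -1 if either path cannot be stat()'ed (check errno, e.g. ENOENT). */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;

    /* stat() follows symlinks, so a symlink and its target compare equal. */
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Because this compares the files themselves rather than their names, it also treats two hard links to the same file as identical.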
So, after all, stat() can be used to check whether two existing files are the same (combination of st_ino and st_dev). But the same can be achieved by using realpath() and then comparing the paths. For me, the advantage of realpath() is that Windows has an equivalent _fullpath() function (except that this one handles non-existing files), whereas the Windows version of stat() does not provide meaningful st_ino values.
BTW: My code needs to work on both platforms, Windows and Unix.
BTW²: Does Unix provide meaningful st_ino values for any type of file system?
The "problematic" case is when the file does not exist, because in that case neither stat() nor realpath() will work. But, as pointed out before, in my special situation the given input file always exists, or we will fail anyway! The given output file may exist or not. But, if the output file does not exist, then we are fine. The case I need to catch is when the output file already exists and happens to be the same as the input file.