I think you should be able to get rid of most of the undocumented API usage with the newer CreateFileMapping2/MapViewOfFile3 APIs. Though that does require a higher minimum OS version and the crucial NtExtendSection function still has no documented equivalent as far as I can tell, so it's kind of a wash.
Also, you can link against ntdll.lib directly. Manually calling GetProcAddress for a few functions isn't a tragedy by any means, but in this case, why bother?
Great article nonetheless!
A memory-mapped file can be used to store complex object-oriented structures if you make use of 'relative' pointers, where you have to add the address of the pointer to the pointer to get to the object you are pointing to. I once used this method to persist a complex data structure without having to write serialization code.
A long time ago, I implemented some C++ classes that would hide most of the additional work and also took care of allocating new objects inside the memory-mapped file. See: https://www.iwriteiam.nl/D0205.html#13MMF (Note that this implementation actually uses a slightly different approach, where pointers are relative to the first position of the file. It has the limitation that you can only open one such store as a memory-mapped file at a time. It was only later that I realized it was possible to do without the base offset. I never got around to rewriting all the code.)
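To make the 'relative' pointer idea concrete, here is a minimal sketch of the self-relative variant (not the code from the link above). `RelPtr`, `Node`, and `sum_after_relocation` are hypothetical names made up for illustration, and a plain `memcpy` to a second buffer stands in for the file being mapped at a different address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical self-relative pointer: stores the byte distance from its own
// address to the target, so the link survives the whole block being moved.
template <typename T>
class RelPtr {
    std::ptrdiff_t offset_;  // 0 encodes null, so a RelPtr cannot point at itself

public:
    RelPtr() = default;
    // A copy would live at a different address, so the stored offset would
    // resolve to a different target; forbid copying in this sketch.
    RelPtr(const RelPtr&) = delete;
    RelPtr& operator=(const RelPtr&) = delete;

    void set(T* target) {
        offset_ = target
            ? static_cast<std::ptrdiff_t>(reinterpret_cast<std::uintptr_t>(target) -
                                          reinterpret_cast<std::uintptr_t>(this))
            : 0;
    }
    T* get() const {
        if (offset_ == 0) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(this) + offset_);
    }
    T* operator->() const { return get(); }
    T& operator*() const { return *get(); }
};

struct Node {
    int value;
    RelPtr<Node> next;
};

// Builds a two-node list in `src`, copies the raw bytes into `dst` (standing
// in for a mapping at another address), and sums the list through the copy.
int sum_after_relocation() {
    constexpr std::size_t kBytes = 2 * sizeof(Node);
    alignas(Node) unsigned char src[kBytes];
    alignas(Node) unsigned char dst[kBytes];

    Node* a = new (src) Node();                 // value-init zeroes the offset
    Node* b = new (src + sizeof(Node)) Node();
    a->value = 1;
    b->value = 2;
    a->next.set(b);                             // b->next stays null

    std::memcpy(dst, src, kBytes);
    const Node* head = reinterpret_cast<const Node*>(dst);
    // The copied link must resolve inside dst, not back into src.
    assert(head->next.get() == reinterpret_cast<const Node*>(dst + sizeof(Node)));

    int sum = 0;
    for (const Node* n = head; n != nullptr; n = n->next.get()) sum += n->value;
    return sum;
}
```

Strictly speaking, reading objects out of copied bytes like this leans on behavior the C++ standard does not fully bless for non-trivial types, which is the same compromise any mapped-struct scheme makes.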
Boost.Container [1] has reimplementations of the STL containers that use offsets to be compatible with memory-mapped files. Boost.Interprocess [2] has some other useful types, such as smart pointers that are compatible with memory-mapped files, along with platform-independent APIs for handling them.
[1] https://www.boost.org/doc/libs/1_86_0/doc/html/container.htm...
[2] https://www.boost.org/doc/libs/1_86_0/doc/html/interprocess....
> complex object oriented structures if you make use of 'relative' pointers
You just described the original MS Office file formats (and gave many people who ever tried to parse them a PTSD shock).
Most STL classes will work just fine with smart pointers. For example, std::vector takes a custom allocator. You can have the allocator's pointer type be a custom pointer class. The pointer class is a wrapper over an offset that is the same size as the native word.
Then the operator-> is just (this + offset).
I've used this approach successfully in many projects. If you're worried about differing STL implementations (probably a reasonable worry here), then use any of the Boost containers.
> A legitimate problem here is the case where you need to read from or write to many locations at once. With the memory-mapped scheme as described, you can only issue as many concurrent page faults as you have threads in the program.
> Unfortunately, I don’t have great answers here, and view this as a legitimate use case that warrants a dedicated code path using other mechanisms at your disposal.
Right. I think it’s similar on Linux. mmap makes sense until the data size is too large and TLB misses start to add up.
Why would the TLB misses be any different than if you read the file into memory?
The article is describing a separate problem where you can’t issue concurrent reads, although that doesn’t seem true if you make use of madvise.
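For what it's worth, here is roughly what I mean by using madvise, as a sketch assuming Linux/POSIX. `Range`, `prefetch_ranges`, and `demo_prefetch` are hypothetical names, and the temp file is only there to make the example self-contained. `MADV_WILLNEED` is advice only: the kernel may ignore it, so this is not guaranteed to give you concurrent reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical description of one page-aligned region inside a mapping.
struct Range {
    std::size_t offset;  // must be a multiple of the page size
    std::size_t length;
};

// Hints the kernel to start reading every range. madvise returns without
// waiting for the I/O, so one thread can queue many hints before touching
// any of the pages. Returns how many hints the kernel accepted.
std::size_t prefetch_ranges(void* base, const std::vector<Range>& ranges) {
    std::size_t accepted = 0;
    for (const Range& r : ranges) {
        char* start = static_cast<char*>(base) + r.offset;
        if (madvise(start, r.length, MADV_WILLNEED) == 0) ++accepted;
    }
    return accepted;
}

// Demo: write a 4-page temp file, map it, hint two separate pages, then
// read one byte from each. Returns the sum of those bytes, or -1 on error.
int demo_prefetch() {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t size = 4 * page;

    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);  // the file disappears once the descriptor is closed

    std::vector<char> data(size, 0);
    data[0] = 5;
    data[2 * page] = 7;
    if (write(fd, data.data(), size) != static_cast<ssize_t>(size)) {
        close(fd);
        return -1;
    }

    void* base = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping keeps the file contents reachable
    if (base == MAP_FAILED) return -1;

    // The result is ignored on purpose: the hints are best-effort, and the
    // reads below are correct whether or not the kernel acted on them.
    prefetch_ranges(base, {{0, page}, {2 * page, page}});

    const char* bytes = static_cast<const char*>(base);
    int sum = bytes[0] + bytes[2 * page];
    munmap(base, size);
    return sum;
}
```

Whether the hinted pages are actually resident by the time you touch them depends on the kernel and the storage, so a later access can still fault and block.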
Article assumes that reads/writes can only ever fail if working with remote files.