Memory-mapped file
A memory-mapped file is a segment of virtual memory[1] that has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource.
This resource is typically a file that is physically present on disk, but can also be a device, shared memory object, or other resource that the operating system can reference through a file descriptor.
Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.
Benefits
The benefit of memory mapping a file is increased I/O performance, especially on large files. For small files, memory-mapped files can waste slack space[7], because memory maps are always aligned to the page size, which is typically 4 KiB. A 5 KiB file will therefore occupy 8 KiB of mapped space, wasting 3 KiB.
Accessing memory-mapped files is faster than using direct read and write operations for two reasons. First, a system call is orders of magnitude slower than a simple change to a program's local memory. Second, in most operating systems the mapped memory region actually is the kernel's page cache (file cache), meaning that no copies need to be created in user space.
Certain application-level memory-mapped file operations also perform better than their physical file counterparts. Applications can access and update data in the file directly and in-place, as opposed to seeking from the start of the file or rewriting the entire edited contents to a temporary location. Since the memory-mapped file is handled internally in pages, linear file access (as seen, for example, in flat file data storage or configuration files) requires disk access only when a new page boundary is crossed, and can write larger sections of the file to disk in a single operation.
A possible benefit of memory-mapped files is "lazy loading", which allows a program to use small amounts of RAM even for a very large file. Trying to load the entire contents of a file that is significantly larger than the amount of memory available can cause severe thrashing as the operating system reads from disk into memory and simultaneously writes pages from memory back to disk. Memory mapping may not only bypass the page file completely but also allow smaller, page-sized sections to be loaded as data is edited, similarly to the demand paging used for programs.
The memory mapping process is handled by the virtual memory manager, which is the same subsystem responsible for dealing with the page file. Memory mapped files are loaded into memory one entire page at a time. The page size is selected by the operating system for maximum performance. Since page file management is one of the most critical elements of a virtual memory system, loading page sized sections of a file into physical memory is typically a very highly optimized system function.[8]
Drawbacks
The major reason to choose memory mapped file I/O is performance. Nevertheless, there can be tradeoffs. The standard I/O approach is costly due to system call overhead and memory copying. The memory-mapped approach has its cost in minor page faults—when a block of data is loaded in page cache, but is not yet mapped into the process’s virtual memory space. In some circumstances, memory mapped file I/O can be substantially slower than standard file I/O.[10]
Another drawback of memory-mapped files relates to a given architecture's address space: a file larger than the addressable space can have only portions mapped at a time, complicating reading it. For example, a 32-bit architecture such as Intel's IA-32 can only directly address 4 GiB or smaller portions of files. An even smaller amount of addressable space is available to individual programs—typically in the range of 2 to 3 GiB, depending on the operating system kernel. This drawback, however, is virtually eliminated on modern 64-bit architectures.
mmap also tends to be less scalable than standard means of file I/O, since many operating systems, including Linux, have a cap on the number of cores handling page faults. Extremely fast devices, such as modern NVM Express SSDs, can make this overhead a real concern.[11]
I/O errors on the underlying file (e.g., its removable drive is unplugged or optical media is ejected, or the disk is full when writing) while accessing its mapped memory are reported to the application as SIGSEGV or SIGBUS signals on POSIX, and the EXCEPTION_IN_PAGE_ERROR structured exception on Windows. All code accessing mapped memory must be prepared to handle these errors, which do not normally occur when accessing memory.
Only hardware architectures with an MMU can support memory-mapped files. On architectures without an MMU, the operating system can copy the entire file into memory when the request to map it is made, but this is extremely wasteful and slow if only a little bit of the file will be accessed, and can only work for files that will fit in available memory.
Common uses
Perhaps the most common use for a memory-mapped file is the process loader in most modern operating systems (including Microsoft Windows and Unix-like systems). When a process is started, the operating system uses a memory-mapped file to bring the executable file, along with any loadable modules, into memory for execution. Most memory-mapping systems use a technique called demand paging, where the file is loaded into physical memory in subsets (one page each), and only when that page is actually referenced.[12] In the specific case of executable files, this permits the OS to selectively load only those portions of a process image that actually need to execute.
Another common use for memory-mapped files is to share memory between multiple processes. In modern protected mode operating systems, processes are generally not permitted to access memory space that is allocated for use by another process. (A program’s attempt to do so causes invalid page faults or segmentation violations.) There are a number of techniques available to safely share memory, and memory-mapped file I/O is one of the most popular. Two or more applications can simultaneously map a single physical file into memory and access this memory. For example, the Microsoft Windows operating system provides a mechanism for applications to memory-map a shared segment of the system’s page file itself and share data via this section.
mmap and munmap
https://man7.org/linux/man-pages/man2/mmap.2.html
SYNOPSIS
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
int munmap(void *addr, size_t length);
mmap: DESCRIPTION
mmap() creates a new mapping in the virtual address space of the calling process. The starting (virtual) address for the new mapping is specified in addr. The length argument specifies the length of the mapping (which must be greater than 0).
If addr is NULL, then the kernel chooses the (page-aligned) address at which to create the mapping; this is the most portable method of creating a new mapping. If addr is not NULL, then the kernel takes it as a hint about where to place the mapping; on Linux, the kernel will pick a nearby page boundary (but always above or equal to the value specified by /proc/sys/vm/mmap_min_addr) and attempt to create the mapping there. If another mapping already exists there, the kernel picks a new address that may or may not depend on the hint. The address of the new mapping is returned as the result of the call.
The contents of a file mapping (as opposed to an anonymous mapping; see MAP_ANONYMOUS below) are initialized using length bytes starting at offset offset in the file (or other object) referred to by the file descriptor fd. offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).
After the mmap() call has returned, the file descriptor, fd, can
be closed immediately without invalidating the mapping.
The prot argument describes the desired memory protection of the mapping (and must not conflict with the open mode of the file). It is either PROT_NONE or the bitwise OR of one or more of the following flags:
PROT_EXEC
Pages may be executed.
PROT_READ
Pages may be read.
PROT_WRITE
Pages may be written.
PROT_NONE
Pages may not be accessed.
mmap: The flags argument
The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file. This
behavior is determined by including exactly one of the following
values in flags:
MAP_SHARED
Share this mapping. Updates to the mapping are visible to
other processes mapping the same region, and (in the case
of file-backed mappings) are carried through to the
underlying file. (To precisely control when updates are
carried through to the underlying file requires the use of
msync(2).)
MAP_SHARED_VALIDATE (since Linux 4.15)
This flag provides the same behavior as MAP_SHARED except
that MAP_SHARED mappings ignore unknown flags in flags.
By contrast, when creating a mapping using
MAP_SHARED_VALIDATE, the kernel verifies all passed flags
are known and fails the mapping with the error EOPNOTSUPP
for unknown flags. This mapping type is also required to
be able to use some mapping flags (e.g., MAP_SYNC).
MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the
mapping are not visible to other processes mapping the
same file, and are not carried through to the underlying
file. It is unspecified whether changes made to the file
after the mmap() call are visible in the mapped region.
munmap
The munmap() system call deletes the mappings for the specified
address range, and causes further references to addresses within
the range to generate invalid memory references. The region is
also automatically unmapped when the process is terminated. On
the other hand, closing the file descriptor does not unmap the
region.
The address addr must be a multiple of the page size (but length
need not be). All pages containing a part of the indicated range
are unmapped, and subsequent references to these pages will
generate SIGSEGV. It is not an error if the indicated range does
not contain any mapped pages.
Does mmap copy data to the memory?
c - Does mmap really copy data to the memory? - Stack Overflow
The only thing the mmap function really does is change some kernel data structures, and possibly the page table. It doesn't actually put anything into physical memory at all. After you call mmap, the allocated region probably doesn't even point to physical memory: accessing it will cause a page fault. This kind of page fault is transparently handled by the kernel.
mmap() vs. reading blocks
Using mmap() versus reading in blocks via C++'s fstream library. Compare:
- A call to mmap has more overhead than read (just like epoll has more overhead than poll, which has more overhead than read). Changing virtual memory mappings is a quite expensive operation on some processors, for the same reasons that switching between different processes is expensive.
- The I/O system can already use the disk cache, so if you read a file, you'll hit the cache or miss it no matter what method you use.
However,
- Memory maps are generally faster for random access, especially if your access patterns are sparse and unpredictable.
- Memory maps allow you to keep using pages from the cache until you are done. This means that if you use a file heavily for a long period of time, then close it and reopen it, the pages will still be cached. With read, your file may have been flushed from the cache ages ago. This does not apply if you use a file and immediately discard it. (If you try to mlock pages just to keep them in cache, you are trying to outsmart the disk cache, and this kind of foolery rarely helps system performance.)
- Reading a file directly is very simple and fast.
The discussion of mmap/read reminds me of two other performance discussions:
- Some Java programmers were shocked to discover that nonblocking I/O is often slower than blocking I/O, which makes perfect sense if you know that nonblocking I/O requires making more syscalls.
- Some other network programmers were shocked to learn that epoll is often slower than poll, which makes perfect sense if you know that managing epoll requires making more syscalls.
Conclusion
Use memory maps if you access data randomly, keep it around for a long time, or if you know you can share it with other processes (MAP_SHARED isn't very interesting if there is no actual sharing). Read files normally if you access data sequentially or discard it after reading. And if either method makes your program less complex, do that. For many real-world cases there is no sure way to show one is faster without testing your actual application, NOT a benchmark.
When should I use mmap for file access?
POSIX environments provide at least two ways of accessing files. There are the standard system calls open(), read(), write(), and friends, but there is also the option of using mmap() to map the file into virtual memory. When is it preferable to use one over the other?
Benefits and when to use
mmap is great if you have multiple processes accessing data in a read-only fashion from the same file. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
mmap also allows the operating system to optimize paging operations. For example, consider two programs: program A, which reads a 1 MB file into a buffer created with malloc, and program B, which mmaps the 1 MB file into memory. If the operating system has to swap part of A's memory out, it must write the contents of the buffer to swap before it can reuse the memory. In B's case, any unmodified mmap'd pages can be reused immediately because the OS knows how to restore them from the existing file they were mmap'd from. (The OS can detect which pages are unmodified by initially marking writable mmap'd pages as read-only and catching seg faults, similar to a copy-on-write strategy.)
mmap is also useful for inter-process communication. You can mmap a file as read/write in the processes that need to communicate and then use synchronization primitives in the mmap'd region (this is what the MAP_HASSEMAPHORE flag is for).
Awkwardness
One place mmap can be awkward is if you need to work with very large files on a 32-bit machine. This is because mmap has to find a contiguous block of addresses in your process's address space that is large enough to fit the entire range of the file being mapped. This can become a problem if your address space becomes fragmented, where you might have 2 GB of address space free, but no individual range of it can fit a 1 GB file mapping. In this case you may have to map the file in smaller chunks than you would like to make it fit.
Another potential awkwardness with mmap as a replacement for read/write is that you have to start your mapping on offsets of the page size. If you just want to get some data at offset X, you will need to fix up that offset so it's compatible with mmap.
And finally, read/write are the only way you can work with some types of files. mmap can't be used on things like pipes and ttys.