Operating systems read more from disk than a program actually requests, because a program is likely to need nearby information in the future. In my application, when I fetch an item from disk, I would like to show an interval of information around the element. There's a trade-off between how much information I request and show, and speed. However, since the OS already reads more than what I requested, accessing the bytes that are already in memory is free. What API can I use to find out what's in the OS caches?
Alternatively, I could use memory mapped files. In that case, the problem reduces to finding out whether a page is swapped to disk or not. Can this be done in any common OS?
EDIT: Related paper http://www.azulsystems.com/events/mspc_2008/2008_MSPC.pdf
mmap() the file, then use the mincore() function to determine which pages are resident. From the man page:
int mincore(void *addr, size_t length, unsigned char *vec);
mincore() returns a vector that indicates whether pages of the calling process's virtual memory are resident in core (RAM), and so will not cause a disk access (page fault) if referenced. The kernel returns residency information about the pages starting at the address addr, and continuing for length bytes.
There's of course a race condition here: mincore() may tell you that a page is resident, but it might then be swapped out just before you access it. C'est la vie.
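As a rough illustration of the mmap() plus mincore() approach, here is a minimal sketch that counts how many pages of a file are currently resident. It assumes Linux and the glibc signature quoted from the man page (other systems declare vec differently), and the helper name count_resident_pages is hypothetical, not part of any library.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* asks glibc to declare mincore() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of resident pages of `path`,
 * or -1 on error (including an empty file, which cannot be mapped).
 * On success, *total_pages receives the number of pages in the file. */
long count_resident_pages(const char *path, long *total_pages)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t len = (size_t)st.st_size;
    size_t npages = (len + (size_t)page - 1) / (size_t)page;

    /* Mapping the file does not read it; pages are faulted in on access. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    /* mincore() fills one byte per page; the low bit means "resident". */
    unsigned char *vec = malloc(npages);
    long resident = -1;
    if (vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            if (vec[i] & 1)
                resident++;
    }

    free(vec);
    munmap(map, len);
    if (resident >= 0)
        *total_pages = (long)npages;
    return resident;
}
```

A caller might compare the returned count against *total_pages to decide how much surrounding data to show. The result is only a snapshot: as noted above, a page reported as resident can be evicted before it is actually read.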
"What API can I use to find out what's in the OS caches?"
There's certainly no standard way to do this for any POSIX system, and I'm not aware of any non-standard way specific to Linux. The only thing you can know (almost) for sure is that the file system will have read in a multiple of the page size, usually 4 kB. So, if your reads are small, you can know with high probability (although not for sure) that the data in the surrounding page is in memory.
You could, I suppose, do tricksy things like timing how long it took a read system call to complete. If it's fast, that is, 100s of microseconds or less, it was probably a cache hit. Once it gets up to a millisecond or so, it was probably a cache miss. Of course, this doesn't actually help you very much, and it's very, very fragile.
Please note that once the file system has copied the data to user buffers, it is free to immediately discard the buffers holding the data from disk. It probably doesn't do this right away, but you can't tell for sure.
Finally, I second @Karmastan's suggestion: explain the broader end you're trying to achieve. There's likely a way to do it, but the one you've suggested isn't it.
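For what it's worth, the timing trick mentioned above might look roughly like the following sketch. The function names and the 500 microsecond cut-off are assumptions chosen for illustration (the answer only gives "100s of microseconds" versus "a millisecond or so"), and the heuristic is exactly as fragile as described.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for pread() and clock_gettime() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: time a single pread() call.
 * Returns the elapsed time in nanoseconds, or -1 on error. */
long long timed_pread_ns(int fd, void *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    struct timespec t0, t1;
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;

    ssize_t n = pread(fd, buf, len, off);

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0 || n < 0)
        return -1;

    return (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (long long)(t1.tv_nsec - t0.tv_nsec);
}

/* Assumed threshold: anything under 500 microseconds is guessed to be
 * a cache hit. This is a guess, not a guarantee, and the right value
 * depends heavily on the storage device and system load. */
int looks_like_cache_hit(long long elapsed_ns)
{
    return elapsed_ns >= 0 && elapsed_ns < 500000LL;
}
```

Note that the read being timed also pulls the data into the cache, so this can only say something about a read that has already happened, not about what a future read will cost.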