stingray's blog

By stingray, 14 years ago, In English

The memory accounting is the part of programming contest environment which is almost never done right. And even if it's done right, it doesn't seems so to the extent that people will complain about it. A lot.

So let's start with choosing our platform first (Windows), and briefly describe the most common mistakes people do (aside from saying "contest platform is simple, I'll write one in no time").

Chapter 1, in which we learn how to do GetProcessMemoryInfo

CreateProcess (and similar functions) give us this handle, which we can later use in GetProcessMemoryInfo call, which will in turn give us PROCESS_MEMORY_COUNTERS structure filled with useful data. So, the very first mistake would be to use PeakWorkingSetSize field to measure the memory usage.

This is plain wrong.

"Working set" is the set of physical pages in memory that exist in the address space of the process. So in other words, if the process allocated tons of memory but it's all swapped out - you won't see it. If the process did not allocate anything but is for some reason linked to a DLL with huge .DATA segment, and touched lots of pages - you will probably see it. Also, Windows VM tries to do tricks to a working set - you can put arbitrary per-process limits on it, and of course when memory pressure is above some threshold your data can be swapped out and pages will be discarded from it. Bottom line, measuring process memory usage by looking at the working set will get you nowhere.

Chapter 2, in which we learn to access the correct fields in PROCESS_MEMORY_COUNTERS

Windows memory management has this wonderful no-overcommit property, which means that if you enabled pagefile, then all memory you allocate will be backed by it. Or, from a different angle, every page in your address space is either represented by kernel (and you don't care), backed by some file somewhere (like your binary image), or backed by a file with handle -1 (that's somewhere in pagefile). So here's what you need to look at: PeakPagefileUsage.

This will give you the correct value for anonymous memory (backed by the pagefile). But you still have to guard yourself against cheaters, who will create a file and use CreateFileMapping to obtain some memory which will be not accounted as PagefileUsage.

Chapter 3, in which we learn Java

This chapter is not unique to Java. It is common to any implementation that maintains its own heap. Low-level windows memory management (VirtualAlloc and friends) are giving out memory in pages. So, to allocate arbitrary structures, everything uses some kind of heap manager. Windows has its own one (HeapAlloc and friends), but some other languages and runtimes may decide to implement theirs. For a lazy heap implementation, or some other kind which tries to reduce fragmentation (like the firefox one) the observed memory usage can be much worse than the real one (from the high-level programmer point of view).

This is especially true with Java as it's not being run directly on the platform and includes its own sophisticated memory management with garbage collection and whatnot. So, you cannot trust the numbers Windows gives you for Java binaries. What you have to do instead is to limit size of the Java heap to your memory limit, and catch the OutOfMemory exception for solutions.

As an additional improvement, you may want to create a JVM agent and inject it into your process, to monitor and collect the memory usage of your Java process inside the JVM, because windows will always report that it used all memory you gave it.

Chapter 4, in which we don't want job objects

Job objects are nice when you want to isolate your processes from each other and prevent the whole system from exploding. Unfortunately they are not that good in our case. What they do when you limit the memory size is they deny memory allocations by forcing alloc and realloc (and mmap fwiw) functions to return NULL. And in modern computing, almost nothing handles this properly, including Windows itself. So, if you put a limit too tight, and your application has a huge data segment which inflates when system is initially trying to mmap your binary, the app won't even load, CreateProcess will fail with a bizarre error message in MessageBox (obviously we want someone to sit in front of the server every time we run out of memory quota, thank you Microsoft). So, while we're OK with using job objects to isolate UI elements and do a post-execution accounting of launched processes, we don't really want to use memory limits there.

Conclusion, in which we write contest platform in no time

Thanks to this writeup, we have successfully clarified this tricky and obscure thing - memory limits. So, I guess, we're quite ready to write our own contester? Of course, it will be Windows-only, will launch everything as an admin user, won't do process time accounting, ...

Or you can watch this blog for a next one, in which we'll discuss one of:

  • memory accounting on Linux
  • CPU time accounting in Windows and Linux
  • process isolation in Windows and Linux
  • something else. 
  • Vote: I like it
  • +12
  • Vote: I do not like it