less_retarded_wiki


Memory Management

In programming, memory management is (unsurprisingly) the act and various techniques of managing the working memory (RAM) of a computer, for example dividing the total physically available memory among multiple memory users such as operating system processes and ensuring they don't illegally access each other's part of memory. The scope of the term may differ depending on context, but tasks falling under memory management may include e.g. memory allocation (finding and assigning blocks of free memory) and deallocation (freeing such blocks), ensuring memory safety, organizing blocks of memory and optimizing memory access (e.g. with caches or data reorganization), memory virtualization and related tasks such as address translation, handling out-of-memory conditions etc.

Memory management can be handled at different levels: hardware units such as the MMU and CPU caches exist to perform certain time-critical memory-related tasks (such as address translation) quickly, the operating system may help with memory management (e.g. implement virtual memory and offer syscalls for dynamic allocation and deallocation of memory), a programming language may do some automatic memory management (e.g. garbage collection or handling the call stack) and the programmer himself may do his own memory management (e.g. deciding between static and dynamic allocation or choosing the size of a dynamically allocated chunk).

Why all this fuss? As a newbie programmer who only works with simple variables and high level languages like Python that do everything for you, you don't need to do much memory management yourself, but when working with data whose size may wildly differ and is not known in advance (e.g. files), someone has to handle e.g. the possibility of the data on disk not fitting into the RAM currently allocated for your program, or -- if the data fits -- there may not be a big enough contiguous chunk of memory for it. If we don't know how much memory a process will need, how much memory do we give it (too little and it may not be enough, too much and there will not be enough memory for others)? Someone has to prevent memory leaks so that your computer doesn't run out of memory due to bugs in programs. With many processes running simultaneously on a computer someone has to keep track of which process uses which part of memory and ensure collisions (one process overwriting another process's memory) don't happen, and someone needs to make sure that if bad things happen (such as a process trying to write to memory that doesn't belong to it), they don't have catastrophic consequences like crashing or exploding the system.

Memory Management In C

In C -- a low level language -- you need to do a lot of manual memory management and there is a big danger of fucking up, especially with dynamic allocation -- C won't hold your hand (but as a reward your program will be fast and efficient), there is no uber memory safety. There is no automatic garbage collection, i.e. if you allocate memory dynamically, YOU need to keep track of it and manually free it once you're done using it, or you'll end up with a memory leak.
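To illustrate the "YOU need to keep track of it" part, here is a minimal sketch of a function that hands a dynamically allocated block to its caller. The name copyString is made up for this example (it is not a standard function); the point is only that whoever receives the pointer becomes responsible for calling free on it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of str, or NULL if
   the allocation fails. The CALLER owns the result and must free() it,
   otherwise the block leaks. */
char *copyString(const char *str)
{
  size_t size = strlen(str) + 1; // +1 for the terminating zero
  char *copy = malloc(size);

  if (copy)                      // malloc may fail, always check
    memcpy(copy, str, size);

  return copy;
}
```

A caller would do something like char *s = copyString("hello");, check s for NULL, use it, and finally call free(s). Forgetting that last step is exactly the memory leak mentioned above.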

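For a start let's see which kinds of allocation (and their associated parts of memory) there are in C: static allocation (global variables and local variables declared static; their memory is reserved for the whole run of the program and its size is fixed at compile time), automatic allocation (ordinary local variables, which typically live on the call stack and vanish when their function returns) and dynamic allocation (blocks requested at run time from the heap with stdlib functions such as malloc and released with free).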

Rule of thumb: use the simplest thing possible, i.e. static allocation if you can, if not then automatic and only as the last option resort to dynamic allocation. The good news is that you mostly won't need dynamic allocation -- you basically only need it when working with data whose size can potentially be VERY big and is unknown at compile time (e.g. you need to load a WHOLE file AT ONCE which may potentially be VERY big). In other cases you can get away with static allocation (just reserving some reasonable amount of memory in advance and hoping the data fits, e.g. a global array such as int myData[DATA_MAX_SIZE]) or automatic allocation if the data is reasonably small (i.e. you just create a variable size array inside some function that processes the data). If you end up doing dynamic allocation, be careful, but it's not THAT hard to do it right (just pay more attention) and there are tools (e.g. valgrind) to help you find memory leaks. However by the principles of good design you should avoid dynamic allocation if you can, not only because of the potential for errors and worse performance, but most importantly to avoid dependencies and complexity.
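The "reserve in advance and hope the data fits" approach can be made a bit less hopeful by checking the size before copying. The following is only a sketch: DATA_MAX_SIZE and myData come from the text above (here with unsigned char elements instead of int, and an assumed limit of 1024), while storeData and myDataSize are names invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define DATA_MAX_SIZE 1024 // assumed limit, change and recompile if needed

unsigned char myData[DATA_MAX_SIZE]; // statically allocated, no malloc needed
size_t myDataSize = 0;               // how much of myData is currently used

/* Hypothetical helper: copies size bytes into the static buffer.
   Returns 1 on success, 0 if the data doesn't fit (the buffer is then
   left untouched instead of being overflowed). */
int storeData(const unsigned char *data, size_t size)
{
  if (size > DATA_MAX_SIZE)
    return 0;

  memcpy(myData, data, size);
  myDataSize = size;

  return 1;
}
```

The program then has to decide what to do when storeData returns 0 (refuse the input, process it in pieces etc.), but it never writes past the end of the array.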

For pros: you can also create your own kind of pseudo dynamic allocation in pure C if you really want to avoid using stdlib or can't use it for some reason. The idea is to allocate a big chunk of memory statically (e.g. global unsigned char myHeap[MY_HEAP_SIZE];) and then create functions for allocating and freeing blocks of this static memory (e.g. myAlloc and myFree with the same signatures as malloc and free). This allows you to use memory more efficiently than if you just dumbly (is it a word?) preallocate everything statically, i.e. you may need less total memory; this may be useful e.g. on embedded. Yet another uber hack to "improve" this may be to allocate the "personal heap" on the stack instead of statically, i.e. you create something like a global pointer unsigned char *myHeapPointer; and a global variable unsigned int myHeapSize;, then somewhere at the beginning of main you compute the size myHeapSize and then create a local array myHeap[myHeapSize], then finally set the global pointer to it as myHeapPointer = myHeap; the rest remains the same (your allocation function will access the heap via the global pointer). Just watch out for reinventing wheels, bugs and that you don't actually end up with a worse mess than if you took a simpler approach. Hell, you might even try to write your own garbage collection and array bound checking and whatnot, but then why not just fuck it and use an already existing abomination like Java? :)
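One possible way such a "personal heap" might look is sketched below. It is a simple first-fit allocator, one of many possible designs and certainly not the only or best one. myAlloc, myFree, myHeap and MY_HEAP_SIZE are the names used in the text; the Block type, the 4096 byte size and the myHeapReady flag are assumptions of this sketch. Note that the heap is declared as an array of Block units rather than raw unsigned char so that returned pointers are suitably aligned (max_align_t requires C11).

```c
#include <stddef.h>

#define MY_HEAP_SIZE 4096 // assumed size in bytes, adjust to your needs

/* Every block starts with this header. The union with max_align_t makes
   the header (and the payload right after it) properly aligned. */
typedef union Block
{
  struct
  {
    size_t units; // payload size, counted in Block-sized units
    int used;     // 1 if allocated, 0 if free
  } info;
  max_align_t align;
} Block;

#define MY_HEAP_UNITS (MY_HEAP_SIZE / sizeof(Block))

static Block myHeap[MY_HEAP_UNITS]; // the statically allocated "heap"
static int myHeapReady = 0;

void *myAlloc(size_t size)
{
  if (!myHeapReady) // first call: the whole heap is one big free block
  {
    myHeap[0].info.units = MY_HEAP_UNITS - 1;
    myHeap[0].info.used = 0;
    myHeapReady = 1;
  }

  if (size == 0 || size > MY_HEAP_SIZE)
    return NULL;

  size_t units = (size + sizeof(Block) - 1) / sizeof(Block); // round up
  size_t i = 0;

  while (i < MY_HEAP_UNITS) // walk the blocks, take the first that fits
  {
    Block *b = &myHeap[i];

    if (!b->info.used)
    {
      size_t next = i + 1 + b->info.units;

      // merge with free blocks that directly follow (lazy coalescing)
      while (next < MY_HEAP_UNITS && !myHeap[next].info.used)
      {
        b->info.units += 1 + myHeap[next].info.units;
        next = i + 1 + b->info.units;
      }

      if (b->info.units >= units)
      {
        if (b->info.units > units) // split the rest off as a free block
        {
          Block *rest = &myHeap[i + 1 + units];
          rest->info.units = b->info.units - units - 1;
          rest->info.used = 0;
          b->info.units = units;
        }

        b->info.used = 1;
        return b + 1; // payload starts right after the header
      }
    }

    i += 1 + b->info.units;
  }

  return NULL; // no free block is big enough
}

void myFree(void *ptr)
{
  if (ptr) // just mark as free, merging happens in myAlloc
    ((Block *) ptr - 1)->info.used = 0;
}
```

The sketch does no checking for invalid pointers or double frees and is not thread safe, and the linear walk makes allocation slower as the number of blocks grows, so treat it as a starting point for experiments rather than something to ship.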

Finally let's see a simple code example:

#include <stdio.h>
#include <stdlib.h> // needed for dynamic allocation :(

#define MY_DATA_MAX_SIZE 1024 // if you'll ever need more, just change this and recompile

unsigned char staticMemory[MY_DATA_MAX_SIZE]; // statically allocated array :)
int simpleNumber; // this is also allocated statically :)

void myFunction(int x)
{
  static int staticNumber;  // this is allocated statically, NOT on stack
  int localNumber;          // this is allocated on stack
  int localArray[x + 1];    // variable size array, allocated on stack, hope x isn't too big

  localNumber = 2 * x;      // do something with the memory
  localArray[x] = localNumber;

  if (x > 0)                // recursively call the function
    myFunction(x - 1);
}

int main(void)
{
  int localNumberInMain = 123; // this is also allocated on stack    

  myFunction(10);  // change to 10000000 to see a probable stack overflow

  for (int i = 0; i < 200000; ++i)
  {
    if (i % 1000 == 0)
      printf("i = %d\n",i);

    unsigned char *dynamicMemory = malloc((size_t) (i + 1) * 10000); // oh no, dynamic allocation, BLOAAAT!

    if (!dynamicMemory)
    {
      printf("Couldn't allocate memory, there's probably not enough of it :/\n");
      return 1;
    }

    dynamicMemory[i * 128] = 123; // do something with the memory

    free(dynamicMemory); // if not done, memory leak occurs! try to remove this and see :)         
  }

  return 0;
}

Powered by nothing. All content available under CC0 1.0 (public domain). Send comments and corrections to drummyfish at disroot dot org.