memory allocation | Bulldozer00's Blog

Stack, Heap, Pool

September 14, 2015 bulldozer00 25 comments

Update 10/7/15: Because this post lacked a context-setting introduction, I wrote a prequel for it after the fact. Please read the prequel before reading this post.

Three Ways

The figure below shows three ways of allocating memory from within a C++ application: stack, heap, custom written pool.

Comparing Performance

The code below measures the performance of the three allocation methods. It computes and prints the time to allocate 1000, 1MB objects from: the stack, the C++ free store, and a custom-written memory pool (the implementation of Pool appears at the end of this post).

#include <chrono>
#include <iostream>
#include <array>
#include "Pool.h"

class BigObject {

public:

  //Each BigObject occupies 1MB of RAM
  static const int32_t NUM_BYTES{1000 *1024};

  void reset() {} //Required by the Pool class template

  void setFirstChar(char c) {byteArray[0] = c;}

private:

  std::array<char, NUM_BYTES> byteArray{};

};

//Compare the performance of three types of memory allocation:
//Stack, free store (heap), custom-written pool
int main() {

  using namespace std::chrono; //Reduce verbosity of subsequent code
  constexpr int32_t NUM_ALLOCS{1000};
  void processObj(BigObject* obj);

  //Time heap allocations
  auto ts = system_clock::now();
  for(int32_t i=0; i<NUM_ALLOCS; ++i) {
    BigObject* obj{new BigObject{}};
    //Do stuff.......
    processObj(obj);
  }
  auto te = system_clock::now();
  std::cout << "Heap Allocations took "
            << duration_cast<milliseconds>(te-ts).count()
            << " milliseconds"
            << std::endl;

  //Time stack allocations
  ts = system_clock::now();
  for(int32_t i=0; i<NUM_ALLOCS; ++i) {
    BigObject obj{};
    //Do stuff.......
    processObj(&obj);
  } //obj is auto-destroyed here
  te = system_clock::now();
  std::cout << "Stack Allocations took "
            << duration_cast<milliseconds>(te-ts).count()
            << " milliseconds"
            <<  std::endl;

  //Pool construction time
  ts = system_clock::now();
  Pool<BigObject, NUM_ALLOCS> objPool;
  te = system_clock::now();
  std::cout << "Pool construction took "
            << duration_cast<milliseconds>(te-ts).count()
            << " milliseconds"
            <<  std::endl;

  //Time pool allocations
  ts = system_clock::now();
  for(int32_t i=0; i<NUM_ALLOCS; ++i) {
    BigObject* poolObj{objPool.acquire()};
    //Do Stuff.......
    processObj(poolObj);
  }
  te = system_clock::now();
  std::cout << "Pool Allocations took "
            << duration_cast<milliseconds>(te-ts).count()
            << " milliseconds";

}

void processObj(BigObject* obj) {

  obj->setFirstChar('a');

}

The results for a typical program run (gcc 4.9.2, cygwin/Win10, compiler) are shown as follows:

As expected, stack allocation is much faster than heap allocation (3X in this case) because there is no such thing as a “fragmented” stack. The interesting result is how much faster custom pool allocation is than either of the two “standard” out of the box methods. Since each object is 1MB in size, and all of the memory managed by the pool is allocated only once (during construction of the pool object), dynamically acquiring and releasing pointers to the pre-constructed pool objects during runtime is blazingly fast compared to creating a lumbering 1MB object every time the user code needs one.

After you peruse the Pool implementation that follows, I’d love to hear your comments on this post. I’d also appreciate it if you could try to replicate the results I got.

The Pool Class Template – std::array Implementation

Here is my implementation of the Pool class used in the above test code. Note that the Pool class does not use any dynamically resizable STL containers in its implementation. It uses the smarter C++ implementation of the plain ole statically allocated C array. Some safety-critical domains require that no dynamic memory allocation is performed post-initialization – all memory must be allocated either at compile time or upon program initialization.

//Pool.h
#ifndef POOL_H_
#define POOL_H_

#include <array>
#include <algorithm>
#include <memory>

// The OBJ template argument supplied by the user
// must have a publicly defined default ctor and a publicly defined
// OBJ::reset() member function that cleans the object innards so that it
// can be reused again (if needed).
template<typename OBJ, int32_t NUM_OBJS>
class Pool {

public:

  //RAII: Fill up the _available array with newly allocated objects.
  //Since all objects are available for the user to "acquire"
  //on initialization, fill up the _inUse array will nullptrs.
  Pool(){

    //Fill the _available array with default-constructed objects
    std::for_each(std::begin(_available), std::end(_available),
        [](OBJ*& mem){mem = new OBJ{};});

    //Since no objects are currently in use, fill the _inUse array 
    //with nullptrs
    std::for_each(std::begin(_inUse), std::end(_inUse),
        [](OBJ*& mem){mem = nullptr;});

  }

  //Allows a user to acquire an object to manipulate as she wishes
  OBJ* acquire() {

    //Try to find an available object
    auto iter = std::find_if(std::begin(_available), std::end(_available),
        [](OBJ* mem){return (mem not_eq nullptr);});

    //Is the pool exhausted?
    if(iter == std::end(_available)) {

      return nullptr; //Replace this with a "throw" if you need to.

    }
    else {

      //Whoo hoo! An object is available for usage by the user
      OBJ* mem{*iter};

      //Mark it as in-use in the _available array
      *iter = nullptr;

      //Find a spot for it in the _inUse list
      iter = std::find(std::begin(_inUse), std::end(_inUse), nullptr);
  
      //Insert it into the _inUse list. We don't
      //have to check for iter == std::end(_inUse)
      //because we are fully in control!
      *iter = mem;

      //Finally, return the object to the user
      return mem;

    }

  }

  //Returns an object to the pool
  void release(OBJ* obj) {

    //Ensure that the object is indeed in-use. If
    //the user gives us an address that we don't own,
    //we'll have a leak and bad things can happen when we
    //delete the memory during Pool destruction.
    auto iter = std::find(std::begin(_inUse), std::end(_inUse), obj);

    //Simply return control back to the user
    //if we didn't find the object in our
    //in-use list
    if(iter == std::end(_inUse)) {

        return;

    }
    else {

      //Clear out the innards of the object being returned
      //so that the next user doesn't have to "remember" to do it.
      obj->reset();

      //temporarily hold onto the object while we mark it
      //as available in the _inUse array
       OBJ* mem{*iter};

       //Mark it!
       *iter = nullptr;

       //Find a spot for it in the _available list
       iter = std::find(std::begin(_available), std::end(_available), nullptr);

       //Insert the object's address into the _available array
       *iter = mem;
    }

  }

  ~Pool() {

    //Cleanup after ourself: RAII
    std::for_each(std::begin(_available), std::end(_available),
        std::default_delete<OBJ>());

    std::for_each(std::begin(_inUse), std::end(_inUse),
        std::default_delete<OBJ>());

  }

private:

  //Keeps track of all objects available to users
  std::array<OBJ*, NUM_OBJS> _available;

  //Keeps track of all objects in-use by users
  std::array<OBJ*, NUM_OBJS> _inUse;

};

#endif /* POOL_H_ */

The Pool Class Template – std::multiset Implementation

For grins, I also implemented the Pool class below using std::multiset to keep track of the available and in-use objects. For large numbers of objects, the acquire()/release() calls will most likely be faster than their peers in the std::array implementation, but the implementation violates the “no post-intialization dynamically allocated memory allowed” requirement – if you have one.

Since the implementation does perform dynamic memory allocation during runtime, it would probably make more sense to pass the size of the pool via the constructor rather than as a template argument. If so, then functionality can be added to allow users to resize the pool during runtime if needed.

//Pool.h
#ifndef POOL_H_
#define POOL_H_

#include <algorithm>
#include <set>
#include <memory>

// The OBJ template argument supplied by the user
// must have a publicly defined default ctor and a publicly defined
// OBJ::reset() member function that cleans the object innards so that it
// can be reused again (if needed).
template<typename OBJ, int32_t NUM_OBJS>
class Pool {

public:

  //RAII: Fill up the _available set with newly allocated objects.
  //Since all objects are available for the user to "acquire"
  //on initialization, fill up the _inUse set will nullptrs.
  Pool(){

    for(int32_t i=0; i<NUM_OBJS; ++i) {

      _available.insert(new OBJ{});
      _inUse.insert(nullptr);

    }

  }

  //Allows a user to acquire an object to manipulate as she wishes
  OBJ* acquire() {

    //Try to find an available object
    auto iter = std::find_if(std::begin(_available), std::end(_available),
        [](OBJ* mem){return (mem not_eq nullptr);});

    //Is the pool exhausted?
    if(iter == std::end(_available)) {

      return nullptr; //Replace this with a "throw" if you need to.

    }
    else {

      //Whoo hoo! An object is available for usage by the user
      OBJ* mem{*iter};

      //Mark it as in-use in the _available set
      _available.erase(iter);
      _available.insert(nullptr);

      //Find a spot for it in the _inUse set
      iter = std::find(std::begin(_inUse), std::end(_inUse), nullptr);

      //Insert it into the _inUse set. We don't
      //have to check for iter == std::end(_inUse)
      //because we are fully in control!
      _inUse.erase(iter);
      _inUse.insert(mem);

      //Finally, return the object to the user
      return mem;

    }

  }

  //Returns an object to the pool
  void release(OBJ* obj) {

    //Ensure that the object is indeed in-use. If
    //the user gives us an address that we don't own,
    //we'll have a leak and bad things can happen when we
    //delete the memory during Pool destruction.
    auto iter = std::find(std::begin(_inUse), std::end(_inUse), obj);

    //Simply return control back to the user
    //if we didn't find the object in our
    //in-use set
    if(iter == std::end(_inUse)) {

        return;

    }
    else {

      //Clear out the innards of the object being returned
      //so that the next user doesn't have to "remember" to do it.
      obj->reset();

      //temporarily hold onto the object while we mark it
      //as available in the _inUse array
       OBJ* mem{*iter};

       //Mark it!
       _inUse.erase(iter);
       _inUse.insert(nullptr);

       //Find a spot for it in the _available list
       iter = std::find(std::begin(_available), std::end(_available), nullptr);

       //Insert the object's address into the _available array
       _available.erase(iter);
       _available.insert(mem);
    }

  }

  ~Pool() {

    //Cleanup after ourself: RAII
    std::for_each(std::begin(_available), std::end(_available),
        std::default_delete<OBJ>());

    std::for_each(std::begin(_inUse), std::end(_inUse),
        std::default_delete<OBJ>());

  }

private:

  //Keeps track of all objects available to users
  std::multiset<OBJ*> _available;

  //Keeps track of all objects in-use by users
  std::multiset<OBJ*> _inUse;

};

#endif /* POOL_H_ */

Unit Testing

In case you were wondering, I used Phil Nash’s lightweight Catch unit testing framework to verify/validate the behavior of both Pool class implementations:

TEST_CASE( "Allocations/Deallocations", "[Pool]" ) {

    constexpr int32_t NUM_ALLOCS{2};
    Pool<BigObject, NUM_ALLOCS> objPool;

    BigObject* obj1{objPool.acquire()};
    REQUIRE(obj1 not_eq nullptr);

    BigObject* obj2 = objPool.acquire();
    REQUIRE(obj2 not_eq nullptr);

    objPool.release(obj1);
    BigObject* obj3 = objPool.acquire();
    REQUIRE(obj3 == obj1);

    BigObject* obj4 = objPool.acquire();
    REQUIRE(obj4 == nullptr);

    objPool.release(obj2);
    BigObject* obj5 = objPool.acquire();
    REQUIRE(obj5 == obj2);

}

==============================================
All tests passed (5 assertions in 1 test case)

C++11 Features

For fun, I listed the C++11 features I used in the source code:

* auto

* std::array<T, int>

* std::begin(), std::end()

* nullptr

* lambda functions

* Brace initialization { }

* std::chrono::system_clock, std::chrono::duration_cast<T>

* std::default_delete<T>

Did I miss any features/classes? How would the C++03 implementations stack up against these C++11 implementations in terms of readability/understandability?