Index structure optimizations for in-memory data storage systems

Index structure optimizations for in-memory data storage systems

Abstract. In-memory data grid systems uses RAM as main storage device, which takes the performance of such systems on the new level. However, new in-memory systems often uses old data structures and algorithms which were proposed for relational database management systems and are not suitable for main memory, in particular, they do not take into account number of cache misses. This paper describes methods of index structure optimizations for in-memory storage solutions, which will allow to significantly reduce the number of comparatively slow reads from main memory.

Keywords: in-memory data grid, deleting key fields, rearrange fields.

Oleksandr Podrubailo, Vasyl Saverchenko

Computing techniques dept Igor Sikosky KPI

Kyiv, Ukraine


Modern information technology requires the availability of reliable, fast and convenient, data storage systems. In-memory data grid (IMDG) is potentially able to meet these requirements. The concept of IMDG does not coincide with that of traditional databases on long-term media. The main idea of such grids is the placement of data in distributed maps (so-called caches or regions) that are completely located in RAM. This allows to reduce the time of information access. Regions of distributed storage contain objects. This type of storage eliminates the costs of marshal-ling/unmarshalling when working with data in object-oriented environments. However, due to its relative novelty, the technology of distributed storage in RAM has a number of drawbacks [1]. Among them, there are inefficient data structures that are used to build IMDG indexes. [2] Over the last 10-15 years there have been many methods developed to optimize cache utilization in various areas: database [3,4], a packages of linear algebra [5,6], virtual machines and the garbage collector [7], the support software for net-work routers [8], etc. Here we propose meta-classification of common techniques to optimize the use of cache memory. We will distinguish between methods of optimization whether their application fields: algorithms or data structures. Methods from first type re-factors algorithms so that the resulting algorithms gets minimal number of cache misses due to reduction of the algorithm working set . Second type methods improve cache utilization by changing the data structure, the algorithms on this data structure (e.g., search algorithms) remains the same. In this paper we describe data structure changes which will make IMDG indexes more effective.

II. DATA STRUCTURE OPTIMIZATIONS The methods discussed in this paper assume that data structure consists of nodes. Links between nodes can be implemented by pointers. The purpose of all considered methods is to reduce the number of cache misses that occur when the scanning algorithm is executed, the generalized pseudocode of which is shown in Listing 1.

$00$ void dataScan (Node start) {

$01$ Node current = start;

$02$ while (current! = null) {

$03$ current = getNext (current);

$04$ }

$05$ }

Listing 1. A generalized algorithm for scanning a data structure

The main in this algorithm is a function getNext, which analyzes the current node and returns a null, if the search is complete, or the next node otherwise. Note that the code above is summarized key search in any data structure, hierarchical like B- and B+ tree, or search in the hash table collisions list, as well as view the array. In addition, since current changes at each iteration, and the Node size is larger than the cache block, address to current is always a reference to the new cache block. All methods of optimizing data structures are characterized the locali-ty of their usage: these methods reduce cache misses for the first call under the assumption that there is no results of previous searches in the cache.


We can achieve improvements by optimizing the use of cache block by Node. It is obvious that the smaller the size of Node gives us advantages in terms of the use of the cache block. Consider possible ways to optimize the use of cache block by Node. We assume that the Node = (field1, ..., fieldN ), where N – number of fields in Node. Fields is either the key field or a pointer to another Node. We assume that the key fields constitute the information part of the node, and pointers are needed to support the data structure.

A. Deleting key fields

Removing the key fields (fluff extraction) is proposed in [9]. Key fields that are not used in getNext functions only occupy space in the cache block and may be removed from the assembly, for example by removal into a separate data structure. Depending on the specific situation, this optimization may or may not be applicable, since it leads to the appearance of a second data structure, the connection in which links are completely repeated in the original one. Obviously, two structures occupy, in general, more memory space than one, but in some cases such conversions are beneficial.Example. Consider the ratio Persons (Age int, Name char (30), Position char (30)). Assume that the relationship is stored as an array of records and pointers are absent, i.e. Node = (Age, Name, Position). getNext is used for calculating the average age of persons, so, a single field in Node, Age, is necessary for getNext. If cache block size is 128 bytes, then one cache block can hold almost 2 nodes. Thus, during the scan occurs about |Persons|/2 of first call cache misses.We can move Name and Position fields in a separate array. Then Node = (Age), and one cache block can hold 32 nodes. The resulting number of misses of the first circulation thus decreases by more than 10 times.

B. Rearrange fields

The regrouping of fields (field reordering) is applicable in a wide range of cases, the removal of the key fields. It is not always possible to make unused Examine-Node field in a separate structure. In this case, you can try to rearrange the fields in such a way that the fields used at close moments of time are nearby in the node. This transformation concerns not only key fields, but also pointers.

Example 1. Consider another embodiment relationship Students (PassportNumber char (10), Year int, FirstName char (20), LastName char (20), Address char (50), PhoneNumber char (8), RecordBookNumber char (6), AvgGrade int), where the Year and AvgGrade - a course on which the student learns, and the average score for the current academic year, respectively. Obviously, the size of the record is 122 bytes. If the cache block size is 64, the recording takes almost two cache blocks. Assume that the scanning ratio, during which the calculated average scores of students who are studying various courses (ExamineNode in this case uses field Year a AvgGrade). since the Year a AvgGrade are in different cache blocks, ExamineNode function is, generally speaking, the two cache miss. It is therefore advantageous to reorganize the record so that Year AvgGrade and are close, e.g., Students = (PassportNumber, Year, AvgGradee, FirstName, LastName Address, PhoneNumber, RecordBookNumber). In the process under consideration operation dataScan will use only the first cache block during getNext and will lead to a single cache miss.

Example 2. Consider node of k-ary tree. In this case, the Node = (ptr0, key1, ptr1, ..., keyk, ptrk), where keyi - key field, and ptri - pointers to child nodes, and key0

C. Compression of key fields

Compression of key fields allows reducing size of key fields and, consequently, the number of occupied node cache blocks (or putting more nodes in a cache block). Often, the key fields are non-negative integers, which are located in the node in ascending order - for example, as was discussed in the previous example k-ary tree. In this case, one of the methods of compression of the increasing sequence of non-negative integers considered in [10] can be applied. Such methods are based on the fact that instead storage of real number, the difference between the number and previous number, which is the (presumably small) positive number, is stored. This difference is encoded using one of coding methods, like γ- or σ-encoding [10]. The disadvantage of this method is that after its application, key fields can only be accessed for sequential viewing, since the previous value is required to decode the next value. In addition, the coded numbers can take a non-integer number of bytes, and access to them requires a relatively slow on modern processors bit shift instructions and bitwise AND. As a means of correcting the last drawback, you can align the encoded numbers to byte boundaries, but this reduces the effect of compression [11]. Another method of compression, which is used in situations where key fields are large and variable in length (in particular, are string values), is the separation from each key value of the prefix of fixed length. Long key node in this case is replaced by a prefix and a pointer to the actual key: keyi => (prefixi , key_pointeri) (note that the key_pointeri is the information field although it contains a pointer to a string). The reason for this is the fact that in many cases, instead of a key, you can use a prefix - for example, when comparing strings, you can compare prefixes, and refer to the lines themselves only if the prefixes match.

Example 1. Consider again the node of 5-ary tree Node = (1980, 1998, 2002, 2004, 2005, ptr0, ptr1, ptr2, ptr3, ptr4, ptr5). Using compression integers, we transform the sequence of key values (1980, 18 4 2 1). When using coding integers alignment is byte for each key, starting with the second is 1 byte, so all we have saved 4 (sizeof (int) - 1) = 12 bytes.

Example 2. Let us assume key values as strings. Consider the 3-ary tree node:

Node = ( 'Ivanovsky', ' Lebedev', 'Pavlovsky', 'Petrovsky', ptr 0, ptr1, ptr2, ptr3, ptr4). Instead of rows will use the two-letter prefixes: keyi => (prefixi, key_pointeri). We’ll get: Node = ( 'Iv', 'Le,' Pa, 'Pe', key_ptr1, key_ptr2,key_ptr3 , key_ptr4 , ptr 0, ptr1, ptr2, ptr3, ptr4) As a result, we can save 3 * 9 + 7 - (3 * 2 + 3 * 4 ) = 16 bytes per node, in addition, comparing short prefixes is faster than the original rows.

D. Removing pointers

Typically, pointer fields are not used in getNext function - algorithm for the next iteration of the loop is addressed to only one of the node pointers, which is selected as a result of getNext call. By decreasing the number of pointers, you can reduce the number of cache blocks occupied by the node. Note that pointers are only needed to maintain links in the structure, so they can be replaced by another way of maintaining these links - in particular, their calculation. As a rule, the calculation of links is based on the fact that, if there is a certain regularity in the structure, the address of the child node can be calculated from the address of the parent node and the serial number of the child node among the child nodes of the parent node.

Example 1. Consider k-ary tree, Node = (key0, … , keyk, ptr1, … , ptrk). Assume that all child nodes are arranged in a sequence memory so ptri = ptr0 + i * sizeof (Node). In this case the assembly can be removed from all pointers except ptr0, thus saving (k - 1) * sizeof (ptr) = 4 (k - 1) bytes within the cache block. Of course, this requires special support for the modification of the tree, in particular, when adding new nodes and increases the total amount of memory occupied by the tree.

Example 2. In conditions of the previous example, suppose that the tree is complete and stored in memory so that all levels and all nodes on the same level are arranged sequentially. Offset of unit i at level l (l> 1) comparatively to the tree addrO in memory in this case is equal to (1 + k + ... + kl -1 + i) * sizeof (Node). This makes it possible, knowing the address of current in the algorithm 1, to determine the level l of unit current, that allows you to find the address of the child node, knowing its serial number among child nodes current. In order to avoid calculating the level l at each iteration of the algorithm 1, it can be maintained as an additional i loop variable, increasing it at each iteration. Thus, in this situation, storage of pointers to child nodes is not required at all, however such a data structure is unsuitable if new keys can be added to the tree.


In addition to optimizing the structure of nodes, optimization of the relative location of nodes can also be used. Typically, this optimization is performed after the node structure is improved because only small node size makes it possible to successfully optimize the relative location of the nodes. The purpose of such optimization is to reduce number of cache misses in each iteration of the algorithm 1. Obviously, the way to achieve this is to place the nodes so that the current and new nodes current are located within a single cache block. Of course, this optimization is possible only for small node size: sizeof (Node)

Example. Consider a 3-ary tree whose node key has two integer values: Node = (key0, key1, ptr0, ptr1, ptr2). sizeof (Node) = 2 * 4 + 3 * 4 = 20 and segment size is 128. Then we can group the nodes of even levels with their child nodes, so that the node and all its child nodes are stored sequentially in memory. In Figure 1, the groups of nodes are shown using dotted lines. The number of cache misses when searching for a key in such a tree is reduced by an average of 2 times, since cache misses do not occur when moving from an even level to an odd level. However, for a similar 4-ary tree where sizeof (Node) = 3 * 4 + 4 * 4 = 28, the node and its children will not fit in a single cache block. Nevertheless, described optimization is still effective in this case, since the probability of use getNext that part of the last node that does not fit in the cache block is small.


Showed the set of methods, which can be used in distributed in-memory date storage to reduce the number of cache misses during the index search. All of them requires refactor of index structure, which may be not suitable for existing data storages with large volume of legacy code and necessity to provide the backward compatibility with old versions. However, usage of optimized index structured can be good solution for ne in-memory data grids, because reducing of cache misses significantly decreases the usage of RAM, which is much slower than processor cache.


$[1]$ Busowski O., Podrubailo O. Actual problems of in-memory data grid. // News of the National Aviation Administration. Problems of information and management - 2015 - №2(50) 2015 - P. 36-43 (in Ukrainian)

$[2]$ Shaporenkov D. Effective methods of data indexing and querying in the database control systems in main memory – PhD research - St. Petersburg, 2006 (in Russian)

$[3]$ Rao J., Ross K. A. Cache Conscious Indexing for Decision-Support in Main Memory. // Proc. of the VLDB 1999 Conference. — Morgan Kaufmann, 1999. - P. 78-89.

$[4]$ Rao J., Ross K. A. Making B -Trees Cache Conscious in Main Memory. // Proc. of the SIGMOD 2000 Conference. - ACM, 2000. - P 475-486.

$[5]$ Nonlinear Array Layouts for Hierarchical Memory Systems. / S. Chatterjee, V. V. Jain, A. R. Lebeck et al. // Proc, of the ICS 1999 Conference. - 1999. - P. 444-453.

$[6]$ Chatterjee S., Sen S. Cache-Efficient Matrix Transposition. // Proc. of the HPCA 2000 Conference. - 2000. - P. 195-205.

$[7]$ Chilimbi T. M.. Parus J. R. Using Generational Garbage Collection To Implement Cache-Conscious Data Placement. // Proc. of the ISMM 1998 Conference. - 1998. - P. 37-48.

$[8]$ P. Patros, D. Dilli, K. B. Kent, M. Dawson, T. Watson, "Multitenancy benefits in application servers", Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering, P. 111-118, 2015.

$[9]$ A. Sabeeh et al., "A hybrid intelligent approach for optimising software-defined networks performance", Proc. ICICM, P. 47-51, Oct. 2016.

$[10]$ Witten I., Moffat A., Bell T. Managing Gigabytes : Compressing and Indexing Documents and Images. — 2nd edition. — Morgan Kaufmann, 1999.

$[11]$ Data Compression in Full-Text Retrieval Systems. / T. C. Bell, A. Moffat, G. Nevill-Manning et al. // -IASIS. — 1993. — Vol. 44, no. 9. — P. 508 531

May 17, 2018