The GT.M database structure is hierarchical, based on a form of balanced tree called a B-star tree (B*-tree) structure. The B*-tree contains blocks that are either index or data blocks. An index block contains pointers used to locate data in data blocks, while the data blocks actually store the data. Each block contains a header and records. Each record contains a key and data.

GDS structures the data into multiple B*-trees. GT.M creates a new B*-tree, called a Global Variable Tree (GVT), each time the application defines a new named global variable. Each GVT stores the data for one named global, that is, all global variables (gvn) that share the same unsubscripted global name. For example, ^A, ^A(1), ^A(2), ^A("A"), and ^A("B") are stored in the same GVT because they all share the unsubscripted global name ^A. A GVT contains both index and data blocks and can span several levels. The data blocks contain actual global variable values, while the index blocks point to the next level of block.

At the root of the B*-tree structure is a special GDS tree called a Directory Tree (DT). The DT contains pointers to the GVTs. A data block in the DT contains an unsubscripted global variable name and a pointer to the root block of that global variable's GVT.

All GDS blocks in the trees have level numbers. Level zero (0) identifies the terminal nodes (that is, data blocks). Levels greater than zero (0) identify non-terminal nodes (that is, index blocks). The highest level of each tree identifies the root. All the B*-trees have the same structure. Block one (1) of the database always holds the root block of the Directory Tree.
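The lookup path described above (Directory Tree first, then the Global Variable Tree, descending until level 0) can be illustrated with a small in-memory model. This is only a sketch, not GT.M code: the `Block` class, the `search` and `lookup` helpers, the tuple keys, and the convention that each index record carries the largest key reachable through its pointer are all assumptions made for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Block:
    level: int                                   # 0 = data block, >0 = index block
    records: list = field(default_factory=list)  # sorted (key, payload) pairs


def search(blocks, root, key):
    """Descend from block `root` to level 0 and return the payload for `key`, or None."""
    blk = blocks[root]
    while blk.level > 0:
        keys = [k for k, _ in blk.records]
        # Toy convention (assumption): an index record's key is an upper bound
        # for its child; the last record catches every larger key.
        i = min(bisect_left(keys, key), len(keys) - 1)
        blk = blocks[blk.records[i][1]]
    return dict(blk.records).get(key)


def lookup(blocks, name, key):
    """Find the GVT root through the Directory Tree, then search the GVT."""
    gvt_root = search(blocks, 1, name)           # block 1 holds the DT root
    if gvt_root is None:
        return None
    return search(blocks, gvt_root, key)


blocks = {
    1: Block(1, [(("A",), 2)]),                          # DT root (index block)
    2: Block(0, [(("A",), 3)]),                          # DT data block: name -> GVT root
    3: Block(1, [(("A", "M"), 4), (("A", "Z"), 5)]),     # GVT root (index block)
    4: Block(0, [(("A", "B"), "x"), (("A", "M"), "y")]), # GVT data block
    5: Block(0, [(("A", "Name"), "Brad")]),              # GVT data block
}

print(lookup(blocks, ("A",), ("A", "Name")))
```

The same `search` routine serves both trees, which mirrors the statement that all the B*-trees have the same structure; only the payload at level 0 differs (a root block number in the DT, a value in a GVT).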

The following illustration describes the internal GDS B*-tree framework GT.M uses to store globals.

GT.M creates a new GVT when a SET makes the first use of an unsubscripted global name, that is, when the SET refers to a subscripted or unsubscripted global variable whose name has not previously appeared in the database.

Important

GVTs continue to exist even after all nodes associated with their unsubscripted name are KILLed. An empty GVT occupies negligible space and does not affect GT.M performance. However, if you face performance issues because you have many empty GVTs, reorganize your database file using MUPIP EXTRACT, followed by MUPIP CREATE, and then MUPIP LOAD to remove those empty GVTs.

The following sections describe the details of the database structures.

Records consist of a record header, a key, and either a block pointer or the actual value of a global variable name (gvn). Records are also referred to as nodes.

The record header has two fields that contain information. The first field, of two bytes, specifies the record size. The second field, of one byte, specifies the compression count.

Note

Depending on the platform an extra byte may be added to the compression count, allowing compression counts of up to 1020.

The interpreted form of a block with global ^A("Name",1)="Brad" looks like the following:

Rec:1  Blk 3  Off 10  Size 14  Cmpc 0  Key ^A("Name",1) 
      10 : | 14  0  0 61 41  0 FF 4E 61 6D 65  0 BF 11  0  0 42 72 61 64| 
           |  .  .  .  a  A  .  .  N  a  m  e  .  .  .  .  .  B  r  a  d| 
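As a rough illustration of the record layout, the bytes of the dump above can be split into header, key, and value. This is a sketch under stated assumptions, not the GT.M implementation: the helper name `parse_record` is hypothetical, the size field is read as little-endian and the header is taken to occupy four bytes because that is what this particular dump shows, and the record is assumed to be uncompressed (Cmpc 0) so that the key ends at the first pair of null bytes.

```python
def parse_record(raw: bytes, header_len: int = 4):
    """Split one uncompressed record into (size, cmpc, key, value)."""
    size = int.from_bytes(raw[0:2], "little")    # record size, two bytes
    cmpc = raw[2]                                # compression count, one byte
    # The key ends with the last subscript's terminator plus a key terminator,
    # that is, two consecutive null bytes (assumption based on the dump).
    end = raw.index(b"\x00\x00", header_len) + 2
    key = raw[header_len:end]
    value = raw[end:size]
    return size, cmpc, key, value


# Bytes copied from the dump of ^A("Name",1)="Brad"
raw = bytes.fromhex("14000061" "4100FF4E616D6500BF110000" "42726164")
size, cmpc, key, value = parse_record(raw)
print(hex(size), cmpc, key.hex(" "), value)
```

The size field (hexadecimal 14, that is, 20 bytes) covers the whole record, which is why the value is simply whatever remains between the end of the key and `size`.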

The data portion of a record in any index block consists of a four-byte block pointer. Level 0 data in the Directory Tree also consists of four-byte block pointers. Level 0 data in Global Variable Trees consists of the actual values for global variable names.

GT.M stores string subscripts as a variable length sequence of 8-bit codes ranging from 0 to 255. With UTF-8 specified at process startup, GT.M stores string subscripts as a variable length sequence of 8-bit codes with UTF-8 encoding.

To distinguish strings from numerics while preserving collation sequence, GT.M adds a byte containing hexadecimal FF to the front of all string subscripts. The interpreted form of the global variable ^A("Name",1)="Brad" looks like the following:

Block 3   Size 24   Level 0   TN 1 V5 
  
Rec:1  Blk 3  Off 10  Size 14  Cmpc 0  Key ^A("Name",1) 
      10 : | 14  0  0 61 41  0 FF 4E 61 6D 65  0 BF 11  0  0 42 72 61 64| 
           |  .  .  .  a  A  .  .  N  a  m  e  .  .  .  .  .  B  r  a  d| 

Note that hexadecimal FF is in front of the subscript "Name". GT.M permits the use of the full range of legal characters in keys. Therefore, a null (ASCII 0) is an acceptable character in a string. GT.M handles strings with embedded nulls by mapping 0x00 to 0x0101 and 0x01 to 0x0102. GT.M treats 0x01 as an escape code. This resolves confusion when null is used in a key, and at the same time, maintains proper collating sequence. The following rules apply to character representation:

All codes except 00 and 01 represent the corresponding ASCII value.

00 is a terminator.

01 is an indicator to translate the next code using the following:

Code    Means    ASCII
01      00       <NUL>
02      01       <SOH>
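The rules above can be expressed as a pair of small functions. This is a minimal sketch rather than GT.M source: the function names are hypothetical, and the input is treated as raw bytes (M mode), with the FF prefix and the 00 terminator applied as described in the text.

```python
def encode_string_subscript(s: bytes) -> bytes:
    """Prefix FF, escape 00 and 01 with the 01 escape code, terminate with 00."""
    out = bytearray([0xFF])
    for b in s:
        if b == 0x00:
            out += b"\x01\x01"       # <NUL> becomes 01 01
        elif b == 0x01:
            out += b"\x01\x02"       # <SOH> becomes 01 02
        else:
            out.append(b)
    out.append(0x00)                 # subscript terminator
    return bytes(out)


def decode_string_subscript(enc: bytes) -> bytes:
    """Reverse encode_string_subscript."""
    if len(enc) < 2 or enc[0] != 0xFF or enc[-1] != 0x00:
        raise ValueError("not an encoded string subscript")
    out = bytearray()
    i = 1
    while i < len(enc) - 1:
        if enc[i] == 0x01:           # escape: the next code minus one is the byte
            out.append(enc[i + 1] - 1)
            i += 2
        else:
            out.append(enc[i])
            i += 1
    return bytes(out)


print(encode_string_subscript(b"Name").hex(" "))
print(encode_string_subscript(b"a\x00b").hex(" "))
```

Because an escaped null becomes 01 01, which still sorts above the 00 terminator and below every ordinary character, byte-wise comparison of the encoded subscripts keeps the expected collating order.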

With UTF-8 character-set specified, the interpreted output displays a dot character for all non-graphic characters and malformed characters. For example, the internal representation of the global variable ^DS=$CHAR($$FUNC^%HD("0905"))_$ZCHAR(192) looks like the following:

Rec:1  Blk 3  Off 10  Size C  Cmpc 0  Key ^DS 
      10 : |  C  0  0  0 44 53  0  0 E0 A4 85 C0                        | 
           |  .  .  .  .  D  S  .  .        अ  .                        | 

Note that DSE displays the well-formed character अ for $CHAR($$FUNC^%HD("0905")) and a dot character for the malformed character $ZCHAR(192).

With M character-set specified, the interpreted output displays a dot character for all non-ASCII characters and malformed characters.
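The two-line interpreted form for the M character set can be approximated as follows. This is an illustrative sketch only, not how DSE is implemented: the helper names are hypothetical, and it assumes that "ASCII characters" here means the printable range hexadecimal 20 to 7E and that each byte occupies a three-character cell, as in the dumps shown in this section.

```python
def hex_cells(raw: bytes) -> str:
    """Upper-case hex without leading zeros, one three-character cell per byte."""
    return "".join(format(b, "X").rjust(3) for b in raw)


def char_cells(raw: bytes) -> str:
    """Printable ASCII as itself, everything else as a dot (M character set)."""
    cells = []
    for b in raw:
        ch = chr(b) if 0x20 <= b <= 0x7E else "."
        cells.append(ch.rjust(3))
    return "".join(cells)


# Record bytes of ^A("Name",1)="Brad" from the dump earlier in this section
raw = bytes.fromhex("140000614100FF4E616D6500BF11000042726164")
print("|" + hex_cells(raw) + "|")
print("|" + char_cells(raw) + "|")
```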

Numeric subscripts have the format:

[ sign bit ] [ biased exponent ] [ normalized mantissa ] 

The sign bit and biased exponent together form the first byte of the numeric subscript. Bit seven (7) is the sign bit. Bits <6:0> comprise the exponent. The remaining bytes preceding the subscript terminator of one null (ASCII 0) byte represent the variable length mantissa. The following description shows a way of understanding how GT.M converts each numeric subscript type to its internal format:

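Zero (0) subscript (special case)

GT.M represents zero as a single byte with the hexadecimal value 80; no other conversion is required.

Mantissa

GT.M normalizes the number by adjusting the exponent, packs the decimal digits two per byte, appends a zero (0) if the number of digits is odd, and adds one (1) to each byte of the mantissa.

Exponent

GT.M stores the exponent in the first byte of the subscript, biased by adding hexadecimal 3F. The resulting exponent falls in the hexadecimal range 3F to 7D if positive, and zero (0) to 3E if negative.

Sign

GT.M sets the sign bit <7> of the first byte. If the number is negative, GT.M converts each byte of the subscript, including the exponent byte, to its one's complement and appends a byte containing hexadecimal FF to the mantissa.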

For example, the interpreted representation of the global ^NAME(.12,0,"STR",-34.56) looks like the following:

Rec:1  Blk 5  Off 10  Size 1A  Cmpc 0  Key ^NAME(.12,0,"STR",-34.56) 
      10 : | 1A  0  0 61 4E 41 4D 45  0 BE 13  0 80  0 FF 53 54 52  0 3F| 
           |  .  .  .  a  N  A  M  E  .  .  .  .  .  .  .  S  T  R  .  ?| 
      24 : | CA A8 FF  0  0 31                                          | 
           |  .  .  .  .  .  1                                          | 

Note that the one's complement of CA A8 is 35 57; subtracting one (1) from each byte of the mantissa then yields 34 56.

Similarly, the interpreted representation of ^NAME(.12,0,"STR",-34.567) looks like the following:

Rec:1  Blk 5  Off 10  Size 1B  Cmpc 0  Key ^NAME(.12,0,"STR",-34.567) 
      10 : | 1B  0  0  9 4E 41 4D 45  0 BE 13  0 80  0 FF 53 54 52  0 3F| 
           |  .  .  .  .  N  A  M  E  .  .  .  .  .  .  .  S  T  R  .  ?| 
      24 : | CA A8 8E FF  0  0 32                                       | 
           |  .  .  .  .  .  .  2                                       | 

Note that because the mantissa has an odd number of digits, GT.M appends a zero (0) to the mantissa before adding one (1) to each byte of the mantissa.
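The numeric encoding can be reproduced from the dumps in this section. The following is a sketch inferred from those examples, not GT.M source code: the function name is hypothetical, the bias of hexadecimal 3F and the one's-complement handling of negative numbers are read off the bytes shown above, and no attempt is made to enforce GT.M's numeric precision limits.

```python
from decimal import Decimal


def encode_numeric_subscript(text: str) -> bytes:
    """Encode a canonical number as [sign+exponent byte][mantissa bytes][00]."""
    d = Decimal(text)
    if d == 0:
        return b"\x80\x00"                       # zero is the single byte 80
    sign, digits, exp = d.normalize().as_tuple()
    digits = list(digits)
    exponent = exp + len(digits) - 1             # power of ten of the leading digit
    biased = exponent + 0x3F                     # bias by hexadecimal 3F
    if not 0 <= biased <= 0x7D:
        raise ValueError("exponent out of range")
    if len(digits) % 2:
        digits.append(0)                         # pad an odd digit count with zero
    # Pack two decimal digits per byte, then add one to each byte
    mantissa = [(hi << 4 | lo) + 1 for hi, lo in zip(digits[0::2], digits[1::2])]
    body = [0x80 | biased] + mantissa            # set sign bit <7>
    if sign:
        # Negative: one's complement of every byte, then append FF
        body = [b ^ 0xFF for b in body] + [0xFF]
    return bytes(body + [0x00])                  # subscript terminator


for n in (".12", "0", "1", "-34.56", "-34.567"):
    print(n, encode_numeric_subscript(n).hex(" ").upper())
```

For the inputs above the function yields BE 13 00, 80 00, BF 11 00, 3F CA A8 FF 00, and 3F CA A8 8E FF 00, which match the subscript bytes in the dumps of ^NAME(.12,0,"STR",-34.56), ^NAME(.12,0,"STR",-34.567), and ^A("Name",1).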
