© Copyright IBM Corporation 2009
ALL RIGHTS RESERVED

Using BE with the Really Small Message Broker

When Really Small Message Broker hits a problem, it writes a file that contains the full state of the system. This is a binary data file that requires special decoding in order to extract information from it. This decoding is done with a tool called BE - Andy's Binary Folding Editor.

BE is a freely available tool that takes an ini file which describes the structure of the data to parse and then provides an interactive console to view and explore it.

BE Dependencies

The following things are needed to run BE:

rsmb.ini

The ini file is built from the .h files of RSMB. Wherever a data structure is defined in the C code, a matching BE definition is also provided. It is important that the two are kept in sync - any change to one must be reflected in the other. A basic guide for doing this will be written shortly.

To generate the ini file, a Perl script is provided that extracts the BE definitions from the .h files. It is located in rsmb/src/tools/be/be.pl. The easiest way to run it is:

cd /src/dir/containing/broker.h
perl tools/be/be.pl

By default, the tool will look for broker.h in the current directory and create rsmb.ini there as well. This can be changed, for example:

perl be.pl -s /src/dir/containing/ -o /tmp/rsmb.ini

bersmb.so

To be updated for Windows

The source for the shared library is located in rsmb/src/tools/be/bersmb.c and rsmb/src/tools/be/bememext.h. The header file is the standard BE one. bersmb.c is the RSMB specific code. A Makefile is also provided, so compiling the extension is as simple as invoking make.

Once built, bersmb.so needs to be put where BE can find it. On Linux, BE locates shared libraries by searching the directories in LD_LIBRARY_PATH, the directories listed in the /etc/ld.so.cache file, and the /usr/lib and /lib directories.

Generating a heapdump

Currently, a heapdump can be triggered on Linux by sending a SIGUSR1 signal (signal number 10 on x86 Linux) to the rsmb process:

$ pgrep rsmb
12345
$ kill -10 12345

This will change in the near future.
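
For readers unfamiliar with signal-triggered dumps, the general pattern can be sketched in C as follows. This is not RSMB's actual code: the names heapdump_requested, on_sigusr1 and install_heapdump_trigger are hypothetical, and the sketch assumes the broker would write the dump from its main loop once the flag is set.

```c
#include <signal.h>

/* Hypothetical flag, set by the handler and polled by the main loop. */
static volatile sig_atomic_t heapdump_requested = 0;

/* Handlers should do as little as possible: just record the request. */
static void on_sigusr1(int signo)
{
    (void)signo;
    heapdump_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_heapdump_trigger(void)
{
    return signal(SIGUSR1, on_sigusr1) == SIG_ERR ? -1 : 0;
}
```

Keeping the handler down to a single flag assignment avoids calling functions that are not safe inside a signal handler; the expensive work of writing the file happens later, outside the handler.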

Running BE

Assuming all the necessary files are in the current directory, one method of running the tool is:

LD_LIBRARY_PATH=. be -i rsmb.ini rsmb\!rsmb.heapdump.20091124.213117.363.dmp

Navigating in BE

BE is very powerful and can be slightly overwhelming at first. Here is a very brief rundown of how to navigate, to help you get started:

Linked Lists

BE is capable of recognising a Linked List structure and displaying all of the entries rather than individual ones. For example, to look at the log entries:

  1. From the start screen, select the Log Buffer structure and press Enter.
    Broker States                 : BROKERSTATES
    Log Buffer                    : 0x0865f090->{14}
    Trace Buffer                  : 0x0865f038->{2063}
    Config file                   : 0x08056e4a->{"broker.cfg" z}
    sockets                       : SOCKETS
    
  2. This displays the LinkedList structure.
    first   : ( 0x0865f210->STRINGItem )
    last    : ( 0x0866f030->STRINGItem )
    current : ( 0x00000000->STRINGItem )
    count   : 14
    size    : ( 0x0000053e )
    
    Select the 'first' entry and press Enter.
  3. This displays the first element in the list.
    next   : ( 0x0865f280->STRINGItem )
    prev   : ( 0x00000000->STRINGItem )
    STRING : 0x0865f188->STRING
    
    To see the actual entry, you could select the STRING entry and press Enter, or you could press + on the entry to expand its detail.
    next   : ( 0x0865f280->STRINGItem )
    prev   : ( 0x00000000->STRINGItem )
    STRING : 0x0865f188->{"(0000) 20091124 213105.329 CWNAN9999I Really Small Message Broker " z}
    
  4. To see the next entry, you could select the next entry and press Enter, but that soon becomes tedious when the list has more than a handful of entries. This is where BE's linked list awareness comes in. Select the next entry and press Alt-l. BE then displays all of the entries in the list in one go:
    [0x000] : STRINGItem
    [0x001] : STRINGItem
    [0x002] : STRINGItem
    [0x003] : STRINGItem
    [0x004] : STRINGItem
    [0x005] : STRINGItem
    [0x006] : STRINGItem
    [0x007] : STRINGItem
    [0x008] : STRINGItem
    [0x009] : STRINGItem
    [0x00a] : STRINGItem
    [0x00b] : STRINGItem
    [0x00c] : STRINGItem
    [0x00d] : STRINGItem
    ... can't follow null pointer at 0x0866f034
    
    As before, pressing + will expand the detail of the entries. This time, you have to press + twice to get to the right level of detail:
    [0x000] : {0x0865f188->{"(0000) 20091124 213105.329 CWNAN9999I Really Small Message Broker " z}}
    [0x001] : {0x0865f7c0->{"(0001) 20091124 213105.329 CWNAN9998I Optional features included: bridge" z}}
    [0x002] : {0x0865f860->{"(0002) 20091124 213105.329 CWNAN9997I Licensed Materials - Property of IBM" z}}
    [0x003] : {0x0865f938->{"(0003) 20091124 213105.329 CWNAN9996I (C) Copyright IBM Corp. 2007, 2009 All Rights Reserved" z}}
    [0x004] : {0x0865fa28->{"(0004) 20091124 213105.329 CWNAN9995I US Government Users Restricted Rights - Use, duplication or di"}}
    [0x005] : {0x0865fb58->{"(0005) 20091124 213105.329 CWNAN9994I Version 1.2.0, Fri Oct 2 17:45:25 2009" z}}
    [0x006] : {0x0865fc38->{"(0006) 20091124 213105.329 CWNAN9993I Author: Ian Craggs (icraggs@uk.ibm.com)" z}}
    [0x007] : {0x08663df8->{"(0018) 20091124 213105.329 CWNAN0006I Adding value "127.0.0.1:1885" to list "address"" z}}
    [0x008] : {0x08663ee0->{"(0019) 20091124 213105.329 CWNAN0008W Unrecognized configuration keyword protocol on line no 10" z}}
    [0x009] : {0x08667da0->{"(0052) 20091124 213105.330 CWNAN0014I MQTT protocol starting, listening on port 1884" z}}
    [0x00a] : {0x0866a1c0->{"(0024) 20091124 213106.327 CWNAN0124I Starting bridge connection foo" z}}
    [0x00b] : {0x0866bcc8->{"(0030) 20091124 213106.328 CWNAN0130E Connect for bridge client origami.foo, address 127.0.0.1:1885,"}}
    [0x00c] : {0x0866d7a0->{"(0012) 20091124 213106.329 CWNAN0099I Retrying bridge connection foo, address 127.0.0.1:1885 without"}}
    [0x00d] : {0x0866ef68->{"(0043) 20091124 213106.329 CWNAN0130E Connect for bridge client origami.foo, address 127.0.0.1:1885,"}}
    ... can't follow null pointer at 0x0866f034
    
  5. You could at this point press p to write this information to a file.
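
What Alt-l does can be pictured as the usual walk over a doubly linked list. The following is a minimal sketch in C, with struct layouts inferred only from the fields BE displays above (first, last, current, count, size and next, prev, content). SketchList, SketchElement and list_walk are hypothetical names, and the real definitions in LinkedList.h may differ.

```c
#include <stddef.h>

/* Hypothetical element: mirrors the next/prev/content fields BE shows. */
typedef struct SketchElement
{
    struct SketchElement *prev;
    struct SketchElement *next;
    const char *content;
} SketchElement;

/* Hypothetical list header: mirrors first/last/current/count/size. */
typedef struct
{
    SketchElement *first;
    SketchElement *last;
    SketchElement *current;
    int count;
    size_t size;
} SketchList;

/* Visit every element from 'first', following 'next' until NULL.
   Returns the number of elements visited. */
static int list_walk(const SketchList *list,
                     void (*visit)(int index, const char *content))
{
    int index = 0;
    for (const SketchElement *e = list->first; e != NULL; e = e->next)
    {
        if (visit != NULL)
            visit(index, e->content);
        index++;
    }
    return index;
}
```

The walk stops when it reaches a NULL next pointer, which is the point at which BE prints its "can't follow null pointer" line.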

BE Definitions

It is vital that the BE definitions in the code are maintained alongside the structures they represent. To help understand how the two relate, here is an example from Users.h.

Here is a set of C structures used to represent a User. It includes pointers to a username, a password and a list of Rule types, each of which contains a topic and a permission value.

#define ACL_FULL 0
#define ACL_WRITE 1
#define ACL_READ 2 

typedef struct
{
   char* topic;
   int permission;
} Rule;

typedef struct
{
   char* username;
   char* password;
   List* acl;
} User;

To begin writing the BE definition for this we use a BE comment, which is what the Perl script looks for when extracting the definitions:

/*BE
...
BE*/
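
To make the role of the markers concrete, here is a minimal sketch in C of pulling the text out from between them. It is not how be.pl is implemented (the real script is Perl and also handles include ordering). extract_be_block is a hypothetical helper, and it only looks at the first block in the input.

```c
#include <stddef.h>
#include <string.h>

/* Copy the text between the first BE comment markers in 'src' into 'out'.
   Returns the number of bytes copied, or -1 if no complete block is
   found or 'out' is too small. */
static long extract_be_block(const char *src, char *out, size_t out_size)
{
    const char *start = strstr(src, "/*BE");
    if (start == NULL)
        return -1;
    start += strlen("/*BE");

    const char *end = strstr(start, "BE*/");
    if (end == NULL)
        return -1;

    size_t len = (size_t)(end - start);
    if (len + 1 > out_size)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return (long)len;
}
```

Because the definitions live inside an ordinary C comment, the compiler ignores them, while a tool scanning for the markers can recover them unchanged.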

As the User struct includes a List type, we need to ensure all of the relevant definitions have been included from LinkedList.h. This is done with an include statement. Again, this is used by the Perl script to ensure the .h files are parsed in the right order, because BE requires things to be defined before they are used.

include "LinkedList"

Next we define a map to turn the permission values to human-readable text. This is equivalent to an enum in C.

map permission
{
   "full" .
   "write" .
   "read"  .
}
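
For comparison, the same lookup written in C might look like the sketch below. It assumes the map entries are numbered in order from 0, so that they line up with ACL_FULL, ACL_WRITE and ACL_READ. permission_name is a hypothetical helper and not part of RSMB.

```c
#include <stddef.h>

#define ACL_FULL 0
#define ACL_WRITE 1
#define ACL_READ 2

/* Hypothetical helper: returns the label the map would show for a value. */
static const char *permission_name(int permission)
{
    static const char *const names[] = { "full", "write", "read" };
    size_t count = sizeof names / sizeof names[0];

    if (permission < 0 || (size_t)permission >= count)
        return "unknown";
    return names[permission];
}
```

The order of the labels matters: if a new ACL_ value is added to the C code, the map needs a matching entry in the same position.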

With that in place, we can define the Rule struct.

def RULE
{
   n32 ptr STRING open "topic"
   n32 map permission "permission"
}

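You can see that each entry in the C Rule struct maps to a BE one. Reading the first line from left to right: n32 declares a 32-bit value, ptr STRING marks that value as a pointer to a STRING, and "topic" is the label shown for the field. The remaining keyword, open, is a BE field attribute; see the BE documentation for its exact effect.

The second line follows much the same pattern. This time, however, it is a 32-bit value that can be decoded using the permission map.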

Having defined RULE, we know that the User struct includes a pointer to a List of Rules. So that BE can properly decode the list, we need to define a type for "list-of-RULEs". There is a BE macro available that does this for us. In this instance it defines a type called RULEList that can be used later on.

defList(RULE)

Finally we can define the User struct.

def USER
{
   n32 ptr STRING open "username"
   n32 ptr STRING open "password"
   n32 ptr RULEList open "acl"
}
defList(USER)

Here you can see the acl entry is defined as a 32-bit pointer to a RULEList type. Also note that the defList macro is used to define USERList, as we know that other data structures include lists of Users. It is good practice to define the List type immediately after the original definition, and to define it only for types that are known to be used in a list.
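
One way to sanity-check that a definition and its struct agree is to compare sizes: every n32 line describes 4 bytes, so RULE covers 8 bytes and USER covers 12. The sketch below mirrors that layout with fixed-width integers so the check holds on any host. It assumes the heapdump comes from a 32-bit build in which pointers and ints are 4 bytes each. DumpedRule and DumpedUser are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of Rule and User as laid out in a 32-bit dump. */
typedef struct
{
    uint32_t topic;      /* n32 ptr STRING open "topic" */
    uint32_t permission; /* n32 map permission "permission" */
} DumpedRule;

typedef struct
{
    uint32_t username; /* n32 ptr STRING open "username" */
    uint32_t password; /* n32 ptr STRING open "password" */
    uint32_t acl;      /* n32 ptr RULEList open "acl" */
} DumpedUser;

/* Each n32 field is 4 bytes, in declaration order. */
_Static_assert(sizeof(DumpedRule) == 8, "RULE has two n32 fields");
_Static_assert(sizeof(DumpedUser) == 12, "USER has three n32 fields");
_Static_assert(offsetof(DumpedUser, acl) == 8, "acl is the third field");
```

If a field is added to the C struct without a matching line in the definition, a check of this kind no longer adds up, which is the same mismatch BE would hit when decoding the dump.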

BE also supports equivalents of #ifdef. For example, from Bridge.h:

def BRIDGETOPICS
{
   n32 ptr STRING open "pattern"
   n32 ptr STRING open "localPrefix"
   n32 ptr STRING open "remotePrefix"
   n32 map BRIDGE_TOPIC_DIRECTION "direction"
$ifdef MQTTS
   n32 map bool "subscribed"
$endif
}

This allows a single rsmb.ini to be used against RSMB regardless of which options were included at compile time. That said, when running BE, it is vital to specify the same -D options as were used to compile the binary. If not, the structure definitions will not match those in the heap dump.
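
To see why a mismatch matters, consider how the size of the structure changes with the option. The sketch below is a hypothetical mirror of BRIDGETOPICS built from the definition above, not the actual struct from Bridge.h. It assumes a 32-bit dump in which each n32 field occupies 4 bytes.

```c
#include <stdint.h>

/* Hypothetical mirror of BRIDGETOPICS as laid out in a 32-bit dump. */
typedef struct
{
    uint32_t pattern;
    uint32_t localPrefix;
    uint32_t remotePrefix;
    uint32_t direction;
#if defined(MQTTS)
    uint32_t subscribed; /* present only in builds compiled with -DMQTTS */
#endif
} DumpedBridgeTopics;

/* Size that a decoder would expect under the current compile options. */
static unsigned bridge_topics_size(void)
{
    return (unsigned)sizeof(DumpedBridgeTopics);
}
```

Without MQTTS the structure is four fields long; with it, five. A decoder that assumes the wrong variant reads every following field and structure at the wrong offset.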