These are some notes on the internals of RNetica which hopefully will
be of use to anybody trying to write extensions.

1.  R handles for Netica objects.

The key Netica functions return pointers for two kinds of Netica
objects:  nets (net_bn) and nodes (node_bn).  Both nets and nodes
support a UserData field which is a (void *) pointer.  The basic idea
of the system is that when a Netica net or node is created, a
corresponding R object (class NeticaBN or NeticaNode) is created and
installed in the UserData field.  The R object has a pointer to the
net or node installed in one of its attributes (an external pointer,
read with R_ExternalPtrAddr()) and the Netica object has the R object
installed in its UserData field (SEXPs are pointers, so this is
straightforward).  The function is.active() tests for a null
pointer, which indicates that the corresponding Netica object is not
present.
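
As a rough illustration of the two-way link, here is a small
self-contained C sketch.  It uses mock types so that it compiles
without the R or Netica headers:  mock_node, mock_robj, link_pair(),
mock_is_active() and unlink_pair() are hypothetical names, not part of
RNetica or the Netica API.  In the real code the forward link is an R
external pointer and the back link is the Netica UserData field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: mock_node plays the role of node_bn,
   mock_robj plays the role of the NeticaNode R object. */
typedef struct mock_node {
    const char *name;
    void *user_data;      /* back pointer to the R-side object */
} mock_node;

typedef struct mock_robj {
    mock_node *handle;    /* forward pointer; NULL means inactive */
} mock_robj;

/* Install the pointers in both directions. */
void link_pair(mock_robj *r, mock_node *n) {
    assert(r != NULL && n != NULL);
    r->handle = n;
    n->user_data = r;
}

/* Same idea as is.active(): a null handle means the Netica
   object is not present. */
int mock_is_active(const mock_robj *r) {
    return r != NULL && r->handle != NULL;
}

/* When the Netica object goes away, break both links. */
void unlink_pair(mock_robj *r) {
    assert(r != NULL);
    if (r->handle != NULL) {
        r->handle->user_data = NULL;
        r->handle = NULL;
    }
}
```

After unlink_pair() the R-side object still exists, but
mock_is_active() reports it as inactive, which is the state R objects
are left in when Netica is unloaded.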

UPDATE (This text refers to versions of RNetica prior to 0.5, tagged as
RNeticaS3 in svn):

NeticaNode and NeticaBN objects are preserved (R_PreserveObject) when
created and released (R_ReleaseObject) when the corresponding Netica
object is deleted.  Generally speaking these objects don't need to be
protected (as they are already on the precious list).

The equality functions for NeticaBN and NeticaNode objects test the
pointers (if the objects are active) and only return true if the
pointers are the same.

Internally, both NeticaBN and NeticaNode objects are created by taking
the name of the net or node, attaching the class, the handle to the
Netica object, and other class specific data.  This means that
as.character(net) or as.character(node) will usually return the name
of the net or node.  

When nodes and nets are renamed, RNetica returns a modified NeticaNode
or NeticaBN object that is based around the new name.  However, due to
R's copy on modify policy, R insists on copying rather than modifying
the handle, so stale copies of the objects could exist for which
as.character(node) != NodeName(node).  This can be fixed with
NodeName(node) <- NodeName(node) (and similar for networks).

Note that when Netica is unloaded (or R is shut down) all pointers will
become null and need to be reestablished.

UPDATE (RNetica 0.5 and beyond):
Starting with Version 0.5 Netica objects are now associated with R6
reference classes.  Each has an external pointer field and other
appropriate fields.  For the most part the fields of these objects are
self-explanatory.  The constructors for these (particularly, the node
and network objects) are called from within the C code as needed.

[Note that this required some hacks on my part to access the fields
and constructors of R6 objects from C code.  The functions RX...
access the fields.
Hopefully, these will be stable across R versions.

extern SEXP RX_do_RC_field(SEXP obj, SEXP name);
extern SEXP RX_do_RC_field_assign(SEXP obj, SEXP name, SEXP value);
extern int RX_has_RC_field(SEXP obj, SEXP name);
#define GET_FIELD(x, what)       RX_do_RC_field(x, what)
#define SET_FIELD(x, what, value)  RX_do_RC_field_assign(x, what, value)
#define HAS_FIELD(x, what)       RX_has_RC_field(x, what)
]

This should cure a problem that was seen with version 0.4, where
various R functions (particularly, c()) stripped the attributes from
the strings resulting in a string where we expected a node or a
network.

Another difference is that I am no longer relying on the Netica user
data to store the back pointers to the R objects.  There seemed to be
a problem where the back pointers were pointing to the wrong symbol
in the R workspace.  So I've gone to a different rule which should
work better.  Netica Nets are now registered as symbols (corresponding
to the Netica name) in the Netica Session object in a special
environment stored in the field $nets.  Netica Nodes are now
registered as symbols (corresponding to the node name) in a special
environment stored in the network's field $nodes.  When a Netica function
returns a pointer to a Netica object (particularly, a network or a
node) then RNetica first searches for the network or node by name in
the enclosing environment (Net or Session).  If found, then the
pointers are checked to make sure they are the same and that object is
returned.  If it is not found, a new R object of the appropriate type
is created.
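
A minimal sketch of this lookup rule, in plain C with mock types
(mock_node, mock_robj, mock_cache, cache_find() and get_or_make() are
all hypothetical names; the real cache is an R environment, not an
array).  The notes above do not say what happens when the name is
found but the pointers differ; the sketch assumes the cached entry is
simply rebound to the new pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 16

typedef struct mock_node { const char *name; } mock_node; /* node_bn stand-in */

typedef struct mock_robj {   /* stand-in for the NeticaNode R object */
    const char *name;
    mock_node *handle;
} mock_robj;

typedef struct mock_cache {  /* stand-in for the $nodes environment */
    mock_robj entries[MAX_NODES];
    size_t count;
} mock_cache;

/* Look an object up by name; returns NULL when it is not cached. */
mock_robj *cache_find(mock_cache *c, const char *name) {
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->entries[i].name, name) == 0)
            return &c->entries[i];
    return NULL;
}

/* Return the cached object for n, creating it if necessary.
   Returns NULL only when the (fixed size) mock cache is full. */
mock_robj *get_or_make(mock_cache *c, mock_node *n) {
    assert(c != NULL && n != NULL);
    mock_robj *r = cache_find(c, n->name);
    if (r == NULL) {
        if (c->count >= MAX_NODES)
            return NULL;
        r = &c->entries[c->count++];
        r->name = n->name;
    }
    /* New entry, or pointer mismatch: rebind (assumption, see above).
       When the pointers already agree this is a no-op. */
    r->handle = n;
    return r;
}
```

Calling get_or_make() twice with the same node returns the same
object, which is the property the name-based cache is meant to
provide.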

This means that nodes need to contain pointers to the enclosing
net, and nets have pointers to the enclosing session.  (Thus, if
necessary, one can get to node$Net$Session.  Case Streams and RNGs
also have pointers to the session object.)  The biggest issue is
probably with the NodeNet() function.  Here it needs
to use the link node$Net$Session to find the Netica-identified network
in the session, which could potentially be a problem if there is
pointer corruption.  The function NodeNet(), if given the option
internal=TRUE, should check the pointers against each other.

Similarly, the names of the objects are stored in both the R6 object
and internally to the Netica object.  Using the internal=TRUE option
to the NodeName or NetworkName function checks for this kind of
corruption.

It is still true that when Netica is unloaded (stopSession()) or R is
shut down all pointers will become Null and need to be reestablished. 

1a. Session

When the Netica shared object is launched it returns a pointer of type
environ_ns which is the link to the Netica session.  Prior to RNetica
version 0.5, this was stored in a global C variable.  Starting with
RNetica version 0.5, it is now stored in an object of type
NeticaSession.  The pointer is a field of this object.  In C code it
can be accessed with:

extern environ_ns* GetSessionPtr(SEXP sessobj);

Note that the Session is active precisely when this pointer is not
null.

The functions startSession() and stopSession() start and stop the
session.  Note that when a session is stopped, all network, node, case
stream and RNG objects are also deactivated.

The NeticaSession object now stores the LicenseKey.  If this is
present, it is used when starting the session.  This determines
whether the session is licensed (key is valid) or unlicensed (key
missing or not valid).

A collection of all networks which are currently open is
stored in the $nets field of the session object.  This automatically
happens when a new network is created through CreateNetwork() or
ReadNetworks().
The functions:
void RN_RegisterNetwork(SEXP sessobj, const char* netname,
                        SEXP netobj)
void RN_UnregisterNetwork(SEXP sessobj, const char* netname)
SEXP RN_FindNetworkStr(SEXP sessobj, const char* netname)
are used to handle this in the C code.  Note that removing a network
must be done by calling "rm" on the $nets field inside of R code.

The function NeticaSession() creates a new NeticaSession object.

Prior to RNetica version 0.5, the Netica session was created by the
function StartNetica(), which opened the session and stored the
pointer in the internal C object.  This was called when the package
was loaded, so the user did not need to worry about that.

This functionality has been replaced with the function
getDefaultSession(), which is the default for any function which
requires a session argument.
This function performs the following steps:
1) It looks for a binding of the variable DefaultNeticaSession in the
.GlobalEnv.
2) If that is not found, it prompts the user to create one (note this
will fail unless R is running in interactive mode).
3) If it is creating a session, it will look for a binding of the
variable NeticaLicenseKey in the .GlobalEnv.  If this is found, it
will be used as the license key.
4) If the DefaultNeticaSession is not open, it will call
startSession() on it.

Session objects are also responsible for error reporting.  In
particular, the methods $reportErrors() and $clearErrors() should be
applied to the session object.  For any function that interacts with
the Netica API, there is an argument which is a Session, Network,
Node, CaseStream or RNG.  Each of these has a $Session field (or in the
case of the node a $Net$Session field) which points back to the
session and can be used for error reporting.

When a session is closed (deactivated, stopped) it does the following
housekeeping:
1) All contained networks (in its cache) are deactivated, and they
recursively deactivate any cached nodes. 
2) Any open RNG or CaseStream objects stored in a weak reference array
are closed (deactivated).
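
The housekeeping order can be sketched as follows.  This is a mock in
plain C, not the RNetica implementation (where the iteration is done
in R code):  mock_session, mock_net, mock_node, mock_stream,
deactivate_net() and close_session() are hypothetical names, and fixed
size arrays stand in for the environments and the weak reference
array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 8

typedef struct mock_node   { void *handle; } mock_node;
typedef struct mock_stream { void *handle; } mock_stream; /* CaseStream or RNG */

typedef struct mock_net {
    void *handle;
    mock_node *nodes[MAX_ITEMS];     /* cached nodes only */
    size_t n_nodes;
} mock_net;

typedef struct mock_session {
    void *handle;                    /* stand-in for environ_ns* */
    mock_net *nets[MAX_ITEMS];       /* stand-in for $nets */
    size_t n_nets;
    mock_stream *streams[MAX_ITEMS]; /* stand-in for the weak refs */
    size_t n_streams;
} mock_session;

/* Deactivate a network and, recursively, its cached nodes. */
void deactivate_net(mock_net *net) {
    for (size_t i = 0; i < net->n_nodes; i++)
        net->nodes[i]->handle = NULL;
    net->handle = NULL;
}

/* Step 1: networks (and their nodes).  Step 2: streams and RNGs.
   Finally the session pointer itself is nulled. */
void close_session(mock_session *s) {
    assert(s != NULL);
    for (size_t i = 0; i < s->n_nets; i++)
        deactivate_net(s->nets[i]);
    for (size_t i = 0; i < s->n_streams; i++)
        s->streams[i]->handle = NULL;
    s->handle = NULL;
}
```

Nodes that were never cached have no R object, so there is nothing to
deactivate for them.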

1b.  Networks

In addition to the other fields, NeticaBN objects have a
"PathnameName" field (or "Filename" attribute in RNetica 0.4 and
lower), which stores the pathname of the file most recently used to
read/write the network.  This exists in the R object
and is maintained separately from the pathname stored in the Netica
object.  The purpose is to facilitate restoring the handle to the
network when R is restarted.   In particular, net <- ReadNetworks(net)
should reload the network and restore the pointer (at least that
instance of it).

The following macros are useful for manipulating the relationship
between NeticaBN and net_bn objects.
GetNetworkPtr(b)        Returns net_bn* for NeticaBN SEXP
GetNeticaHandle(b)      Returns net_bn* for NeticaBN SEXP (for
                        backward compatibility)
BN_NAME(b)              Returns cached name of NeticaBN object
                        (primarily used in reporting errors).

Networks also maintain a cache of nodes in the environment $nodes.
Note that nodes are not automatically added to this environment, they
are only added as they are referenced by user calls.  This primarily
affects networks which have been read in from a file.  The function
NetworkAllNodes() forces all nodes into the cache.

The functions:

void RN_RegisterNode(SEXP netobj, const char* nodename, SEXP nodeobj)
void RN_UnregisterNode(SEXP netobj, const char* nodename)
SEXP RN_FindNodeStr(SEXP netobj, const char* nodename)

maintain the node cache from the C side.

1c. Netica Nodes

The Netica API seems to make a big distinction between discrete and
continuous nodes:  I can't seem to find an API function which will
change the type of the node.  The "node_discrete" attribute is used to
cache information about the node type, so that we don't need to call
into C code to find that out.

The protocol for creation of NeticaNode objects is a little bit
different.  RNetica normally creates a NeticaNode object when the node
is referenced in a function call.  However, when a network is read
from a file, RNetica does not create NeticaNode objects until a link
to that node is created, say via a call to NetworkFindNode() or
NetworkAllNodes().  The function MakeNode_RRef() does the actual
work of creating the object and the function GetNode_RRef() creates
the NeticaNode object if necessary.


OBSOLETE (Version 0.4 and prior):  The functions RN_FreeNode() and
RN_FreeNodes() are used to clear the NeticaNode object associated with
a node_bn*.  The latter function is called by RN_DeleteNetworks() to
free the NeticaNodes associated with the network.

UPDATE:  The function RN_DeactivateBN is now called when closing the
network.  The iteration is done in the R code for the various objects.

The following macros are useful for manipulating the relationship
between NeticaNode and node_bn objects.
GetNodePtr(b)           Returns node_bn* for NeticaNode SEXP
GetNodeHandle(b)        Returns node_bn* for NeticaNode SEXP
NODE_NAME(b)            Returns cached name of NeticaNode object
                        (primarily used in reporting errors).


3. API references and License Keys

This is primarily still a TODO.  Right now the path to the Netica API
headers and .a file is hard coded into the configuration files.  There
is a command line switch in the autoconf, but I haven't tested it.

It would be kool if RNetica downloaded the latest Netica API at
compile time (if the switch was not set).  I will probably need to ask
Brent Boerlage to set up a special link so I can always get the most
recent API, without needing to update the package every time Norsys
releases a new version.

License keys are obtained from Norsys.  In Version 0.4 and prior they were
passed as an argument into StartNetica().  StartNetica() would be called
when RNetica was attached and it would check whether a special variable
"NeticaLicenseKey" was set in the global environment.

Currently, they are set as a field of the session object.  To simulate
the prior behavior, the function getDefaultSession() looks for a
symbol "DefaultNeticaSession" in the global environment.  If that does not
exist, it tries to create it using the value of "NeticaLicenseKey" if
that exists.

TODO:  Look at the R license mechanism to make sure everybody is clear
on the RNetica vs Netica license as we are on the proprietary/open
source line.

4. Error Handling

The most obvious way to handle errors seems to be to simply report all
the errors and then clear them.  Version 0.4 and below used the
functions ReportErrors() and ClearAllErrors() to handle the errors.
These are now methods of the NeticaSession class (the functions still
exist and call the method on the DefaultNeticaSession).  The
reporting function clears the errors, so there is no need to do
anything else.

In the current design, the error handling does not take place in the
main .Call(), but rather the calling R function calls ReportErrors()
and tests its return value to see if there were serious errors or
not.  More sophisticated error mechanisms may be possible, but I
haven't seen the need.

UPDATE:  In version 0.5 and
beyond, all functions which interface with the Netica API take an
argument which is a session, a network or a node (with a couple of
exceptions which take CaseStream or NeticaRNG arguments).  In these
cases the $Session field is used for nets, streams and rngs and
$Net$Session for nodes.
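
The report-then-clear pattern can be sketched in plain C with a mock
error queue.  Everything here is hypothetical (mock_session,
mock_error, the SEV_* levels and report_errors()); in particular, the
notes do not document what ReportErrors() returns, so the sketch
assumes it returns the worst severity seen, which the caller can then
test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical severity levels, ordered from least to most serious. */
typedef enum { SEV_NOTHING = 0, SEV_WARNING = 1, SEV_ERROR = 2 } mock_severity;

typedef struct mock_error {
    mock_severity severity;
    const char *message;
} mock_error;

typedef struct mock_session {   /* stand-in for the session's error queue */
    mock_error errors[8];
    size_t n_errors;
} mock_session;

/* Print every pending error, clear the queue, and return the
   worst severity that was seen (SEV_NOTHING if the queue was empty). */
mock_severity report_errors(mock_session *s) {
    assert(s != NULL);
    mock_severity worst = SEV_NOTHING;
    for (size_t i = 0; i < s->n_errors; i++) {
        fprintf(stderr, "mock Netica message (severity %d): %s\n",
                (int) s->errors[i].severity, s->errors[i].message);
        if (s->errors[i].severity > worst)
            worst = s->errors[i].severity;
    }
    s->n_errors = 0;            /* reporting also clears */
    return worst;
}
```

A caller would run the API call first and then stop if
report_errors() returns SEV_ERROR; because reporting also clears the
queue, a second call right afterwards returns SEV_NOTHING.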

5. Node Lists and Node Sets

Node lists are collections of nodes used for various purposes.  There
is no need for a special object on the R side, as existing lists work
perfectly.  The one catch is that all nodes in a node list must be
from the same network.  Violating this criterion will likely generate
an error deep inside the RNetica C code.

The following C functions may be useful:

extern SEXP RN_AS_RLIST(const nodelist_bn* nodelist, SEXP bn);
       Converts a nodelist_bn* to an R vector
extern nodelist_bn* RN_AS_NODELIST(SEXP nodes, net_bn* net);
       Converts an R vector to a nodelist_bn*.  It is an error if not
       all nodes are associated with net.

UPDATE Version 0.5: the SEXP bn argument was added to RN_AS_RLIST.
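
The same-network rule amounts to a check like the one below.  This is
only a sketch with mock types (mock_net, mock_node and
all_nodes_in_net() are hypothetical names); it is not the code inside
RN_AS_NODELIST, which works on SEXPs and Netica pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mock_net  { const char *name; } mock_net;
typedef struct mock_node {
    const char *name;
    const mock_net *net;      /* owning network */
} mock_node;

/* Returns 1 if every node in the list belongs to net, 0 otherwise.
   An empty list passes trivially; a NULL entry fails. */
int all_nodes_in_net(const mock_node *const *nodes, size_t n,
                     const mock_net *net) {
    assert(net != NULL);
    for (size_t i = 0; i < n; i++)
        if (nodes[i] == NULL || nodes[i]->net != net)
            return 0;
    return 1;
}
```

Doing a check of this kind before building the list is what lets the
error be reported with the node and network names rather than
surfacing later from inside Netica.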


6. Registration

The registration of C functions is done in the file Registration.c.

The functions RN_Define_Symbols() and RN_FreeSymbols() are used to
define a certain number of R constants that can be used over and over
again (for example, the attributes used for storing handles, and the
class names for NeticaBN and NeticaNode classes).  The objects are
preserved using R_PreserveObject() and R_ReleaseObject(), so don't
need to be protected.  These functions are called by StartNetica() and
StopNetica() respectively.

With both C code and R6 objects, the timing of what happens when the
namespace is loaded and attached is a bit tricky.
In particular, the C code for creating external pointers of the proper
type is not loaded when the prototype for the R6 classes is loaded.
On the other hand, the function externalptr() does not seem to
reliably produce a null pointer, so C code is needed in the
$initialize() function to get around that.  The workaround is to set
a variable CCodeLoaded which switches between using externalptr()
(CCodeLoaded=FALSE) and using the C code (CCodeLoaded=TRUE).

The following is what happens during the load cycle:

onLoad:  Set CCodeLoaded to FALSE to avoid race conditions when
         generating prototype objects.
onAttach: Call RN_Define_Symbols to make sure reused symbols are defined.
         Set CCodeLoaded to TRUE (use C code to avoid null
         pointers).
         Set EV_STATE from the results of RN_GetEveryState to get the
         Netica value for this constant.
onDetach:  Currently nothing
onUnload:  Unload the dynamic library.


Further questions:  Email telling me how clever I am can be sent to
almond@acm.org.  Questions can be sent to the same place, but no
guarantees on the response time.  Complaints can be sent to /dev/null.