Jaccwabyt 🐇
Jaccwabyt: JavaScript ⇄ C Struct Communication via WASM Byte Arrays
Welcome to Jaccwabyt, a JavaScript API which creates bindings for WASM-compiled C structs, defining them in such a way that changes to their state in JS are visible in C/WASM, and vice versa, permitting two-way interchange of struct state with very little user-side friction.
(If that means nothing to you, neither will the rest of this page!)
Browser compatibility: this library requires a recent browser and makes no attempt whatsoever to accommodate "older" or lesser-capable ones, where "recent," very roughly, means released in mid-2018 or later, with late 2021 releases required for some optional features in some browsers (e.g. [BigInt64Array][] in Safari). It also relies on a couple non-standard, but widespread, features, namely [TextEncoder][] and [TextDecoder][]. It is developed primarily on Firefox and Chrome on Linux and all claims of Safari compatibility are based solely on feature compatibility tables provided at [MDN][].
Formalities:
- Author: Stephan Beal
- Project Homes:
  - https://fossil.wanderinghorse.net/r/jaccwabyt
    is the primary home but...
  - https://sqlite.org/src/dir/ext/wasm/jaccwabyt
    ... most development happens here.
The license for both this documentation and the software it documents is the same as sqlite3, the project from which this spinoff project was spawned:
2022-06-30:
The author disclaims copyright to this source code. In place of a legal notice, here is a blessing:
May you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give.
Table of Contents
- Overview
- Creating and Binding Structs
- APIs
- Appendices
Overview
Management summary: this JavaScript-only framework provides limited two-way bindings between C structs and JavaScript objects, such that changes to the struct in one environment are visible in the other.
Details...
It works by creating JavaScript proxies for C structs. Reads and writes of the JS-side members are marshaled through a flat byte array allocated from the WASM heap. As that heap is shared with the C-side code, and the memory block is written using the same approach C does, that byte array can be used to access and manipulate a given struct instance from both JS and C.
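The idea can be illustrated with a minimal, self-contained sketch. It is not Jaccwabyt's actual implementation: the bindInt32() helper, the hard-coded address, and the standalone WebAssembly.Memory are assumptions made purely for demonstration. It shows a JS property whose getter and setter decode and encode bytes in the heap rather than storing a normal JS value:

```javascript
// Illustrative only: NOT Jaccwabyt's actual implementation.
// A stand-in for the WASM heap; a real app gets this from its WASM module.
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

// Hypothetical helper: exposes the 32-bit int at (ptr + offset) as a property.
function bindInt32(target, name, ptr, offset) {
  Object.defineProperty(target, name, {
    enumerable: true,
    // A fresh DataView per access stays valid even if the heap has grown.
    // WASM linear memory is little-endian, hence the `true` arguments.
    get: () => new DataView(memory.buffer).getInt32(ptr + offset, true),
    set: (v) => new DataView(memory.buffer).setInt32(ptr + offset, v, true),
  });
  return target;
}

const ptr = 16; // pretend that an allocator returned this address
const proxy = bindInt32({}, "member1", ptr, 0);
proxy.member1 = 12; // writes 4 bytes into the heap, not into a JS field

// C code reading an int at address 16 would see these same bytes:
const raw = new Uint8Array(memory.buffer, ptr, 4);
console.log(proxy.member1, Array.from(raw));
```

Because both sides read and write the same bytes, neither side needs to be notified of changes made by the other.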
Motivating use case: this API was initially developed as an experiment to determine whether it would be feasible to implement, completely in JS, custom "VFS" and "virtual table" objects for the WASM build of sqlite3. Doing so was going to require some form of two-way binding of several structs. Once the proof of concept was demonstrated, a rabbit hole appeared and down we went... It has since grown beyond its humble proof-of-concept origins and is believed to be a useful (or at least interesting) tool for mixed JS/C applications.
Portability notes:
- These docs sometimes use Emscripten as a point of reference because it is the most widespread WASM toolchain, but this code is specifically designed to be usable in arbitrary WASM environments. It abstracts away a few Emscripten-specific features into configurable options. Similarly, the build tree requires Emscripten but Jaccwabyt does not have any hard Emscripten dependencies.
- This code is encapsulated into a single JavaScript function. It should be trivial to copy/paste into arbitrary WASM/JS-using projects.
- The source tree includes C code, but only for testing and demonstration purposes. It is not part of the core distributable.
Architecture
BSBF: box rad 0.3*boxht "StructBinderFactory" fit fill lightblue
BSB: box same "StructBinder" fit at 0.75 e of 0.7 s of BSBF.c
BST: box same "StructType<T>" fit at 1.5 e of BSBF
BSC: box same "Struct<T>" "Ctor" fit at 1.5 s of BST
BSI: box same "Struct<T>" "Instances" fit at 1 right of BSB.e
BC: box same at 0.25 right of 1.6 e of BST "C Structs" fit fill lightgrey
arrow -> from BSBF.s to BSB.w "Generates" aligned above
arrow -> from BSB.n to BST.sw "Contains" aligned above
arrow -> from BSB.s to BSC.nw "Generates" aligned below
arrow -> from BSC.ne to BSI.s "Constructs" aligned below
arrow <- from BST.se to BSI.n "Inherits" aligned above
arrow <-> from BSI.e to BC.s dotted "Shared" aligned above "Memory" aligned below
arrow -> from BST.e to BC.w dotted "Mirrors Struct" aligned above "Model From" aligned below
arrow -> from BST.s to BSC.n "Prototype of" aligned above
Its major classes and functions are:
- StructBinderFactory is a factory function which accepts a configuration object to customize it for a given WASM environment. A client will typically call this only one time, with an appropriate configuration, to generate a single...
- StructBinder is a factory function which converts an arbitrary number of struct descriptions into...
- StructTypes are constructors, one per struct description, which inherit from StructBinder.StructType and are used to instantiate...
- Struct instances are objects representing individual instances of generated struct types.
An app may have any number of StructBinders, but will typically need only one. Each StructBinder is effectively a separate namespace for struct creation.
Creating and Binding Structs
From the amount of documentation provided, it may seem that creating and using struct bindings is a daunting task, but it essentially boils down to:
1. Configure Jaccwabyt for your WASM environment. This is a one-time task per project and results in a factory function which can create new struct bindings.
2. Create a JSON-format description of your C structs. This is required once for each struct and requires updating if the C structs change.
3. Feed (2) to the function generated by (1) to create JS constructor functions for each struct. This is done at runtime, as opposed to during a build-process step, and can be set up in such a way that it does not require any maintenance after its initial setup.
4. Create and use instances of those structs.
Detailed instructions for each of those steps follow...
Step 1: Configure Jaccwabyt for the Environment
Jaccwabyt's highest-level API is a single function. It creates a factory for processing struct descriptions, but does not process any descriptions itself. This level of abstraction exists primarily so that the struct-specific factories can be configured for a given WASM environment. Its usage looks like:
const MyBinder = StructBinderFactory({
  // These config options are all required:
  heap: WebAssembly.Memory instance or a function which returns
        a Uint8Array or Int8Array view of the WASM memory,
  alloc: function(howMuchMemory){...},
  dealloc: function(pointerToFree){...}
});
It also offers a number of other settings, but all are optional except for the ones shown above. Those three config options abstract away details which are specific to a given WASM environment. They provide the WASM "heap" memory (a byte array), the memory allocator, and the deallocator. In a conventional Emscripten setup, that config might simply look like:
{
  heap: Module['asm']['memory'],
  //Or:
  // heap: ()=>Module['HEAP8'],
  alloc: (n)=>Module['_malloc'](n),
  dealloc: (m)=>Module['_free'](m)
}
The StructBinder factory function returns a function which can then be used to create bindings for our structs.
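For environments without Emscripten, the same three options can be supplied by hand. The following sketch is only an illustration under stated assumptions: the standalone WebAssembly.Memory and the toy bump allocator are hypothetical stand-ins for whatever memory and allocator a real WASM module exports.

```javascript
// Hypothetical stand-ins: a real app would use its WASM module's exported
// memory and allocator instead of this toy bump allocator.
const wasmMemory = new WebAssembly.Memory({ initial: 1 });
let nextFree = 8; // skip address 0 so that 0 can signal allocation failure

const binderConfig = {
  heap: wasmMemory,
  alloc: (n) => {
    const aligned = (n + 7) & ~7; // round up to a multiple of 8 bytes
    if (nextFree + aligned > wasmMemory.buffer.byteLength) return 0;
    const p = nextFree;
    nextFree += aligned;
    return p;
  },
  // A bump allocator never releases memory, so this accepts anything,
  // including 0, and does nothing. It must never throw.
  dealloc: (p) => {},
};

// With Jaccwabyt loaded, this object would be handed to the factory:
//   const MyBinder = StructBinderFactory(binderConfig);
console.log(binderConfig.alloc(16), binderConfig.alloc(4));
```

A bump allocator is only suitable for demos and short-lived pages; the point here is the shape of the three callbacks, not the allocation strategy.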
Step 2: Create a Struct Description
The primary input for this framework is a JSON-compatible construct which describes a struct we want to bind. For example, given this C struct:
// C-side:
struct Foo {
  int member1;
  void * member2;
  int64_t member3;
};
Its JSON description looks like:
{
  "name": "Foo",
  "sizeof": 16,
  "members": {
    "member1": {"offset": 0,"sizeof": 4,"signature": "i"},
    "member2": {"offset": 4,"sizeof": 4,"signature": "p"},
    "member3": {"offset": 8,"sizeof": 8,"signature": "j"}
  }
}
These data must match up with the C-side definition of the struct (if any). See Appendix G for one way to easily generate these from C code.
Each entry in the members object maps the member's name to its low-level layout:
- offset: the byte offset from the start of the struct, as reported by C's offsetof() feature.
- sizeof: as reported by C's sizeof().
- signature: described below.
- readOnly: optional. If set to true, the binding layer will throw if JS code tries to set that property.
The order of the members entries is not important: their memory layout is determined by their offset and sizeof members. The name property is technically optional, but one of the steps in the binding process requires that either it be passed an explicit name or there be one in the struct description. The names of the members entries need not match their C counterparts. Project conventions may call for giving them different names in the JS side and the StructBinderFactory can be configured to automatically add a prefix and/or suffix to their names.
Nested structs are as-yet unsupported by this tool.
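Since a description which disagrees with the C-side layout leads to undefined results, it may be worth sanity-checking descriptions before binding them. The sketch below is a hypothetical helper, checkStructDescription(), which is not part of Jaccwabyt. It only verifies that members stay within the struct's sizeof and do not overlap each other; it cannot prove that the numbers match the C compiler's view.

```javascript
// Hypothetical helper, not part of Jaccwabyt: sanity-checks a description.
// Returns an array of problem descriptions, empty if none were found.
function checkStructDescription(desc) {
  const problems = [];
  const spans = Object.entries(desc.members)
    .map(([name, m]) => ({ name, start: m.offset, end: m.offset + m.sizeof }))
    .sort((a, b) => a.start - b.start);
  spans.forEach((s, i) => {
    if (s.start < 0 || s.end > desc.sizeof) {
      problems.push(`${s.name} lies outside the struct`);
    }
    if (i > 0 && s.start < spans[i - 1].end) {
      problems.push(`${s.name} overlaps ${spans[i - 1].name}`);
    }
  });
  return problems;
}

// The description of struct Foo from above:
const fooDescription = {
  name: "Foo",
  sizeof: 16,
  members: {
    member1: { offset: 0, sizeof: 4, signature: "i" },
    member2: { offset: 4, sizeof: 4, signature: "p" },
    member3: { offset: 8, sizeof: 8, signature: "j" },
  },
};
console.log(checkStructDescription(fooDescription)); // no problems reported
```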
Struct member "signatures" describe the data types of the members and are an extended variant of the format used by Emscripten's addFunction(). A signature for a non-function-pointer member, or function pointer member which is to be modelled as an opaque pointer, is a single letter. A signature for a function pointer may also be modelled as a series of letters describing the call signature. The supported letters are:
- v = void (only used as return type for function pointer members)
- i = int32 (4 bytes)
- j = int64 (8 bytes) is only really usable if this code is built with BigInt support (e.g. using the Emscripten -sWASM_BIGINT build flag). Without that, this API may throw when encountering the j signature entry.
- f = float (4 bytes)
- d = double (8 bytes)
- c = int8 (1 byte) char - see notes below!
- C = uint8 (1 byte) unsigned char - see notes below!
- p = int32 (see notes below!)
- P = Like p but with extra handling. Described below.
- s = like int32 but is a hint that it's a pointer to a string so that some (very limited) contexts may treat it as such, noting that such algorithms must, for lack of information to the contrary, assume both that the encoding is UTF-8 and that the pointer's member is NUL-terminated. If that is not the case for a given string member, do not use s: use i or p instead and do any string handling yourself.
Noting that:
- All of these types are numeric. Attempting to set any struct-bound property to a non-numeric value will trigger an exception except in cases explicitly noted otherwise.
- "Char" types: WASM does not define an int8 type, nor does it distinguish between signed and unsigned. This API treats c as int8 and C as uint8 for purposes of getting and setting values when using the DataView class. It is not recommended that client code use these types in new WASM-capable code, but they were added for the sake of binding some immutable legacy code to WASM.
Sidebar: Emscripten's public docs do not mention p, but their generated code includes p as an alias for i, presumably to mean "pointer". Though i is legal for pointer types in the signature, p is more descriptive, so this framework encourages the use of p for pointer-type members. Using p for pointers also helps future-proof the signatures against the eventuality that WASM supports 64-bit pointers. Note that sometimes p really means pointer-to-pointer, but the Emscripten JS/WASM glue does not offer that level of expressiveness in these signatures. We simply have to be aware of when we need to deal with pointers and pointers-to-pointers in JS code.
Trivia: this API treats p as distinctly different from i in some contexts, so its use is encouraged for pointer types.
Signatures in the form x(...) denote function-pointer members and x denotes non-function members. Functions with no arguments use the form x(). For function-type signatures, the strings are formulated such that they can be passed to Emscripten's addFunction() after stripping out the ( and ) characters. For good measure, to match the public Emscripten docs, p, c, and C should also be replaced with i. In JavaScript that might look like:
signature.replace(/[^vipPsjfdcC]/g,'').replace(/[pPscC]/g,'i');
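To see what that expression produces, it can be wrapped in a function and tried against a few signatures. The function name toEmscriptenSignature() and the sample signatures are made up for this illustration; only the two replace() calls come from the line above.

```javascript
// Hypothetical wrapper around the expression shown above.
function toEmscriptenSignature(signature) {
  return signature
    .replace(/[^vipPsjfdcC]/g, "") // drop "(", ")" and any other characters
    .replace(/[pPscC]/g, "i");     // pointers, strings, and chars become "i"
}

console.log(toEmscriptenSignature("v(pi)"));  // vii
console.log(toEmscriptenSignature("i()"));    // i
console.log(toEmscriptenSignature("p(sjd)")); // iijd
```

The first letter of the result is the return type and the remaining letters are the argument types, which is the layout addFunction() expects.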
P vs p in Method Signatures
This support is experimental and subject to change.
The method signature letter p means "pointer," which, in WASM, means "integer." p is treated as an integer for most contexts, while still also being a separate type (analogous to how pointers in C are just a special use of unsigned numbers). A capital P changes the semantics of plain member pointers (but not, as of this writing, function pointer members) as follows:
- When a P-type member is set via myStruct.x=y, if (y instanceof StructType) then the value of y.pointer is stored in myStruct.x. If y is neither a number nor a StructType, an exception is triggered (regardless of whether p or P is used).
Step 3: Binding the Struct
We can now use the results of steps 1 and 2:
const MyStruct = MyBinder(myStructDescription);
That creates a new constructor function, MyStruct, which can be used to instantiate new instances. The binder will throw if it encounters any problems.
That's all there is to it.
Sidebar: that function may modify the struct description object and/or its sub-objects, or may even replace sub-objects, in order to simplify certain later operations. If that is not desired, then feed it a copy of the original, e.g. by passing it JSON.parse(JSON.stringify(structDefinition)).
Step 4: Creating, Using, and Destroying Struct Instances
Now that we have our constructor...
const my = new MyStruct();
It is important to understand that creating a new instance allocates memory on the WASM heap. We must not simply rely on garbage collection to clean up the instances because doing so will not free up the WASM heap memory. The correct way to free up that memory is to use the object's dispose() method.
The following usage pattern offers one way to easily ensure proper cleanup of struct instances:
const my = new MyStruct();
try {
  console.log(my.member1, my.member2, my.member3);
  my.member1 = 12;
  assert(12 === my.member1);
  /* ^^^ it may seem silly to test that, but recall that assigning that
     property encodes the value into a byte array in heap memory, not
     a normal JS property. Similarly, fetching the property decodes it
     from the byte array. */
  // Pass the struct to C code which takes a MyStruct pointer:
  aCFunction( my.pointer );
} finally {
  my.dispose();
}
Sidebar: the finally block will be run no matter how the try exits, whether it runs to completion, propagates an exception, or uses flow-control keywords like return or break. It is perfectly legal to use try/finally without a catch, and doing so is an ideal match for the memory management requirements of Jaccwabyt-bound struct instances.
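When that pattern is needed in many places, it can be captured once in a small helper. The sketch below is an assumption-laden illustration: withStruct() is a hypothetical function, not part of Jaccwabyt, and FakeStruct is a stand-in for a bound struct constructor so that the example runs without a WASM module.

```javascript
// Hypothetical helper: passes a fresh instance to the callback and always
// disposes of it afterwards, even if the callback throws.
function withStruct(StructCtor, callback) {
  const instance = new StructCtor();
  try {
    return callback(instance);
  } finally {
    instance.dispose();
  }
}

// Stand-in for a bound struct constructor so this sketch runs on its own.
class FakeStruct {
  constructor() { this.pointer = 1024; FakeStruct.live++; }
  dispose() { this.pointer = undefined; FakeStruct.live--; }
}
FakeStruct.live = 0; // number of instances not yet disposed

const result = withStruct(FakeStruct, (s) => s.pointer + 1);
console.log(result, FakeStruct.live); // 1025 0
```

The callback must not keep a reference to the instance beyond its own return, since the instance's memory is released as soon as the helper returns.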
It is often useful to wrap an existing instance of a C-side struct without taking over ownership of its memory. That can be achieved by simply passing a pointer to the constructor. For example:
const m = new MyStruct( functionReturningASharedPtr() );
// calling m.dispose() will _not_ free the wrapped C-side instance
// but will trigger any ondispose handler.
Now that we have struct instances, there are a number of things we can do with them, as covered in the rest of this document.
API Reference
API: Binder Factory
This is the top-most function of the API, from which all other functions and types are generated. The binder factory's signature is:
Function StructBinderFactory(object configOptions);
It returns a function which these docs refer to as a StructBinder (covered in the next section). It throws on error.
The binder factory supports the following options in its configuration object argument:
- heap
  Must be either a WebAssembly.Memory instance representing the WASM heap memory OR a function which returns an Int8Array or Uint8Array view of the WASM heap. In the latter case the function should, if appropriate for the environment, account for the heap being able to grow. Jaccwabyt uses this property in such a way that it "should" be okay for the WASM heap to grow at runtime (that case is, however, untested).
- alloc
  Must be a function semantically compatible with Emscripten's Module._malloc(). That is, it is passed the number of bytes to allocate and it returns a pointer. On allocation failure it may either return 0 or throw an exception. This API will throw an exception if allocation fails or will propagate whatever exception the allocator throws. The allocator must use the same heap as the heap config option.
- dealloc
  Must be a function semantically compatible with Emscripten's Module._free(). That is, it takes a pointer returned from alloc() and releases that memory. It must never throw and must accept a value of 0/null to mean "do nothing" (noting that 0 is technically a legal memory address in WASM, but that seems like a design flaw).
- bigIntEnabled (bool=true if BigInt64Array is available, else false)
  If true, the WASM bits this code is used with must have been compiled with int64 support (e.g. using Emscripten's -sWASM_BIGINT flag). If that's not the case, this flag should be set to false. If it's enabled, BigInt support is assumed to work and certain extra features are enabled. Trying to use features which require BigInt when it is disabled (e.g. using 64-bit integer types) will trigger an exception.
- memberPrefix and memberSuffix (string="")
  If set, struct-defined properties get bound to JS with this string as a prefix resp. suffix. This can be used to avoid symbol name collisions between the struct-side members and the JS-side ones and/or to make more explicit which object-level properties belong to the struct mapping and which to the JS side. This does not modify the values in the struct description objects, just the property names through which they are accessed via property access operations and the various StructInstance APIs (noting that the latter tend to permit both the original names and the names as modified by these settings).
- log
  Optional function used for debugging output. By default console.log is used but by default no debug output is generated. This API assumes that the function will space-separate each argument (like console.log does). See Appendix D for info about enabling debugging output.
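For the function form of the heap option, "accounting for growth" matters because growing a WebAssembly.Memory detaches its previous ArrayBuffer, which leaves any existing typed-array views empty. One possible approach, sketched here with a standalone memory object as an assumption, is to cache the view and rebuild it whenever the underlying buffer has changed:

```javascript
// Stand-in for the memory exported by a WASM module.
const growableMemory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
let cachedView = new Uint8Array(growableMemory.buffer);

// Candidate value for the `heap` option: always returns a current view.
const heap = () => {
  if (cachedView.buffer !== growableMemory.buffer) {
    // The memory grew, so the old buffer is detached. Rebuild the view.
    cachedView = new Uint8Array(growableMemory.buffer);
  }
  return cachedView;
};

const sizeBefore = heap().length; // one 64 KiB page
growableMemory.grow(1);           // add one page; detaches the old buffer
const sizeAfter = heap().length;  // two pages
console.log(sizeBefore, sizeAfter);
```

Passing the WebAssembly.Memory instance itself as the heap option avoids the need for such a function.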
API: Struct Binder
Struct Binders are factories which are created by the StructBinderFactory. A given Struct Binder can process any number of distinct structs. In a typical setup, an app will have only one shared Binder Factory and one Struct Binder. Struct Binders which are created via different StructBinderFactory calls are unrelated to each other, sharing no state except, perhaps, indirectly via StructBinderFactory configuration (e.g. the memory heap).
These factories have two call signatures:
Function StructBinder([string structName,] object structDescription)
If the struct description argument has a name property then the name argument is optional, otherwise it is required.
The returned object is a constructor for instances of the struct described by its argument(s), each of which derives from a separate StructType instance.
The Struct Binder has the following members:
- allocCString(str)
  Allocates a new UTF-8-encoded, NUL-terminated copy of the given JS string and returns its address relative to config.heap(). If allocation returns 0 this function throws. Ownership of the memory is transferred to the caller, who must eventually pass it to the configured config.dealloc() function.
- config
  The configuration object passed to the StructBinderFactory, primarily for accessing the memory (de)allocator and memory. Modifying any of its "significant" configuration values may have undefined results.
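To make the allocCString() contract concrete, the following sketch re-creates its documented behavior outside of the library. Everything in it is an assumption for illustration: the toy heap and allocator, and the function names allocCStringSketch() and cStringToJs(). It is not the library's own code.

```javascript
// Toy heap and allocator so that the sketch is self-contained.
const mem = new WebAssembly.Memory({ initial: 1 });
let top = 8; // next free address; 0 is reserved to mean "failure"
const alloc = (n) => { const p = top; top += n; return p; };

// Hypothetical re-implementation for illustration, not the library's code.
function allocCStringSketch(str) {
  const bytes = new TextEncoder().encode(str); // UTF-8, without terminator
  const ptr = alloc(bytes.length + 1);         // +1 for the NUL byte
  if (!ptr) throw new Error("Allocation failed");
  const heap = new Uint8Array(mem.buffer);
  heap.set(bytes, ptr);
  heap[ptr + bytes.length] = 0; // NUL terminator
  return ptr;
}

// Reads the string back by scanning for the terminator, as C would.
function cStringToJs(ptr) {
  const heap = new Uint8Array(mem.buffer);
  let end = ptr;
  while (heap[end] !== 0) ++end;
  return new TextDecoder().decode(heap.subarray(ptr, end));
}

const strPtr = allocCStringSketch("héllo"); // "é" occupies 2 bytes in UTF-8
console.log(strPtr, cStringToJs(strPtr));
```

Note that the allocation size is based on the encoded byte length, not on the JS string's length property, which counts UTF-16 code units.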
API: Struct Type
The StructType class is a property of the StructBinder function.
Each constructor created by a StructBinder inherits from its own instance of the StructType class, which contains state specific to that struct type (e.g. the struct name and description metadata). StructTypes which are created via different StructBinder instances are unrelated to each other, sharing no state except StructBinderFactory config options.
The StructType constructor cannot be called from client code. It is only called by the StructBinder-generated constructors. The StructBinder.StructType object has the following "static" properties (which are accessible from individual instances via theInstance.constructor):
- addOnDispose(...value)
  If this object has no ondispose property, this function creates it as an array and pushes the given value(s) onto it. If the object has a function-typed ondispose property, this call replaces it with an array and moves that function into the array. In all other cases, ondispose is assumed to be an array and the argument(s) is/are appended to it. Returns this.
- allocCString(str)
  Identical to the StructBinder method of the same name.
- hasExternalPointer(object)
  Returns true if the given object's pointer member refers to an "external" object. That is the case when a pointer is passed to a struct's constructor. If true, the memory is owned by someone other than the object and must outlive the object.
- isA(value)
  Returns true if its argument is a StructType instance from the same StructBinder as this StructType.
- memberKey(string)
  Returns the given string wrapped in the configured memberPrefix and memberSuffix values. e.g. if passed "x" and memberPrefix is "$" then it returns "$x". This does not verify that the property is actually a struct member, it simply transforms the given string. TODO(?): add a 2nd parameter indicating whether it should validate that it's a known member name.
The base StructType prototype has the following members, all of which are inherited by struct instances and may only legally be called on concrete struct instances unless noted otherwise:
- dispose()
  Frees, if appropriate, the WASM-allocated memory which is allocated by the constructor. If this is not called before the JS engine cleans up the object, a leak in the WASM heap memory pool will result.
  When dispose() is called, if the object has a property named ondispose then it is treated as follows:
  - If it is a function, it is called with the struct object as its this. That method must not throw; if it does, the exception will be ignored.
  - If it is an array, it may contain functions, pointers, other StructType instances, and/or JS strings. If an entry is a function, it is called as described above. If it's a number, it's assumed to be a pointer and is passed to the dealloc() function configured for the parent StructBinder. If it's a StructType instance then its dispose() method is called. If it's a JS string, it's assumed to be a helpful description of the next entry in the list and is simply ignored. Strings are supported primarily for use as debugging information.
  - Some struct APIs will manipulate the ondispose member, creating it as an array or converting it from a function to array as needed.
- lookupMember(memberName,throwIfNotFound=true)
  Given the name of a mapped struct member, it returns the member description object. If not found, it either throws (if the 2nd argument is true) or returns undefined (if the second argument is false). The first argument may be either the member name as it is mapped in the struct description or that same name with the configured memberPrefix and memberSuffix applied, noting that the lookup in the former case is faster.
  This method may be called directly on the prototype, without a struct instance.
- memberToJsString(memberName)
  Uses this.lookupMember(memberName,true) to look up the given member. If its signature is s then it is assumed to refer to a NUL-terminated, UTF-8-encoded string and its memory is decoded as such. If its signature is anything else, an exception is thrown. If its address is 0, null is returned. See also: setMemberCString().
- memberIsString(memberName [,throwIfNotFound=true])
  Uses this.lookupMember(memberName,throwIfNotFound) to look up the given member. Returns the member description object if the member has a signature of s, else returns false. If the given member is not found, it throws if the 2nd argument is true, else it returns false.
- memberKey(string)
  Works identically to StructBinder.StructType.memberKey().
- memberKeys()
  Returns an array of the names of the properties of this object which refer to C-side struct counterparts.
- memberSignature(memberName [,emscriptenFormat=false])
  Returns the signature for a given member property, either in this framework's format or, if passed a truthy 2nd argument, in a format suitable for the 2nd argument to Emscripten's addFunction(). Throws if the first argument does not resolve to a struct-bound member name. The member name is resolved using this.lookupMember(), which throws if the member is not found.
- memoryDump()
  Returns a Uint8Array which contains the current state of this object's raw memory buffer. Potentially useful for debugging, but not much else. Note that the memory is necessarily, for compatibility with C, written in the host platform's endianness and is thus not useful as a persistent/portable serialization format.
- setMemberCString(memberName,str)
  Uses StructType.allocCString() to allocate a new C-style string, assign it to the given member, and add the new string to this object's ondispose list for cleanup when this.dispose() is called. This function throws if lookupMember() fails for the given member name, if allocation of the string fails, or if the member has a signature value of anything other than s. Returns this.
  Achtung: calling this repeatedly will not immediately free the previous values because this code cannot know whether they are in use in other places, namely C. Instead, each time this is called, the prior value is retained in the ondispose list for cleanup when the struct is disposed of. Because of the complexities and general uncertainties of memory ownership and lifetime in such constellations, it is recommended that the use of C-string members from JS be kept to a minimum or that the relationship be one-way: let C manage the strings and only fetch them from JS using, e.g., memberToJsString().
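The ondispose rules described for dispose() above can be summarized in code. The following is a sketch of those documented rules only, not the library's implementation: runOnDispose() is a hypothetical name, plain objects stand in for struct instances, and nested StructType instances are recognized here simply by having a dispose() method.

```javascript
// Illustrative sketch of the documented ondispose rules.
function runOnDispose(structObj, dealloc) {
  const list = typeof structObj.ondispose === "function"
    ? [structObj.ondispose]          // a lone function is treated as a list
    : (structObj.ondispose || []);
  for (const entry of list) {
    try {
      if (typeof entry === "function") entry.call(structObj);
      else if (typeof entry === "number") dealloc(entry); // a pointer
      else if (entry && typeof entry.dispose === "function") entry.dispose();
      // Strings are descriptive labels only and are skipped.
    } catch (e) {
      // Exceptions from handlers are ignored, as documented for functions.
    }
  }
}

const freed = [];
const events = [];
const fake = {
  ondispose: [
    "name buffer", 4096, // a label followed by the pointer it describes
    function () { events.push(this === fake); },
    { dispose: () => events.push("nested disposed") },
  ],
};
runOnDispose(fake, (ptr) => freed.push(ptr));
console.log(freed, events);
```

This is also why setMemberCString() can defer cleanup safely: each allocated string pointer simply becomes another numeric entry in the list.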
API: Struct Constructors
Struct constructors (the functions returned from StructBinder) are used for, intuitively enough, creating new instances of a given struct type:
const x = new MyStruct;
Normally they should be passed no arguments, but they optionally accept a single argument: a WASM heap pointer address of memory which the object will use for storage. It does not take over ownership of that memory and that memory must be valid for at least as long as this struct instance. This is used, for example, to proxy static/shared C-side instances:
const x = new MyStruct( someCFuncWhichReturnsAMyStructPointer() );
...
x.dispose(); // does NOT free the memory
The JS-side construct does not own the memory in that case and has no way of knowing when the C-side struct is destroyed. Results are specifically undefined if the JS-side struct is used after the C-side struct's memory is freed.
Potential TODO: add a way of passing ownership of the C-side struct to the JS-side object. e.g. maybe simply pass true as the second argument to tell the constructor to take over ownership. Currently the pointer can be taken over using something like myStruct.ondispose=[myStruct.pointer] immediately after creation.
These constructors have the following "static" members:
- isA(value)
  Returns true if its argument was created by this constructor.
- memberKey(string)
  Works exactly as documented for StructType.
- memberKeys(string)
  Works exactly as documented for StructType.
- structInfo
  The structure description passed to StructBinder when this constructor was generated.
- structName
  The structure name passed to StructBinder when this constructor was generated.
API: Struct Prototypes
The prototypes of structs created via the constructors described in the previous section are each a struct-type-specific instance of StructType and add the following struct-type-specific properties to the mix:
- structInfo
  The struct description metadata, as it was given to the StructBinder which created this class.
- structName
  The name of the struct, as it was given to the StructBinder which created this class.
API: Struct Instances
Instances of structs created via the constructors described above each have the following instance-specific state in common:
- pointer
  A read-only numeric property which is the "pointer" returned by the configured allocator when this object is constructed. After dispose() (inherited from StructType) is called, this property has the undefined value. When calling C-side code which takes a pointer to a struct of this type, simply pass it myStruct.pointer.
Appendices
Appendix A: Limitations, TODOs, and Non-TODOs
- This library only supports the basic set of member types supported by WASM: numbers (which includes pointers). Nested structs are not handled except that a member may be a pointer to such a struct. Whether or not it ever will depends entirely on whether its developer ever needs that support. Conversion of strings between JS and C requires infrastructure specific to each WASM environment and is not directly supported by this library.
- Binding functions to struct instances, such that C can see and call JS-defined functions, is not as transparent as it really could be, due to shortcomings in the Emscripten addFunction()/removeFunction() interfaces. Until a replacement for that API can be written, this support will be quite limited. It is possible to bind a JS-defined function to a C-side function pointer and call that function from C. What's missing is easier-to-use/more transparent support for doing so.
  - In the meantime, a standalone subproject of Jaccwabyt provides such a binding mechanism, but integrating it directly with Jaccwabyt would not only more than double its size but somehow feels inappropriate, so experimentation is in order for how to offer that capability via completely optional StructBinderFactory config options.
- It "might be interesting" to move access of the C-bound members into a sub-object. e.g., from JS they might be accessed via myStructInstance.s.structMember. The main advantage is that it would eliminate any potential confusion about which members are part of the C struct and which exist purely in JS. "The problem" with that is that it requires internally mapping the s member back to the object which contains it, which makes the whole thing more costly and adds one more moving part which can break. Even so, it's something to try out one rainy day. Maybe even make it optional and make the s name configurable via the StructBinderFactory options. (Over-engineering is an arguably bad habit of mine.)
- It "might be interesting" to offer (de)serialization support. It would be very limited, e.g. we can't serialize arbitrary pointers in any meaningful way, but "might" be useful for structs which contain only numeric or C-string state. As it is, it's easy enough for client code to write wrappers for that and handle the members in ways appropriate to their apps. Any impl provided in this library would have the shortcoming that it may inadvertently serialize pointers (since they're just integers), resulting in potential chaos after deserialization. Perhaps the struct description can be extended to tag specific members as serializable and how to serialize them.
Appendix D: Debug Info
The StructBinderFactory, StructBinder, and StructType classes all have the following "unsupported" method intended primarily to assist in their own development, as opposed to being for use in client code:
- debugFlags(flags) (integer)
  An "unsupported" debugging option which may change or be removed at any time. Its argument is a set of flags to enable/disable certain debug/tracing output for property accessors: 0x01 for getters, 0x02 for setters, 0x04 for allocations, 0x08 for deallocations. Pass 0 to disable all flags and pass a negative value to completely clear all flags. The latter has the side effect of telling the flags to be inherited from the next-higher-up class in the hierarchy, with StructBinderFactory being top-most, followed by StructBinder, then StructType.
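Since the flags are plain bits, they can be combined with bitwise OR. In the sketch below only the numeric values come from the description above; the constant names and the describeDebugFlags() helper are invented for this illustration and are not defined by Jaccwabyt.

```javascript
// Flag values as documented above; the constant names are made up here.
const DEBUG = Object.freeze({
  GETTER: 0x01, SETTER: 0x02, ALLOC: 0x04, DEALLOC: 0x08,
});

// Hypothetical helper which lists the categories enabled by a flags value.
function describeDebugFlags(flags) {
  if (flags < 0) return ["(cleared: inherited from the parent level)"];
  return Object.keys(DEBUG).filter((name) => (flags & DEBUG[name]) !== 0);
}

const traceWrites = DEBUG.SETTER | DEBUG.ALLOC; // 0x06
console.log(describeDebugFlags(traceWrites));
// With Jaccwabyt loaded, such a value would be passed along as, e.g.:
//   MyBinder.debugFlags(traceWrites);
```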
Appendix G: Generating Struct Descriptions From C
Struct definitions are ideally generated from WASM-compiled C, as
opposed to simply guessing the sizeofs and offsets, so that the sizeof
and offset information can be collected using C's `sizeof()` and
`offsetof()` features (noting that struct padding may impact offsets
in ways which might not be immediately obvious, so writing them by
hand is most certainly not recommended).
How exactly the description is generated is necessarily project-dependent. It's tempting to say, "oh, that's easy! We'll just write it by hand!" but that would be folly. The struct sizes and byte offsets into the struct must match precisely how the C-side code sees the struct, or the runtime results are completely undefined.
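To illustrate why hand-written offsets are risky, here is a small sketch which mimics a simple "natural alignment" rule, where each member is aligned to its own size. That rule, the helper name `naturalLayout()`, and the member sizes are assumptions for illustration only. Real layouts are up to the compiler and target, which is exactly why the numbers should come from `sizeof()` and `offsetof()`.

```javascript
// Computes offsets under a simplified "natural alignment" rule: each
// member is aligned to its own size and the struct is padded to a
// multiple of its largest member alignment. Illustration only: real
// values must come from the C compiler.
function naturalLayout(members) {
  let offset = 0, maxAlign = 1;
  const offsets = {};
  for (const [name, size] of members) {
    const align = size;
    offset = Math.ceil(offset / align) * align; // insert padding
    offsets[name] = offset;
    offset += size;
    if (align > maxAlign) maxAlign = align;
  }
  return { offsets, sizeof: Math.ceil(offset / maxAlign) * maxAlign };
}

// Assumed sizes: a 4-byte int, an 8-byte int64_t, another 4-byte int.
const padded = naturalLayout([["a", 4], ["b", 8], ["c", 4]]);
// The same members, reordered so that no interior padding is needed.
const packed = naturalLayout([["b", 8], ["a", 4], ["c", 4]]);
```

Under these assumptions the same 16 bytes of data occupy 24 bytes in the first ordering and 16 in the second, and the offset of every member changes, which is easy to get wrong when guessing by hand.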
The approach used in developing and testing this software is...
Below is a complete copy/pastable example of how we can use a small
set of macros to generate struct descriptions from C99 or later into
static string memory. Simply add such a file to your WASM build,
arrange for its function to be exported[^1], and call it
from JS (noting that it requires environment-specific JS glue to
convert the returned pointer to a JS-side string). Use JSON.parse()
to process it, then feed the included struct descriptions into the
binder factory at your leisure.
```c
#include <string.h> /* memset() */
#include <stddef.h> /* offsetof() */
#include <stdio.h>  /* snprintf() */
#include <stdint.h> /* int64_t */
#include <assert.h>

struct ExampleStruct {
  int v4;
  void * ppV;
  int64_t v8;
  void (*xFunc)(void*);
};
typedef struct ExampleStruct ExampleStruct;

const char * wasm__ctype_json(void){
  static char strBuf[512 * 8] = {0}
    /* Static buffer which must be sized large enough for
       our JSON. The string-generation macros try very
       hard to assert() if this buffer is too small. */;
  int n = 0, structCount = 0 /* counters for the macros */;
  char * pos = &strBuf[1]
    /* Write-position cursor. Skip the first byte for now to help
       protect against a small race condition */;
  char const * const zEnd = strBuf + sizeof(strBuf)
    /* one-past-the-end cursor (virtual EOF). Computed from strBuf,
       not pos, because pos starts one byte into the buffer. */;
  if(strBuf[0]) return strBuf; // Was set up in a previous call.

  ////////////////////////////////////////////////////////////////////
  // First we need to build up our macro framework...
  ////////////////////////////////////////////////////////////////////
  // Core output-generating macros...
#define lenCheck assert(pos < zEnd - 100)
#define outf(format,...) \
  pos += snprintf(pos, ((size_t)(zEnd - pos)), format, __VA_ARGS__); \
  lenCheck
#define out(TXT) outf("%s",TXT)
#define CloseBrace(LEVEL) \
  assert(LEVEL<5); memset(pos, '}', LEVEL); pos+=LEVEL; lenCheck

  ////////////////////////////////////////////////////////////////////
  // Macros for emitting StructBinders...
#define StructBinder__(TYPE)                 \
  n = 0;                                     \
  outf("%s{", (structCount++ ? ", " : ""));  \
  out("\"name\": \"" # TYPE "\",");          \
  outf("\"sizeof\": %d", (int)sizeof(TYPE)); \
  out(",\"members\": {");
#define StructBinder_(T) StructBinder__(T)
  // ^^^ extra indirection needed to expand CurrentStruct
#define StructBinder StructBinder_(CurrentStruct)
#define _StructBinder CloseBrace(2)
#define M(MEMBER,SIG)                                         \
  outf("%s\"%s\": "                                           \
       "{\"offset\":%d,\"sizeof\": %d,\"signature\":\"%s\"}", \
       (n++ ? ", " : ""), #MEMBER,                            \
       (int)offsetof(CurrentStruct,MEMBER),                   \
       (int)sizeof(((CurrentStruct*)0)->MEMBER),              \
       SIG)
  // End of macros.
  ////////////////////////////////////////////////////////////////////

  ////////////////////////////////////////////////////////////////////
  // With that out of the way, we can do what we came here to do.
  out("\"structs\": ["); {

    // For each struct description, do...
#define CurrentStruct ExampleStruct
    StructBinder {
      M(v4,"i");
      M(ppV,"p");
      M(v8,"j");
      M(xFunc,"v(p)");
    } _StructBinder;
#undef CurrentStruct

  } out( "]"/*structs*/);
  ////////////////////////////////////////////////////////////////////
  // Done! Finalize the output...
  out("}"/*top-level wrapper*/);
  *pos = 0;
  strBuf[0] = '{'/*end of the race-condition workaround*/;
  return strBuf;

  // If this file will ever be concatenated or #included with others,
  // it's good practice to clean up our macros:
#undef StructBinder
#undef StructBinder_
#undef StructBinder__
#undef M
#undef _StructBinder
#undef CloseBrace
#undef out
#undef outf
#undef lenCheck
}
```
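On the JS side, the steps described above (convert the returned pointer to a string, then parse it) might look roughly like the following sketch. The helper names `cstrToJs()` and `loadStructDescriptions()`, the `bindOne` callback, and the stand-in heap are all hypothetical. A real build would use a `Uint8Array` view of the WASM module's memory and the pointer returned by the exported C function, and the JSON values shown are merely illustrative.

```javascript
// Reads a NUL-terminated UTF-8 string out of a heap view. `heap` is a
// Uint8Array over the module's memory and `ptr` is the address returned
// by the exported C function.
function cstrToJs(heap, ptr) {
  let end = ptr;
  while (end < heap.length && heap[end] !== 0) ++end;
  return new TextDecoder("utf-8").decode(heap.subarray(ptr, end));
}

// Parses the generated JSON and hands each struct description to
// `bindOne`, a caller-supplied (hypothetical) callback which would
// typically forward it to a struct binder.
function loadStructDescriptions(heap, ptr, bindOne) {
  const parsed = JSON.parse(cstrToJs(heap, ptr));
  for (const desc of parsed.structs) bindOne(desc);
  return parsed.structs.length;
}

// Stand-in for real WASM memory, holding JSON shaped like the output of
// the C function above (values are illustrative, not measured).
const json = '{"structs": [{"name": "ExampleStruct","sizeof": 24,'
  + '"members": {"v4": {"offset":0,"sizeof": 4,"signature":"i"}}}]}';
const fakeHeap = new Uint8Array(256); // zero-filled, so the string ends in NUL
const fakePtr = 16;
fakeHeap.set(new TextEncoder().encode(json), fakePtr);

const seen = [];
const count = loadStructDescriptions(fakeHeap, fakePtr, (d) => seen.push(d.name));
```

Passing the binder in as a callback keeps the pointer-to-string glue, which is environment-specific, separate from whatever consumes the descriptions.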
[^1]: In Emscripten, add its name, prefixed with `_`, to the project's `EXPORTED_FUNCTIONS` list.

[BigInt64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt64Array
[TextDecoder]: https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder
[TextEncoder]: https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder
[MDN]: https://developer.mozilla.org/docs/Web/API