Concurrency
pntOS is designed to allow for concurrent operation of different plugins. Indeed, different plugins are free to use their own concurrency primitives internally. Because each plugin may be using its own set of threads, processes, coroutines, or other primitives internally, some rules must be established on how these plugins interact with each other.
Definitions
Fundamentally, pntOS plugins interact with resources handed to them by the
controller (see the pntOS Introduction for more information on the overall
pntOS architecture). For example, when a non-Controller plugin is first loaded into pntOS,
the init_plugin call provides a pointer to a
Mediator struct, which may be used by that plugin to request information from
the system. Similarly, the Orchestration plugin is handed a set of plugins in
the init_orchestration_plugin call, which
the Orchestration plugin is free to use in the future to access system
resources. A plugin using a function on the Mediator it was given and the
Orchestration plugin using a plugin from the list of plugins it was given are
both examples of plugins using system resources given to them by the system.
Conversely, when the Controller plugin starts up it is handed a set of raw
plugins collected by the loader. The Controller plugin wants to use these
plugins, but it first must ensure that it adheres to the rules and expectations
of these plugins. For example, before the Controller plugin may start using
functionality on a Transport plugin, it is required to call
init_plugin on that Transport plugin. When the
Controller plugin accesses a raw plugin or a resource returned by a raw
plugin, that constitutes the pntOS system accessing a plugin resource. The above
example of the Controller plugin calling init_plugin on a
Transport plugin is an example of the system accessing a plugin resource.
Thus, we can say that the Controller plugin is acting as the “pntOS System” and
managing the concurrency and data access between plugins. We will now consider
both of these use cases individually:
What responsibilities do non-
Controller plugins have in accessing system resourcesWhat responsibilities does the
Controller pluginplugin have in accessing plugin resources
Plugin Accessing System Resources
Plugins may access system resources at any time. For example, they are free to
call functions on the Mediator and on a PntosKeyValueStore instance. If they
make such accesses concurrently, they are subject to the following rules:
Concurrent access of each system resource from separate threads owned and created by the plugin is allowed.
Concurrent access of system resources from separate processes is forbidden. For example, if a plugin called
fork(), it cannot access theMediatorfrom both processes. Access of system resources is only allowed from the originating process that the system resources were given to the plugin in.Concurrent access of system resources may be processed in any order. The system reserves the right to:
Block on a system resource access request while waiting for another call to complete.
Execute system resource access requests in a different order than they were made by the plugin.
Utilize each thread that a system resource access request was made on to do other work. For example, if a registry key is set using
set_str, the system may in the middle of that call process itsrequest_notifyobservers, using that thread to invoke callbacks on other plugins.Utilize any thread-local storage on the thread that a system resource access request was made for any purpose.
One method of accessing system resources is to have a dedicated system resource thread which sequences and makes all system resource requests. Concurrent access from multiple threads is also possible as long as the above rules are observed. Note that the first two rules make any such concurrent accesses a race condition.
Concurrent access of system resources using other concurrency primitives (such as coroutines) is unspecified and must be coordinated with a participating controller. For example, a controller might share a coroutine access pool with other plugins, but only if all plugins opt-in. Plugins must support falling back to a non-participation mode, where other concurrency primitives are not utilized to make concurrent accesses of system resources.
Controller Accessing Plugin Resources
The system may access resources on a plugin or returned by a
plugin (or recursively returned by a resource previously returned by a plugin)
at any time. Such accesses are usually made by the Controller plugin accessing
plugin memory or functions, and are subject to the following rules:
Access to a plugin resource must be on a single thread, and concurrent accesses to plugin resources are forbidden. Thus, a controller must carefully control requests made to plugins by the
Mediatorand other inter-plugin communications.Access to plugin resources may be made on any thread available to the system.
Plugin resources may be accessed from within any process and are not necessarily used within the same process that loaded the plugin, however each plugin must only be called from one process. For example, the controller is free to
fork()once per plugin and use each plugin in a separate process, but it must not call the same plugin from within both the original process and thefork()ed process.Concurrent access of plugin resources using other concurrency primitives (such as coroutines) is unspecified and must be coordinated with a participating plugin. For example, a controller might share a coroutine access pool with other plugins, but only if all plugins opt-in. Plugins must support falling back to a non-participation mode, where other concurrency primitives are not utilized to make concurrent accesses of plugin resources.
Any of the above rules may be overridden by an API specification. For example,
the registry allows for PntosKeyValueStore accesses to be made
concurrently, and the PntosUiPlugin requires the main thread to be used
under certain conditions.
Simultaneous Controller and Plugin Accesses
In the last two sections, we discussed the rules for the Controller plugin
accessing plugin resources and the plugin accessing system resources. However,
what if these two things happen simultaneously? That is, a Controller plugin
accesses a plugin resource while at the same time the plugin accesses a system
resource?
In general, such behavior is allowed, meaning that:
Plugins must expect that one of their functions may be accessed even if they are currently requesting something from the system. The plugin must not block on the system accessing one of its functions until its current request to the system is complete.
The
Controller pluginmust expect the plugin to request something from the system even if the controller is currently waiting for a call it has made to the plugin to complete. TheController pluginmust not block on the plugin request until after theController plugin’s request to the plugin is complete.
Taken in whole, this allows for calls one direction to initialize a call the other
direction. For example, if the Controller plugin asks plugin A to perform a task,
plugin A may call back into the mediator while it is trying to perform that
task. An example use case is that plugin A may need to get a config value from
the registry to complete its task. The mediator must dispatch the request for a
registry config immediately, as waiting for plugin A’s task to complete before
processing the new config variable would result in a deadlock. In particular,
plugin A would be waiting for its config variable to complete the task and the
Controller plugin would be waiting for the task to complete before giving A
its config variable.
Callback Triggering and Call Loops
Because pntOS is a general framework without specific workflow requirements in
data routing between plugins, care must be taken to avoid responsibility loops.
For example, suppose that in the implementation of a Mediator, the system uses
a Registry plugin to implement the mediator->registry functionality.
Further suppose that in the implementation of a Registry plugin, the
set_str call simply called back into the
Mediator’s mediator->registry. We would now be in an infinite loop:
the Mediator uses the Registry plugin and the Registry plugin uses the
Mediator function, continuing to call each other until a stack overflow
occurs.
While this example is simple and avoidable, there are many possible races,
deadlocks, and starvation conditions that can arise in the implementation of the
mediator if plugins are allowed to call any mediator resources at any time.
Consider for example the following call chain that might occur when the
Controller plugin requests something from plugin A:
Controller -> Plugin A -> Mediator -> Plugin B -> Mediator -> Plugin A
Plugin A calling the mediator for Plugin B’s functionality is reasonable, and Plugin B calling the mediator for Plugin A’s functionality is reasonable, but in totality we would have a broken system, as by the concurrency rules in the previous sections the system may not call into Plugin A twice.
The solution to this issue is isolation of responsibilities. In general, the following rules must be followed:
Plugins must not call back into functions which they are responsible for implementing. For example, a
Registry pluginmay not access theMediator’smediator->registry. Similarly, theOrchestration pluginmust not call intorequest_solutions, and theTransport pluginmust not call intobroadcast_aspn_message.Callbacks must not use
Mediatorresources. For example, when the callback function torequest_notifyis invoked, the callback may not access theMediatorinside the callback.
Mutability of Received Resources
In general, resources delivered by a plugin to the system or by the system to a plugin may be shared by multiple plugins, including the original plugin that produced it retaining a copy. pntOS utilizes a reference counting mechanism to capture how many references to a resource are retained by any particular plugin.
By default, all plugins must assume that resources are being shared. Thus if a
PntosString is returned by a function call to another plugin, the receiver
plugin must assume that the data memory is being
utilized by multiple plugins and cannot be modified or mutated. However, for
memory which is managed by a PntosManagedMemory, the receiver plugin may
attempt to call the PntosManagedMemory::num_refs member of
PntosManagedMemory. If the function pointer is non-NULL and the returned value
is 1, then there exists no other reference to the managed memory and therefore
the receiver plugin is free to mutate the memory contained in the PntosString.
This leads us to the following set of rules all plugins must follow:
All memory passed to another plugin which the originating plugin does not retain a reference to must be mutable. If the originating plugin wants to use a C string literal in a string field, then it must either copy it first to mutable memory or retain a reference such that no other plugin will ever see a 1 return from
PntosManagedMemory::num_refs.Data in a
Pntos*struct must not be mutated unlessPntosManagedMemory::num_refsis 1.If memory is returned with nested structs which are managed by nested
PntosManagedMemorymanagers, mutation of the nested memory is never allowed. This is because:Each
PntosManagedMemoryretains its own reference count and therefore the outer and innerPntosManagedMemoryboth constitute a potential reference held by another pluginThere is no way to atomically test
PntosManagedMemory::num_refsfor both the outer and innerPntosManagedMemory, and sequential testing would constitute a TOCTOU vulnerability.
PIP and Controller
The PIP and Controller plugin work closely to handle system resources
concurrently in pntOS. Because of this close relationship, there is no way to
prescriptively lay out a set of rules the PIP and Controller plugin must
adhere to in order to avoid races, deadlocks, and other undesirable effects.
Instead, for concurrent implementations, the PIP must be designed to work
specifically with a chosen Controller plugin, and document the way that it
coordinates concurrency with the Controller plugin.