At Electric Era, we love diving deep into systems architecture, especially when it means making things more efficient and testable. This is at the core of how we enable industry-leading reliability for our drivers and customers.
Recently, we've been tackling a challenge that's close to the hearts of developers who work on complex distributed systems: how to scale our finite state machines (FSMs) beyond a single binary and into distributed processes (sometimes even on other computers) while keeping everything as clean and testable as possible. Here's how we're doing it.
Right now, we run FSMs within a single binary using shared memory structures for input and output. This works great because:
This setup has been great for components strictly following the FSM model—they're a breeze to test. But as our system grows more complex and distributed, the less-structured modules are starting to feel like outliers. We want every module to benefit from strict, testable interfaces.
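To make the "breeze to test" point concrete, here is a minimal sketch of an FSM written as a pure step function over input and output structures. The states, the `Inputs` and `Outputs` fields, and the transition rules are all hypothetical and chosen for illustration; they are not our production state machine.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CHARGING = auto()
    FAULT = auto()


@dataclass(frozen=True)
class Inputs:
    # Hypothetical input snapshot the FSM reads each control cycle.
    vehicle_connected: bool
    fault_detected: bool


@dataclass(frozen=True)
class Outputs:
    # Hypothetical output snapshot the FSM writes each control cycle.
    contactor_closed: bool


def step(state: State, inputs: Inputs) -> tuple[State, Outputs]:
    """Advance the FSM by one cycle. No hidden state and no I/O."""
    if inputs.fault_detected:
        next_state = State.FAULT
    elif state is State.FAULT:
        # Faults latch in this sketch; a reset path is not modelled.
        next_state = State.FAULT
    elif inputs.vehicle_connected:
        next_state = State.CHARGING
    else:
        next_state = State.IDLE
    return next_state, Outputs(contactor_closed=next_state is State.CHARGING)
```

Because `step` depends only on its arguments, a test can feed it any sequence of inputs and assert on the returned state and outputs without standing up the rest of the system.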
Enter the Event Bus. This new architecture is all about managing state flow cleanly within an application and extending that flow across distributed systems. Here’s the gist:
An Event Bus handles state snapshots (aka “Events”) within an application, providing a clean way to publish and subscribe to state changes.
This ensures components always see the latest state every control cycle (e.g., every 20ms for our control loop).
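As a rough illustration of the idea, the sketch below keeps only the most recent snapshot per topic and notifies subscribers on publish. The `EventBus` class, its method names, and the topic string are assumptions made for this example, not our actual API.

```python
from collections import defaultdict
from typing import Any, Callable, Optional


class EventBus:
    """In-process bus that retains the latest snapshot for each topic."""

    def __init__(self) -> None:
        self._latest: dict[str, Any] = {}
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        # Register a callback that runs on every publish to this topic.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, snapshot: Any) -> None:
        # Overwrite the stored snapshot, then notify subscribers in order.
        self._latest[topic] = snapshot
        for callback in self._subscribers[topic]:
            callback(snapshot)

    def latest(self, topic: str) -> Optional[Any]:
        # Return the newest snapshot, or None if nothing was published yet.
        return self._latest.get(topic)
```

A component running in the control loop would call `latest("vehicle_requested_values")` once per cycle, so it always works from the newest state rather than a backlog of queued messages.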
To scale this model across nodes, Event Buses can now talk to each other:
Each Event includes metadata:
Node_ID: Who published it.
TimeReceived: When the event was received by the bus.
This metadata helps us detect stale state (and, indirectly, communication loss).
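A staleness check on that metadata could look like the sketch below. The `EventMetadata` class, the `is_stale` helper, and the three-cycle threshold are assumptions for illustration; only the 20 ms cycle time comes from this post.

```python
from dataclasses import dataclass

CONTROL_CYCLE_NS = 20_000_000  # 20 ms control cycle, in nanoseconds


@dataclass(frozen=True)
class EventMetadata:
    node_id: str     # who published the Event
    time_rx_ns: int  # when the bus received it, in nanoseconds


def is_stale(meta: EventMetadata, now_ns: int,
             max_age_ns: int = 3 * CONTROL_CYCLE_NS) -> bool:
    """Return True when the Event is older than the allowed age.

    The default of three control cycles is an assumed threshold.
    """
    return now_ns - meta.time_rx_ns > max_age_ns
```

If every Event from a given node goes stale at once, that is a strong hint that the link to that node is down, which is how staleness indirectly signals communication loss.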
Below is what the whole system looks like in graph form, as a simple illustration.
To efficiently handle Events both in memory and over the network, we’re exploring serialization options. The two contenders:
Example Schema:
namespace Disco;

// Declared as a table: FlatBuffers structs cannot contain strings.
table EBMetaData {
  __eb_node_id:string;
  __eb_time_rx_ns:uint64;
}

table VehicleRequestedValues {
  metadata:EBMetaData;
  current:float;
  voltage:float;
}
Performance testing will ultimately decide which one wins, but FlatBuffers has a slight edge: its zero-copy access lets us read fields directly from the received buffer without a separate parsing step, which suits a resource-constrained environment.
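To show what reading a field in place means, here is a small sketch using Python's standard `struct` module. The fixed byte layout is hypothetical and is not the FlatBuffers wire format; it only illustrates pulling one field out of a buffer without decoding the whole record.

```python
import struct

# Hypothetical fixed layout (NOT the FlatBuffers wire format):
#   offset 0:  time_rx_ns (uint64, little-endian)
#   offset 8:  current    (float32)
#   offset 12: voltage    (float32)
RECORD = struct.Struct("<Qff")
VOLTAGE_OFFSET = 12


def encode(time_rx_ns: int, current: float, voltage: float) -> bytes:
    """Pack one record into 16 bytes."""
    return RECORD.pack(time_rx_ns, current, voltage)


def read_voltage(buf) -> float:
    """Read only the voltage field, straight from the buffer."""
    return struct.unpack_from("<f", buf, VOLTAGE_OFFSET)[0]
```

Passing a `memoryview` of the received bytes to `read_voltage` avoids copying the buffer. FlatBuffers generalizes this idea with offset tables, so fields can be optional and schemas can evolve.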
This new approach unlocks several powerful benefits:
We’re thrilled to see how this will streamline our systems and are eager to share our findings (and maybe even some open-source tools) with the developer community.
We’re still refining aspects like how to allocate and route Events across nodes and finalizing our serialization format. But we’re confident that this architecture will set a new standard for scalable, distributed FSMs.
If you’re also tackling distributed systems or have thoughts on serialization formats, we’d love to hear from you. Let’s build better systems together! 🚀
Stay tuned for updates, and happy coding!