rDSN

Robust Distributed System Nucleus (rDSN) is an open framework for quickly building and managing high performance and robust distributed systems.



Robust Distributed System Nucleus (rDSN) is a framework for quickly building robust distributed systems. It has a microkernel for pluggable components, including applications, distributed frameworks, devops tools, and local runtime/resource providers, which enables their independent development and seamless integration. The project was originally developed for Microsoft Bing and has since been adopted in production both inside and outside Microsoft.

Top Links

  • [Case] RocksDB made replicated using rDSN!
  • [Tutorial] A one-box cluster demo to understand how rDSN helps with service registration, deployment, monitoring, etc.
  • [Tutorial] Build a counter service with built-in tools (e.g., codegen, auto-test, fault injection, bug replay, tracing)
  • [Tutorial] Build a scalable and reliable counter service with built-in replication support
  • API Reference
  • Installation

Existing pluggable modules (and growing)

The core of rDSN is a service kernel with which we can develop (via the Service API and Tool API) and plug in many different application, framework, tool, and local runtime modules, so that they can seamlessly benefit each other. Here is an incomplete list of the pluggable modules.

Pluggable modules | Description | Release
dsn.core | rDSN service kernel | todo
dsn.dist.service.stateless | scale-out and fail-over for stateless services (e.g., micro services) | todo
dsn.dist.service.stateful.type1 | scale-out, replicate, and fail-over for stateful services (e.g., storage) | todo
dsn.dist.service.meta_server | membership, load balance, and machine pool management for the above service frameworks | todo
dsn.dist.uri.resolver | a client-side helper module that resolves service URL to target machine | todo
dsn.dist.traffic.router | fine-grain RPC request routing/splitting/forking to multiple services (e.g., A/B test) | todo
dsn.tools.common | deployment runtime (e.g., network, aio, lock, timer, perf counters, loggers) for both Windows and Linux; simple toollets, such as tracer, profiler, and fault-injector | todo
dsn.tools.nfs | an implementation of remote file copy based on rpc and aio | todo
dsn.tools.emulator | an emulation runtime for whole distributed system emulation with auto-test, replay, global state checking, etc. | todo
dsn.tools.hpc | high performance counterparts for the modules as implemented in tools.common | todo
dsn.tools.explorer | extracts task-level dependencies automatically | todo
dsn.tools.log.monitor | collect critical logs (e.g., log-level >= WARNING) in cluster | todo
dsn.app.simple_kv | an example application module | todo

Scenarios by different module combination and configuration

rDSN provides flexible configuration so that developers can combine and configure the modules differently to enable different scenarios. All modules are loaded by dsn.svchost, a common process runner in rDSN, with the given configuration file. The following table lists some examples (dsn.core is always required and is therefore omitted from the Modules column).

Scenarios | Modules | Config | Demo
logic correctness development | dsn.app.simple_kv + dsn.tools.emulator + dsn.tools.common | config | todo
logic correctness with failure | dsn.app.simple_kv + dsn.tools.emulator + dsn.tools.common | config | todo
performance tuning | dsn.app.simple_kv + dsn.tools.common | config | todo
progressive performance tuning | dsn.app.simple_kv + dsn.tools.common + dsn.tools.emulator | config | todo
Paxos enabled stateful service | dsn.app.simple_kv + dsn.tools.common + dsn.tools.emulator + dsn.dist.uri.resolver + dsn.dist.service.meta_server + dsn.dist.service.stateful.type1 | config | todo

There are many more possibilities. rDSN provides a web portal that enables quick deployment of these scenarios in a cluster and allows easy operation through simple clicks as well as rich visualization. Deployment scenarios are defined here, and developers can add more on demand.
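To make the idea of "scenarios as module combinations" concrete, here is a minimal Python sketch that models a few rows of the table as plain data. It is only an illustration: the names `SCENARIOS`, `modules_for`, and `added_modules` are hypothetical, and this is not the configuration file format that dsn.svchost reads.

```python
# Hypothetical illustration only: rDSN takes its settings from a configuration
# file given to dsn.svchost; this dictionary is NOT that file format.
CORE = "dsn.core"  # always required, so the scenario table omits it

# Module lists copied from the scenario table above.
SCENARIOS = {
    "performance tuning": [
        "dsn.app.simple_kv",
        "dsn.tools.common",
    ],
    "progressive performance tuning": [
        "dsn.app.simple_kv",
        "dsn.tools.common",
        "dsn.tools.emulator",
    ],
}


def modules_for(scenario):
    """Return the full module list for a scenario, with dsn.core first."""
    return [CORE] + [m for m in SCENARIOS[scenario] if m != CORE]


def added_modules(base, target):
    """Return the modules that `target` needs beyond `base`, in table order."""
    have = set(modules_for(base))
    return [m for m in modules_for(target) if m not in have]


if __name__ == "__main__":
    print(modules_for("performance tuning"))
    print(added_modules("performance tuning", "progressive performance tuning"))
```

Seen this way, moving from one scenario to another is a matter of adding or removing entries in a module list, while the application module itself stays the same.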

How does rDSN build robustness?

  • reduced system complexity via microkernel architecture: applications, frameworks (e.g., replication, scale-out, fail-over), local runtime libraries (e.g., network libraries, locks), and tools are all pluggable modules into a microkernel to enable independent development and seamless integration (therefore modules are reusable and transparently benefit each other)

    rDSN Architecture

  • flexible configuration with global deploy-time view: tailor the module instances and their connections on demand with configurable system complexity and resource allocation (e.g., run all nodes in one simulator for testing, allocate CPU resources appropriately to avoid resource contention, debug with progressively added system complexity)

    rDSN Configuration

  • transparent tooling support: dedicated tool API for tool development; built-in plugged tools for understanding, testing, debugging, and monitoring the upper applications and frameworks

    rDSN Architecture

  • auto-handled distributed system challenges: built-in frameworks to achieve scalability, reliability, availability, and consistency etc. for the applications

    rDSN service model
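The microkernel and transparent-tooling points above can be illustrated with a toy Python sketch. Everything in it (`Kernel`, `register_provider`, `install_tool`, the two network functions) is a hypothetical stand-in chosen for illustration and is not rDSN's Service API or Tool API. The sketch assumes only what the list states: applications, runtime providers, and tools plug into a kernel independently of each other.

```python
from typing import Callable, Dict, List


class Kernel:
    """Toy microkernel: holds swappable providers and observing tools."""

    def __init__(self) -> None:
        self._providers: Dict[str, Callable[[str], str]] = {}
        self._tools: List[Callable[[str], None]] = []

    def register_provider(self, kind: str, provider: Callable[[str], str]) -> None:
        # A later registration for the same kind replaces the earlier one.
        self._providers[kind] = provider

    def install_tool(self, hook: Callable[[str], None]) -> None:
        self._tools.append(hook)

    def call(self, kind: str, payload: str) -> str:
        # Every installed tool sees the call before the provider handles it,
        # so the application needs no tool-specific code.
        for hook in self._tools:
            hook(f"{kind}:{payload}")
        return self._providers[kind](payload)


def native_network(msg: str) -> str:
    """Stand-in for a real network runtime."""
    return f"sent over socket: {msg}"


def emulated_network(msg: str) -> str:
    """Stand-in for an emulated, single-process runtime."""
    return f"delivered in-process: {msg}"


def counter_app(kernel: Kernel, n: int) -> str:
    """Application code: it only talks to the kernel, never to a provider."""
    return kernel.call("network", f"add {n}")


trace: List[str] = []
kernel = Kernel()
kernel.register_provider("network", emulated_network)
kernel.install_tool(trace.append)          # a minimal "tracer" tool
emulated_result = counter_app(kernel, 1)

kernel.register_provider("network", native_network)  # swap the runtime
native_result = counter_app(kernel, 2)
```

The same `counter_app` runs unchanged against either provider, and the tracer records both calls without the application knowing it exists. That is the property the list describes as modules transparently benefiting each other.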

Research papers

rDSN borrows ideas from many research works, both our own and those of others, and tries to make them real in production in a coherent way; we greatly appreciate the researchers who did this work.

License and Support

rDSN is provided on Windows and Linux under the MIT open source license. You can use the "issues" tab on GitHub to report bugs.

Top Contributors

imzhenyu qinzuoyan linmajia shengofsun mcfatealan ykwd glglwty goksyli xiaotz zjc95 lishenglong capfei
