## Operational Transformation

Back when I was at a startup two years ago, we were going to develop an online video chatroom over WebRTC with a collaborative real-time editor for sharing code or thoughts during online interviews. Building the editor from scratch seemed quite challenging for us at the time, so we chose etherpad-lite, an …

## 「Designing Data-Intensive Applications」Chapter 6

Partitioning. Normally, partitions are defined in such a way that each piece of data belongs to exactly one partition. Partitioning and Replication. Partitioning is usually combined with replication so that copies of each partition are stored on multiple nodes. A node may store more than one partition. If a leader–follower replication model is used, the combination …

## [Algorithm 101] Topological Sorting

Being a directed acyclic graph (DAG) is not only necessary but also sufficient for topological sorting to be possible. DFS. Perform a DFS traversal and note the order in which vertices become dead ends (i.e., are popped off the traversal stack). Reversing this order yields a solution to the topological sorting problem, provided, of course, no back edge has been encountered …
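The DFS approach described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the post's own code: the function name `topological_sort`, the adjacency-dict input format, and the three-color marking are assumptions made for the example, and the recursive form is only suitable for small graphs.

```python
def topological_sort(graph):
    """Return the vertices of `graph` in one valid topological order.

    `graph` (hypothetical format) maps each vertex to an iterable of its
    successors. Raises ValueError if a back edge, i.e. a cycle, is found.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished

    # Collect every vertex, including ones that only appear as successors.
    color = {}
    for vertex, successors in graph.items():
        color.setdefault(vertex, WHITE)
        for successor in successors:
            color.setdefault(successor, WHITE)

    finished = []  # vertices in the order they become dead ends

    def visit(vertex):
        color[vertex] = GRAY
        for successor in graph.get(vertex, ()):
            if color[successor] == GRAY:
                # An edge into a vertex still on the stack is a back edge.
                raise ValueError("back edge found: graph is not a DAG")
            if color[successor] == WHITE:
                visit(successor)
        color[vertex] = BLACK
        finished.append(vertex)

    for vertex in list(color):
        if color[vertex] == WHITE:
            visit(vertex)

    # Reversing the finishing order gives a topological order.
    return finished[::-1]
```

For example, `topological_sort({"a": ["b"], "b": ["c"]})` places `a` before `b` and `b` before `c`, while a graph such as `{"a": ["b"], "b": ["a"]}` raises `ValueError`.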

## Matrix Multiplication: A Programmer’s Perspective

The problem of finding the nth Fibonacci number (A000045) has an optimal solution: matrix multiplication. Similarly, the matrix representation of the sequence A(n) = A(n-1) + A(n-2) + A(n-3) (A000073) is: By using matrix multiplication, we can reduce the time complexity from O(n) to O(log n). In linear algebra textbooks, matrix multiplication is the composition of two linear functions. Suppose: Represent …
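As a rough illustration of the O(log n) idea, the sketch below raises a recurrence matrix to the nth power by repeated squaring. The helper names (`mat_mul`, `mat_pow`, `fib`, `trib`) and the specific matrices are assumptions for this example (the standard companion matrices for these recurrences), not necessarily the representation the post uses.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def mat_pow(m, n):
    """Raise square matrix `m` to the power n >= 0 by repeated squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    while n > 0:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result


def fib(n):
    """nth Fibonacci number (A000045), with fib(0) = 0 and fib(1) = 1."""
    # [[1, 1], [1, 0]] ** n == [[F(n+1), F(n)], [F(n), F(n-1)]]
    return mat_pow([[1, 1], [1, 0]], n)[0][1]


def trib(n):
    """nth term of A000073, with trib(0) = trib(1) = 0 and trib(2) = 1."""
    # The matrix maps (A(k+2), A(k+1), A(k)) to (A(k+3), A(k+2), A(k+1)).
    t = [[1, 1, 1], [1, 0, 0], [0, 1, 0]]
    return mat_pow(t, n)[2][0]
```

Each call performs O(log n) matrix multiplications on a fixed-size matrix, which is where the logarithmic bound comes from (ignoring the cost of big-integer arithmetic).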

## 「Designing Data-Intensive Applications」Chapter 5

Replication. Leaders and Followers. Synchronous Versus Asynchronous Replication. An important detail of a replicated system is whether the replication happens synchronously or asynchronously. It is impractical for all followers to be synchronous: any one node outage would cause the whole system to grind to a halt. In practice, if you enable synchronous replication on …

## 「Designing Data-Intensive Applications」Chapter 4

Encoding and Evolution. In order for the system to continue running smoothly, we need to maintain compatibility in both directions. Backward compatibility: newer code can read data that was written by older code. Forward compatibility: older code can read data that was written by newer code. Formats for Encoding Data. Programs usually work with …
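The two directions of compatibility can be illustrated with a small sketch. Everything here is hypothetical: the JSON encoding, the `name` and `email` fields, and the `decode_user_v1` / `decode_user_v2` functions are made up for the example and are not taken from the book or the post.

```python
import json


def decode_user_v1(payload):
    """Older code: knows only about `name`.

    It stays forward compatible by ignoring fields it does not recognize.
    """
    record = json.loads(payload)
    return {"name": record["name"]}


def decode_user_v2(payload):
    """Newer code: also understands the optional `email` field.

    It stays backward compatible by supplying a default when the field
    is missing from data written by older code.
    """
    record = json.loads(payload)
    return {"name": record["name"], "email": record.get("email")}


old_data = json.dumps({"name": "Ada"})
new_data = json.dumps({"name": "Ada", "email": "ada@example.com"})
```

Here `decode_user_v2(old_data)` succeeds with `email` set to `None` (backward compatibility), and `decode_user_v1(new_data)` succeeds by dropping the unknown field (forward compatibility).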

## Online Judge from Scratch(3) – Sandbox

This article consists of two parts: the sandbox for GCC and the sandbox for the compiled binary. The sandbox for GCC. Essentially, our sandbox for GCC is a wrapper around GCC with a watchdog, just like the sandbox we designed for the Java compiler. However, there are more situations that need to be considered carefully for GCC. In …

## Online Judge from Scratch(2) – Dispatcher

The dispatcher, as the name implies, fetches judge tasks from RabbitMQ, dispatches them to the sandbox workers, and gets the results back synchronously. In Justice, the sandboxes are language-specific: if the submission is written in Java, we can sandbox it with the Java Security Manager. If the submission is written in C/CPP, we need another sandbox …

## 「Designing Data-Intensive Applications」Chapter 3

Storage and Retrieval. Data Structures That Power Your Database. An index is an additional structure that is derived from the primary data. An important trade-off in storage systems: well-chosen indexes speed up read queries, but every index slows down writes. Hash Indexes. Keep an in-memory hash map where every key is mapped to a byte …

## 「Designing Data-Intensive Applications」Chapter 2

Data Models and Query Languages. "The limits of my language mean the limits of my world." (Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1922) Data models are perhaps the most important part of developing software, because they have such a profound effect: not only on how the software is …
