「Designing Data-Intensive Applications」Chapter 6

Partitioning: Normally, partitions are defined in such a way that each piece of data belongs to exactly one partition. Partitioning and Replication: Partitioning is usually combined with replication so that copies of each partition are stored on multiple nodes. A node may store more than one partition. If a leader–follower replication model is used, the combination …


[Algorithm 101] Topological Sorting

A topological ordering exists if and only if the directed graph is acyclic: being a DAG is both a necessary and a sufficient condition. DFS: Perform a DFS traversal and note the order in which vertices become dead-ends (i.e., are popped off the traversal stack). Reversing this order yields a solution to the topological sorting problem, provided, of course, no back edge has been encountered …
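The DFS approach described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the post's own code: the function name `topological_sort`, the adjacency-dict input format, and the three-color marking used to detect back edges are all assumptions made for the example.

```python
def topological_sort(graph):
    """Return the vertices of a directed graph in one valid topological order.

    graph: dict mapping each vertex to an iterable of its successors
           (hypothetical input format chosen for this sketch).
    Raises ValueError if a back edge is found, i.e. the graph has a cycle.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished

    # Collect every vertex, including ones that only appear as successors.
    vertices = set(graph)
    for successors in graph.values():
        vertices.update(successors)

    color = {v: WHITE for v in vertices}
    finished = []  # vertices in the order they become dead-ends

    def visit(u):
        color[u] = GRAY
        for v in graph.get(u, ()):
            if color[v] == GRAY:
                # v is still on the stack, so u -> v is a back edge.
                raise ValueError("back edge found: graph is not a DAG")
            if color[v] == WHITE:
                visit(v)
        color[u] = BLACK
        finished.append(u)

    for u in sorted(vertices):  # sorted only to make the traversal deterministic
        if color[u] == WHITE:
            visit(u)

    # Reversing the finishing order gives a topological order.
    return finished[::-1]
```

The recursion keeps the sketch short; for very deep graphs an explicit stack would avoid Python's recursion limit.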


Matrix Multiplication: A Programmer’s Perspective

The problem of finding the nth Fibonacci number (A000045) has a fast solution based on matrix multiplication. Similarly, the matrix representation of the sequence A(n) = A(n-1) + A(n-2) + A(n-3) (A000073) is: By using matrix multiplication, we can reduce the time complexity from O(n) to O(log n). In linear algebra textbooks, matrix multiplication is the composition of two linear functions. Suppose: Represent …
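As a rough sketch of the idea in this excerpt, the code below raises a small matrix to the nth power by repeated squaring, which needs O(log n) matrix multiplications. The helper names `mat_mul`, `mat_pow`, `fib`, and `trib` are made up for the example, and the 3x3 companion matrix for A000073 is the standard textbook one, assumed here because the excerpt does not show the post's own matrix.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mat_pow(m, e):
    """Raise a square matrix to a non-negative integer power by squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    while e > 0:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def fib(n):
    """nth Fibonacci number with F(0) = 0, F(1) = 1 (A000045)."""
    # [[1, 1], [1, 0]]^n == [[F(n+1), F(n)], [F(n), F(n-1)]]
    return mat_pow([[1, 1], [1, 0]], n)[0][1]


def trib(n):
    """nth term of A(n) = A(n-1) + A(n-2) + A(n-3), A(0) = A(1) = 0, A(2) = 1."""
    # Assumed companion matrix; T^n applied to (A(2), A(1), A(0)) = (1, 0, 0).
    t = [[1, 1, 1], [1, 0, 0], [0, 1, 0]]
    return mat_pow(t, n)[2][0]
```

The O(log n) figure counts matrix multiplications; with Python's arbitrary-precision integers, each multiplication gets slower as the numbers grow.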


「Designing Data-Intensive Applications」Chapter 5

Replication / Leaders and Followers / Synchronous Versus Asynchronous Replication: An important detail of a replicated system is whether the replication happens synchronously or asynchronously. It is impractical for all followers to be synchronous: any one node outage would cause the whole system to grind to a halt. In practice, if you enable synchronous replication on …


「Designing Data-Intensive Applications」Chapter 4

Encoding and Evolution: In order for the system to continue running smoothly, we need to maintain compatibility in both directions. Backward compatibility: newer code can read data that was written by older code. Forward compatibility: older code can read data that was written by newer code. Formats for Encoding Data: Programs usually work with …
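The two directions of compatibility can be illustrated with a small sketch. It assumes a JSON-encoded record and a hypothetical schema change in which version 2 adds an optional `email` field; none of these names come from the post or the book.

```python
import json


def encode_v1(name):
    """Old writer: only knows about the 'name' field."""
    return json.dumps({"name": name})


def encode_v2(name, email):
    """New writer: adds an 'email' field (hypothetical schema change)."""
    return json.dumps({"name": name, "email": email})


def decode_v1(payload):
    """Old reader: keeps the fields it knows and ignores unknown ones,
    so it can still read data written by newer code (forward compatibility)."""
    data = json.loads(payload)
    return {"name": data["name"]}


def decode_v2(payload):
    """New reader: falls back to a default for fields old writers never set,
    so it can still read data written by older code (backward compatibility)."""
    data = json.loads(payload)
    return {"name": data["name"], "email": data.get("email")}
```

In this sketch both directions hold only because the new field is optional: the old reader skips it and the new reader supplies `None` when it is missing.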


Online Judge from Scratch(2) – Dispatcher

The dispatcher, as the name implies, fetches judge tasks from RabbitMQ, dispatches them to the sandbox workers, and gets the results back synchronously. In Justice, the sandboxes are language-specific: if the submission is written in Java, we can sandbox it with the Java Security Manager; if the submission is written in C/C++, we need another sandbox …


「Designing Data-Intensive Applications」Chapter 3

Storage and Retrieval / Data Structures That Power Your Database: An index is an additional structure that is derived from the primary data. An important trade-off in storage systems: well-chosen indexes speed up read queries, but every index slows down writes. Hash Indexes: Keep an in-memory hash map where every key is mapped to a byte …
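A hash index over an append-only log might look roughly like the sketch below. It assumes the truncated sentence refers to a byte offset into the data file; the class name `HashIndexedLog`, the `key,value` line format, and the use of `io.BytesIO` as a stand-in for a file on disk are all choices made for this example.

```python
import io


class HashIndexedLog:
    """Append-only key-value log with an in-memory hash index (illustrative only).

    Keys must not contain commas or newlines; values must not contain newlines.
    """

    def __init__(self):
        self._log = io.BytesIO()  # stands in for an append-only file on disk
        self._index = {}          # key -> byte offset of the latest record

    def set(self, key, value):
        record = f"{key},{value}\n".encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)  # always append at the end
        self._log.write(record)
        self._index[key] = offset  # the extra work every write pays for the index

    def get(self, key):
        offset = self._index.get(key)
        if offset is None:
            return None
        self._log.seek(offset)  # one seek instead of scanning the whole log
        line = self._log.readline().decode("utf-8").rstrip("\n")
        _, value = line.split(",", 1)
        return value
```

Overwriting a key appends a new record and repoints the index; the stale record stays in the log, which this sketch never compacts.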


「Designing Data-Intensive Applications」Chapter 2

Data Models and Query Languages: "The limits of my language mean the limits of my world." (Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1922) Data models are perhaps the most important part of developing software, because they have such a profound effect: not only on how the software is …
