Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotate design document in preparation for update #9634

Closed
wants to merge 1 commit into from

Conversation

spencerkimball
Copy link
Member

@spencerkimball spencerkimball commented Sep 30, 2016

@mjibson @petermattis @mrtracy @tschottdorf @arjunravinarayan @andreimatei @jordanlewis @jseldess @sploiselle

All, you've been called to a higher purpose. Our design doc is in parlous condition. It's time to give it a miracle makeover. For those of you who feel the call of duty, we're going to organize a designdoc hackathon tomorrow at 1:30p to split design.md into areas of concern, parcel them out amongst ourselves, knock out the asbestos and water-stained drywall, and erect a modern, glass and steel structure in its place.

I've yet to delineate the missing sections (too late already tonight); we can spend some time on that tomorrow. But if we just get as far as dumping no-longer-applicable sections and cleaning up what remains, we'll be in significantly better shape.


This change is Reviewable

@petermattis
Copy link
Collaborator

I'm happy to take on some of the work, though 1:30pm on Friday's is rough for me due to various shuttling of kids to activities.

@spencerkimball
Copy link
Member Author

Why don't you comment in this PR for sections you'd like to take on.

In order to support diverse client usage, Cockroach clients connect to
any node via HTTPS using protocol buffers or JSON. The connected node
proxies involved client work including key lookups and write buffering.

# Keys

[Update referencing concepts in keys/constants.go required here]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can tackle this section. Will probably rename it to Internal Keys to delineate it from SQL.

@@ -633,6 +663,7 @@ A really good reference on tuning Linux installations with RocksDB is

# Range Metadata

[Peter: did you actually have numbers for the size of meta2 records?]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Easy enough to recompute, though it was below 256 bytes and that upper bound still seems reasonable. I can take this section as well.

@spencerkimball
Copy link
Member Author

+cc @knz




![Distributed SQL]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take this

such as schemas, tables, columns, and indexes. The structured data API
in turn depends on the [distributed key value store](#key-value-api),
which handles the details of range addressing to provide the abstraction
of a single, monolithic key value store. The distributed KV store
communicates with any number of physical cockroach nodes. Each node
contains one or more stores, one per physical device.

![SQL]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knz was volunteered

@tbg
Copy link
Member

tbg commented Sep 30, 2016

I'm happy to do whatever I'm being volunteered for.

@@ -64,15 +74,25 @@ grained data on the level of entity groups.

Cockroach implements a layered architecture. The highest level of
abstraction is the SQL layer (currently unspecified in this document).
[FIX]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I will

@@ -83,6 +103,7 @@ raft. The color coding shows associated range replicas.

![Ranges](media/ranges.png)

[Update RoachNode concept]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix references to RoachNode.

@@ -101,12 +122,14 @@ Up to `F` failures can be tolerated, where the total number of replicas `N = 2F

# Cockroach Client
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll delete this whole section.

@tbg
Copy link
Member

tbg commented Sep 30, 2016

I'll get to the Linearizability section.

@rjnn
Copy link
Contributor

rjnn commented Sep 30, 2016

I'll take Raft (+ Quiescing + Coalescing).

@@ -377,6 +401,7 @@ An exploration of retries with contention and abort times with abandoned
transaction is
[here](https://docs.google.com/document/d/1kBCu4sdGAnvLqpT-_2vaTbomNmX3_saayWEGYu1j7mQ/edit?usp=sharing).

[Update (conceptually) s/table/records/g]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird could you address this comment?

@@ -4,6 +4,9 @@ by Spencer Kimball from early 2014.

# Overview

[Spencer to edit overview. This stuff is so ripe it's starting to
stink: "structured data? WTF is that?]

Cockroach is a distributed key:value datastore (SQL and structured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rewriting the SQL reference here.

@@ -14,6 +17,10 @@ disruption and **no manual intervention**. Cockroach nodes are
symmetric; a design goal is **homogeneous deployment** (one binary) with
minimal configuration.

[Start at top of the new stack here: SQL, mention of distributed SQL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking this over

@@ -64,15 +74,25 @@ grained data on the level of entity groups.

Cockroach implements a layered architecture. The highest level of
abstraction is the SQL layer (currently unspecified in this document).
[FIX]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I will

@@ -1305,6 +1356,7 @@ write-optimized (HBase, Cassandra, SQLite3/LSM, CockroachDB).

## Architecture

[Keep this diagram; update if necessary]
CockroachDB implements a layered architecture, with various
subdirectories implementing layers as appropriate. The highest level of
abstraction is the [SQL layer][5], which depends
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewriting this to clean up the SQL references

@@ -1331,6 +1383,7 @@ replicas.

## Client Architecture

[Update this diagram]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking this over to explain SQL connections

@rjnn
Copy link
Contributor

rjnn commented Sep 30, 2016

Review status: 0 of 1 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


docs/design.md, line 812 at r1 (raw file):

# Raft - Consistency of Range Replicas

[Raft experts review this nonsense below]

Taking this.


docs/design.md, line 878 at r1 (raw file):

## Relationship to Raft leadership

[Is this all still accurate?]

Taking this.


Comments from Reviewable

@@ -609,6 +638,7 @@ for further details.

# Node Storage

[Updates necessary below]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll clean this section up a bit. Remove the range tree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #9652

@@ -660,6 +691,7 @@ is sparse, the successor key is defined as the next key which is present. The
found using the same process. The *meta2* record identifies the range
containing `key1`, which is again found the same way (see examples below).

[Updates required below]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird could you update here as well?

# Range-Spanning Binary Tree

[Remove this section I think.]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done #9643

@@ -903,6 +941,7 @@ expired, the command will be rejected by the replica.

# Splitting / Merging Ranges

[Accurate still? Needs review.]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still discuss merges if we don't do them yet?

@@ -805,6 +839,7 @@ Future optimizations may include two-phase elections and quiescent ranges

# Range Leases

[Cleanup, add section on epoch-based range leases and motivation & link to range lease RFC]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spencerkimball to do range leases

@@ -840,6 +875,7 @@ offset.

## Relationship to Raft leadership

[Is this all still accurate?]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arjunravinarayan to do this raft leadership section

@@ -1091,6 +1136,8 @@ The gossip protocol itself contains two primary components:

# Node Accounting

[Replace with description of MVCC, store-level aggregation, recorder,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrtracy would be great if you could replace Node Accounting section with an Accounting...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like @tschottdorf is handling this bit.

@@ -1225,6 +1275,7 @@ it discovers differences, it reconfigures ranges in the same way
that it rebalances away from busy nodes, via special-case 1:1
split to a duplicate range comprising the new configuration.

[Rest of this document is tossable]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spencerkimball to delete tail of document.

@petermattis
Copy link
Collaborator

If everyone is making edits to this doc, the merge might be painful. What is the plan?

BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request Sep 30, 2016
@spencerkimball
Copy link
Member Author

@petermattis just stay in your assigned section.

BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request Sep 30, 2016
This adds a little bit on repair and rebalancing as well.

Part of cockroachdb#9634.
@spencerkimball spencerkimball deleted the spencerkimball/design-doc branch October 3, 2016 16:34
BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request Oct 4, 2016
This adds a little bit on repair and rebalancing as well.

Part of cockroachdb#9634.
@@ -609,6 +638,7 @@ for further details.

# Node Storage

[Updates necessary below]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #9652

@@ -609,6 +638,7 @@ for further details.

# Node Storage

[Updates necessary below]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants