expand file glob within prettier (#803)
The '**' pattern is not supported by some shells, including the one we
use in CI.
QP Hou authored Aug 1, 2021
1 parent a4941ee commit 3eac2e6
Showing 4 changed files with 28 additions and 28 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/dev.yml
@@ -64,7 +64,7 @@ jobs:
# if you encounter error, try rerun the command below with --write instead of --check
# and commit the changes
npx [email protected] --check \
- {ballista,datafusion,datafusion-examples,docs,python}/**/*.md \
+ '{ballista,datafusion,datafusion-examples,docs,python}/**/*.md' \
README.md \
DEVELOPERS.md \
- ballista/**/*.{ts,tsx}
+ 'ballista/**/*.{ts,tsx}'
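
The effect of this change can be illustrated with a small sketch. It uses Python's `glob` module purely as a stand-in: `recursive=False` mimics a shell that treats `**` like a single `*`, while `recursive=True` mimics a tool that expands the quoted pattern itself with recursive semantics. The directory layout and the `expand` and `make_tree` helpers are hypothetical and exist only for this illustration.

```python
import glob
import os
import tempfile


def expand(pattern, root, recursive):
    """Return the sorted, root-relative paths that match pattern under root."""
    matches = glob.glob(os.path.join(root, pattern), recursive=recursive)
    return sorted(os.path.relpath(m, root).replace(os.sep, "/") for m in matches)


def make_tree(root):
    """Create a tiny, made-up docs tree with Markdown files at three depths."""
    for rel in ("docs/top.md", "docs/a/one.md", "docs/a/b/two.md"):
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    # Without recursive support, "**" behaves like "*": exactly one directory level.
    shallow = expand("docs/**/*.md", root, recursive=False)
    # With recursive support, "**" matches zero or more directory levels.
    deep = expand("docs/**/*.md", root, recursive=True)

print(shallow)  # only docs/a/one.md
print(deep)     # all three files
```

Quoting the pattern in the workflow keeps the shell from expanding it at all, so the tool receives the pattern text and applies its own glob rules.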
20 changes: 10 additions & 10 deletions ballista/README.md
@@ -19,8 +19,8 @@

# Ballista: Distributed Compute with Apache Arrow and DataFusion

- Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache Arrow and
- DataFusion. It is built on an architecture that allows other programming languages (such as Python, C++, and
+ Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache Arrow and
+ DataFusion. It is built on an architecture that allows other programming languages (such as Python, C++, and
Java) to be supported as first-class citizens without paying a penalty for serialization costs.

The foundational technologies in Ballista are:
@@ -37,23 +37,23 @@ redundancy in the case of a scheduler failing.

# Getting Started

- Fully working examples are available. Refer to the [Ballista Examples README](../ballista-examples/README.md) for
+ Fully working examples are available. Refer to the [Ballista Examples README](../ballista-examples/README.md) for
more information.

## Distributed Scheduler Overview

- Ballista uses the DataFusion query execution framework to create a physical plan and then transforms it into a
+ Ballista uses the DataFusion query execution framework to create a physical plan and then transforms it into a
distributed physical plan by breaking the query down into stages whenever the partitioning scheme changes.

- Specifically, any `RepartitionExec` operator is replaced with an `UnresolvedShuffleExec` and the child operator
+ Specifically, any `RepartitionExec` operator is replaced with an `UnresolvedShuffleExec` and the child operator
of the repartition operator is wrapped in a `ShuffleWriterExec` operator and scheduled for execution.

- Each executor polls the scheduler for the next task to run. Tasks are currently always `ShuffleWriterExec` operators
- and each task represents one *input* partition that will be executed. The resulting batches are repartitioned
- according to the shuffle partitioning scheme and each *output* partition is streamed to disk in Arrow IPC format.
+ Each executor polls the scheduler for the next task to run. Tasks are currently always `ShuffleWriterExec` operators
+ and each task represents one _input_ partition that will be executed. The resulting batches are repartitioned
+ according to the shuffle partitioning scheme and each _output_ partition is streamed to disk in Arrow IPC format.

- The scheduler will replace `UnresolvedShuffleExec` operators with `ShuffleReaderExec` operators once all shuffle
- tasks have completed. The `ShuffleReaderExec` operator connects to other executors as required using the Flight
+ The scheduler will replace `UnresolvedShuffleExec` operators with `ShuffleReaderExec` operators once all shuffle
+ tasks have completed. The `ShuffleReaderExec` operator connects to other executors as required using the Flight
interface, and streams the shuffle IPC files.
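
The stage-splitting step described in this overview can be sketched roughly as follows. This is a simplified illustration in Python, not Ballista's actual Rust implementation: the `Plan` class, the `split_into_stages` function, and the `FinalAggregate`, `PartialAggregate`, and `Scan` operator names are assumptions made up for the example. Only `RepartitionExec`, `UnresolvedShuffleExec`, and `ShuffleWriterExec` come from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Plan:
    """A toy physical plan node (hypothetical, for illustration only)."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    stage_id: int = -1  # only meaningful for the shuffle operators


def split_into_stages(plan: Plan) -> Tuple[Plan, List[Plan]]:
    """Cut the plan at every RepartitionExec and collect the resulting stages."""
    stages: List[Plan] = []

    def rewrite(node: Plan) -> Plan:
        # Rewrite bottom-up so that inner repartitions become earlier stages.
        children = [rewrite(child) for child in node.children]
        if node.name == "RepartitionExec":
            stage_id = len(stages) + 1
            # The child of the repartition is wrapped in a shuffle writer ...
            stages.append(Plan("ShuffleWriterExec", children, stage_id))
            # ... and the repartition itself becomes an unresolved placeholder.
            return Plan("UnresolvedShuffleExec", [], stage_id)
        return Plan(node.name, children, node.stage_id)

    return rewrite(plan), stages


# A made-up two-phase aggregation with one repartition in the middle.
plan = Plan("FinalAggregate", [
    Plan("RepartitionExec", [
        Plan("PartialAggregate", [Plan("Scan")]),
    ]),
])
root, stages = split_into_stages(plan)
print(root.children[0].name, root.children[0].stage_id)
print([stage.name for stage in stages])
```

In this sketch the placeholder keeps only a stage id; resolving it into a reader once the shuffle tasks finish, as the text describes, is left out.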

# How does this compare to Apache Spark?
2 changes: 1 addition & 1 deletion docs/user-guide/src/distributed/docker-compose.md
@@ -24,7 +24,7 @@ demonstrates how to start a cluster using a single process that acts as both a s
volume mounted into the container so that Ballista can access the host file system.

```yaml
- version: '2.2'
+ version: "2.2"
services:
etcd:
image: quay.io/coreos/etcd:v3.4.9
30 changes: 15 additions & 15 deletions docs/user-guide/src/distributed/kubernetes.md
@@ -129,16 +129,16 @@ spec:
ballista-cluster: ballista
spec:
containers:
- - name: ballista-scheduler
- image: <your-image>
- command: ["/scheduler"]
- args: ["--bind-port=50050"]
- ports:
- - containerPort: 50050
- name: flight
- volumeMounts:
- - mountPath: /mnt
- name: data
+ - name: ballista-scheduler
+ image: <your-image>
+ command: ["/scheduler"]
+ args: ["--bind-port=50050"]
+ ports:
+ - containerPort: 50050
+ name: flight
+ volumeMounts:
+ - mountPath: /mnt
+ name: data
volumes:
- name: data
persistentVolumeClaim:
@@ -245,10 +245,10 @@ spec:
minReplicaCount: 0
maxReplicaCount: 5
triggers:
- - type: external
- metadata:
- # Change this DNS if the scheduler isn't deployed in the "default" namespace
- scalerAddress: ballista-scheduler.default.svc.cluster.local:50050
+ - type: external
+ metadata:
+ # Change this DNS if the scheduler isn't deployed in the "default" namespace
+ scalerAddress: ballista-scheduler.default.svc.cluster.local:50050
```
And then deploy it into the cluster:
@@ -261,4 +261,4 @@ If the cluster is inactive, Keda will now scale the number of executors down to
you launch a query. Please note that Keda will perform a scan once every 30 seconds, so it might take a bit to
scale the executors.

- Please visit Keda's [documentation page](https://keda.sh/docs/2.3/concepts/scaling-deployments/) for more information.
+ Please visit Keda's [documentation page](https://keda.sh/docs/2.3/concepts/scaling-deployments/) for more information.
