GitHub - robturner/incubator-mrql: Mirror of Apache MRQL (Incubating)

***************************************************************************

 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

***************************************************************************

Apache MRQL 0.9.4-incubating
============================

Apache MRQL (pronounced miracle) is a query processing and optimization
system for large-scale, distributed data analysis. MRQL (the MapReduce
Query Language) is an SQL-like query language for large-scale data
analysis on a cluster of computers. The MRQL query processing system
can evaluate MRQL queries in four modes:

* in Map-Reduce mode using Apache Hadoop,
* in BSP mode (Bulk Synchronous Parallel mode) using Apache Hama,
* in Spark mode using Apache Spark,
* in Flink mode using Apache Flink.

The MRQL query language is powerful enough to express most common data
analysis tasks over many forms of raw in-situ data, such as XML and
JSON documents, binary files, and CSV documents. MRQL is more powerful
than other current high-level MapReduce languages, such as Hive and
PigLatin, since it can operate on more complex data and supports more
powerful query constructs, thus eliminating the need for using
explicit MapReduce code. With MRQL, users are able to express complex
data analysis tasks, such as PageRank, k-means clustering, matrix
factorization, etc., using SQL-like queries exclusively, while the MRQL
query processing system is able to compile these queries to efficient
Java code.

General Info
============

For the latest information about MRQL, please visit our website at:

   http://mrql.incubator.apache.org/

and our wiki, at:

   http://wiki.apache.org/mrql/

Getting Started
===============

Installation instructions and a quick tutorial:

   http://wiki.apache.org/mrql/GettingStarted

To build MRQL using Maven, run 'mvn clean install'. To validate the
installation, run 'mvn -DskipTests=false clean install', which runs the
queries in 'tests/queries' in memory and in local Hadoop, Hama, Spark,
and Flink modes.
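
The two Maven invocations above can be wrapped in a small shell helper.
This is only a sketch: the function name 'mrql_build', its 'validate'
argument, and the MVN override variable are hypothetical and are not
part of the MRQL distribution. Only the two mvn command lines come from
this README.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with MRQL.
# Run it from the top-level source directory (the one containing pom.xml).
#
#   mrql_build            -> mvn clean install
#   mrql_build validate   -> mvn -DskipTests=false clean install
#
# MVN is an assumed override for the Maven executable (default: mvn).
mrql_build() {
    if [ "$1" = "validate" ]; then
        # Enable the test queries, as described in this README.
        set -- -DskipTests=false
    else
        # Plain build: no extra Maven options.
        set --
    fi
    "${MVN:-mvn}" "$@" clean install
}
```

Setting MVN=echo prints the Maven arguments instead of running the
build, which is a cheap way to check what would be executed.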

Useful mailing lists
====================

1. [email protected] - To discuss and ask usage questions. Send an
   empty email to [email protected] in order to subscribe
   to this mailing list.

2. [email protected] - For discussions about code, design and features.
   Send an empty email to [email protected] in order to
   subscribe to this mailing list.

3. [email protected] - In order to monitor commits to the source
   repository. Send an empty email to [email protected]
   in order to subscribe to this mailing list.