diff --git a/LICENSE b/LICENSE index 149adaa..da87601 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ BSD 3-Clause License -Copyright (c) 2018-2023, Łukasz Szeremeta +Copyright (c) 2018-2024, Łukasz Szeremeta All rights reserved. Redistribution and use in source and binary forms, with or without diff --git a/README.md b/README.md index 0323ca9..a76f90c 100644 --- a/README.md +++ b/README.md @@ -1,16 +1,13 @@ # YARS-PG grammar -[![Test and pre-release](https://github.com/lszeremeta/yarspg/actions/workflows/pre-release.yml/badge.svg)](https://github.com/lszeremeta/yarspg/actions/workflows/pre-release.yml) -[![DOI](https://zenodo.org/badge/161351716.svg)](https://zenodo.org/badge/latestdoi/161351716) +[![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/lszeremeta/yarspg/pre-release.yml?label=Test%20and%20pre-release)](https://github.com/lszeremeta/yarspg/actions/workflows/pre-release.yml) [![GitHub Release](https://img.shields.io/github/v/release/lszeremeta/yarspg)](https://github.com/lszeremeta/yarspg/releases/latest) [![DOI](https://zenodo.org/badge/161351716.svg)](https://zenodo.org/badge/latestdoi/161351716) -The YARS-PG serialization is a special version of [YARS](https://github.com/lszeremeta/yars) for property graphs. This serialization is specially designed for property graphs: +The YARS-PG serialization is designed for property graphs (but not only): * YARS-PG supports all the features allowed by the current database systems based on the property graph data model, * YARS-PG provides a simple syntax with a reduced number of extra characters, * YARS-PG is inspired by the syntax used by popular graph query languages (e.g. Cypher and GQL) to encode the structure of a property graph (i.e. nodes, edges, and properties). -The YARS-PG grammar is written in [ANTLR4](https://github.com/antlr/antlr4). 
If you prefer [Extended Backus-Naur Form (EBNF)](https://www.w3.org/TR/REC-xml/#sec-notation) notation, you can also see a preview version of [YARS-PG grammar in EBNF](https://github.com/lszeremeta/antlr-yarspg/blob/main/other-notations/YARSpg.ebnf). See [YARS-PG specification](https://lszeremeta.github.io/yarspg/index.html) or detailed [Serialization for Property Graphs presentation](https://www.researchgate.net/publication/340208659_Serialization_for_Property_Graphs) for more details. - -This project is based on the [ANTLR grammars-v4 project](https://github.com/antlr/grammars-v4). +The YARS-PG grammar is written in [ANTLR4](https://github.com/antlr/antlr4). If you prefer [Extended Backus-Naur Form (EBNF)](https://www.w3.org/TR/REC-xml/#sec-notation) notation, you can also see [YARS-PG grammar in EBNF](https://github.com/lszeremeta/antlr-yarspg/blob/main/other-notations/YARSpg.ebnf). You may also see YARS-PG [examples](https://github.com/lszeremeta/yarspg/tree/main/yarspg/examples). ## Parsers @@ -20,7 +17,7 @@ The YARS-PG parsers are available in the [parsers](https://github.com/lszeremeta If you want to help build YARS-PG parsers for other languages, please follow the [ANTLR4 documentation](https://github.com/antlr/antlr4/tree/dev/doc) and create a pull request. -See the ``README.md`` file in each parser directory for more details. Ready to use parsers are also available in the Assets section in [Releases](https://github.com/lszeremeta/yarspg/releases). +See the ``README.md`` file in each parser directory for more details. Ready-to-use parsers are also available in the Assets section in [Releases](https://github.com/lszeremeta/yarspg/releases). ## Testing grammar @@ -30,6 +27,8 @@ Run tests on files in ``yarspg/examples``: mvn clean test ``` +This project is based on the [ANTLR grammars-v4 project](https://github.com/antlr/grammars-v4). + ## Contribution Would you like to improve this project? Great! We are waiting for your help and suggestions. 
If you are new to open source contributions, read [How to Contribute to Open Source](https://opensource.guide/how-to-contribute/). diff --git a/docs/index.html b/docs/index.html deleted file mode 100644 index 85da3f5..0000000 --- a/docs/index.html +++ /dev/null @@ -1,995 +0,0 @@ - - - - - - YARS-PG 4.0 - - - - - -
-

- The YARS-PG serialization was designed to be simple, extensible and platform independent, and to support all the features provided by the current database systems based on the property graph data model.

-
-
-

- This is a very early draft of the YARS-PG specification, based on YARS-PG 4.0.

-
-
-

Introduction

-

This document defines YARS-PG, a serialization for property graphs.

- -

The YARS-PG serialization supports all the features allowed by the current database systems based on the property graph data model, and can be adapted to work with various visualization software, database-driven systems and other graph-oriented tools.

-
-
-

YARS-PG Language

- The YARS-PG serialization contains node declarations and edge declarations (no order is required for them).
-

Comment

-

A one-line comment allows additional information to be placed in the file; it is ignored during processing.

- -
-
-

Prefix

-

A prefix declaration associates a prefix label with an IRI.

- -
-
-

Metadata

-

A document metadata declaration associates a metadata key (IRI or QName) with a metadata value (string or IRI).

- -
-
-

Node

-

A node declaration begins with the node identifier, followed by an optional list of node labels, optional node properties, an optional list of graph names, and an optional annotation list.
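As a rough illustration of this layout, the sketch below builds a node declaration string following the `node` rule ([14]) in the grammar section. The helper names (`quote`, `format_value`, `serialize_node`) are hypothetical and not part of any YARS-PG parser; graph names and annotations are left out, and string escaping is not handled.

```python
# Hypothetical helpers; they only illustrate the shape of a node declaration.

def quote(text: str) -> str:
    """Wrap text in double quotes (assumes the text contains no '"')."""
    return f'"{text}"'


def format_value(value) -> str:
    """Render a few primitive values; only simple ints and strings are safe here."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    return quote(str(value))


def serialize_node(node_id: str, labels=(), props=None) -> str:
    """Build '<id>{labels}[props]' with the optional parts omitted when empty."""
    parts = [f"<{quote(node_id)}>"]
    if labels:
        parts.append("{" + ",".join(quote(label) for label in labels) + "}")
    if props:
        pairs = (f"{quote(key)}:{format_value(val)}" for key, val in props.items())
        parts.append("[" + ",".join(pairs) + "]")
    return "".join(parts)


print(serialize_node("n1", ["Person"], {"name": "Ann", "age": 30}))
```

The optional parts are emitted in the same order as in the grammar rule, so a node with no labels or properties reduces to the bare identifier in angle brackets.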

- -
-
-

Edge

-

An edge declaration begins with the source node identifier, followed by an optional relationship identifier, a label, an optional set of properties, the target node identifier, an optional list of graph names, and an optional annotation list. An edge can be directed (->) or undirected (-).

- -
-
-

Property

- A property is represented as a pair p:v, where p is the property label and v is the property value. A property value can be atomic (e.g. String, Integer, Float, Null, Boolean) or complex (e.g. a list of atomic values).

- -
-
-

Annotation

- An annotation is represented as a pair k:v, where k is the key (a string, QName or IRI) and v is the value (a QName or IRI when the key is a QName or IRI, and a string when the key is a string).

- -
-
-

Named Property Graph

-

A Named Property Graph declaration is represented as a list of graph names.

-

Graph names can be added to node, edge and node schema declarations.

- -
-
-
-

YARS-PG Schema Language

-

A YARS-PG schema may be used to determine the expected structure of a YARS-PG document.

-
-

Node schema

-

A node schema declaration begins with an optional list of node labels, followed by optional node properties with their value types, an optional list of graph names, and an optional annotation list.

- -
-
-

Edge schema

-

An edge schema declaration begins with the source node label, followed by an edge label, an optional set of properties, and the target node label. An edge schema can be directed (->) or undirected (-).

- -
-
-

Datatypes

-

YARS-PG supports several primitive and complex datatypes, including String, Integer, Date and List. Datatypes can be used to declare the type of the expected value in schemas.

-
-

Primitive datatypes

-

The following primitive datatypes are supported: Decimal, SmallInt, Integer, BigInt, Float, Real, Double, Bool, Null, String, Date, Time and DateTime.

- - - - -
-
-

Complex datatypes

-

The following complex datatypes are supported: Set, List and Struct.

-

Set disallows duplicate elements, and the order of retrieval is not significant. List allows duplicate elements, and the order of retrieval is significant. Struct is a collection of name/value pairs.
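The difference between the three complex datatypes can be illustrated with ordinary Python containers. The mapping used here (frozenset, list, dict) is only an assumption made for the sake of the example; the specification does not prescribe any in-memory representation.

```python
# Assumed, illustrative mapping of YARS-PG complex values onto Python containers.

# Set: duplicates collapse and order does not matter.
set_a = frozenset(["red", "green", "red"])
set_b = frozenset(["green", "red"])

# List: duplicates are kept and order matters.
list_a = ["red", "green", "red"]
list_b = ["green", "red", "red"]

# Struct: a collection of name/value pairs.
struct_a = {"city": "Bialystok", "country": "Poland"}

print(set_a == set_b)    # equal: same elements, order ignored
print(list_a == list_b)  # not equal: same elements, different order
print(struct_a["city"])
```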

- - - - - -
-
-
-
-

Canonical form

-

A canonical version of YARS-PG must meet the list of conditions specified below.

- - - - -
-

Canonicalization

-

The following algorithm can be used to convert YARS-PG into a canonical form:

-
    -
  1. Remove all comments
  2. Transform prefixes into full IRIs
  3. Remove prefix declarations
  4. Roll up multi-line declarations into one-line declarations
  5. Reorder declarations by section, in this order: metadata, node schemas, edge schemas, nodes, edges
  6. Remove all empty lines containing only LF or CR
  7. Remove all whitespace (spaces U+0020 or tabs U+0009) between serialization elements
-
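A partial sketch of the canonicalization algorithm is given below. It covers only three of the listed steps: removing comments (whole-line comments only), removing empty lines, and removing whitespace between serialization elements. Prefix expansion, rolling up multi-line declarations, and section reordering are deliberately left out. The function names are hypothetical, and the handling of backslash-escaped quotes inside strings is an assumption rather than something the text above states.

```python
def strip_outside_strings(line: str) -> str:
    """Drop spaces and tabs that are not inside a double-quoted string."""
    out = []
    in_string = False
    escaped = False
    for ch in line:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":      # assumed escape character
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch not in " \t":
            out.append(ch)
    return "".join(out)


def canonicalize_partial(text: str) -> str:
    """Apply comment removal, empty-line removal and whitespace removal only."""
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # empty line or whole-line comment
        if stripped.startswith("%"):
            # Section names such as "NODE SCHEMAS" contain a space, so keep them verbatim.
            result.append(stripped)
            continue
        result.append(strip_outside_strings(stripped))
    return "\n".join(result)


sample = '''# people
%NODES
<"n1">{"Person"} [ "name" : "Ann Lee" ]

%EDGES
( "n1" ) - {"knows"} -> ( "n2" )
'''
print(canonicalize_partial(sample))
```

Whitespace inside string literals is preserved because it is part of the value, not a separator between serialization elements.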
-
- -
-

YARS-PG Grammar

-

The grammar of YARS-PG is written in [[ANTLR4]]. We have also prepared a preview of the grammar in [[EBNF-NOTATION]].

- -

A preview version of YARS-PG grammar in [[EBNF-NOTATION]] is presented below.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[1] yarspg ::= statement*
[2] statement ::= node | edge | prefix_directive | metadata | node_schema | edge_schema | section
[3] prefix_directive ::= pname IRI
[4] pname ::= ":" ALNUM_PLUS ":"
[5] pn_local ::= ALNUM_PLUS
[6] metadata ::= "-" ((pn_local pname) | (IRI ":")) (STRING | IRI)
[7] graph_name ::= STRING
[8] annotation ::= string_annotation | rdf_annotation
[9] string_annotation ::= STRING ":" STRING
[10] rdf_annotation ::= ((pn_local pname) | (IRI ":")) (STRING | IRI)
[11] annotations_list ::= "+" annotation ("," annotation)*
[12] props_list ::= "[" prop ("," prop)* "]"
[13] graphs_list ::= "/" graph_name ("," graph_name)* "/"
[14] node ::= "<" node_id ">" ("{" node_label ("," node_label)* "}")? props_list? graphs_list? annotations_list?
[15] edge ::= directed | undirected
[16] section ::= "%" SECTION_NAME
[17] directed ::= "(" node_id ")" "-" ("<" edge_id ">")? "{" edge_label "}" props_list? "->" "(" node_id ")" graphs_list? annotations_list?
[18] undirected ::= "(" node_id ")" "-" ("<" edge_id ">")? "{" edge_label "}" props_list? "-" "(" node_id ")" graphs_list? annotations_list?
[19] node_id ::= STRING
[20] node_label ::= STRING
[21] prop ::= key ":" value
[22] edge_id ::= STRING
[23] edge_label ::= STRING
[24] key ::= STRING
[25] value ::= primitive_value | complex_value
[26] primitive_value ::= STRING | DATETYPE | NUMBER | BOOL | "null"
[27] complex_value ::= set | list | struct
[28] set ::= "{" (primitive_value | set) ("," (primitive_value | set))* "}"
[29] list ::= "[" (primitive_value | list) ("," (primitive_value | list))* "]"
[30] struct ::= "{" key ":" (primitive_value | struct) ("," key ":" (primitive_value | struct))* "}"
[31] node_schema ::= "S" ("{" node_label ("," node_label)* "}")? props_list_schema? graphs_list? annotations_list?
[32] props_list_schema ::= "[" prop_schema ("," prop_schema)* "]"
[33] prop_schema ::= key ":" value_schema
[34] value_schema ::= primitive_value_schema | complex_value_schema
[35] primitive_value_schema ::= "Decimal" | "SmallInt" | "Integer" | "BigInt" | "Float" | "Real" | "Double" | "Bool" | "Null" | "String" | "Date" | "DateTime" | "Time"
[36] complex_value_schema ::= set_schema | list_schema | struct_schema
[37] set_schema ::= "Set" "(" (primitive_value_schema | set_schema) ")"
[38] list_schema ::= "List" "(" (primitive_value_schema | list_schema) ")"
[39] struct_schema ::= "Struct" "(" (primitive_value_schema | struct_schema) ")"
[40] edge_schema ::= directed_schema | undirected_schema
[41] directed_schema ::= "S" ("(" node_label ")")? "-" "{" edge_label "}" props_list_schema? "->" ("(" node_label ")")?
[42] undirected_schema ::= "S" ("(" node_label ")")? "-" "{" edge_label "}" props_list_schema? "-" ("(" node_label ")")?
[43] SECTION_NAME ::= "METADATA" | "NODE SCHEMAS" | "EDGE SCHEMAS" | "NODES" | "EDGES"
[44] COMMENT ::= "#" ([^#xd#xa#xc])*
[45] STRING ::= STRING_LITERAL_QUOTE
[46] NUMBER ::= SIGN? ([0-9])+ "."? ([0-9])*
[47] BOOL ::= "true" | "false"
[53] DATETYPE ::= DATETIME | DATE | TIME
[54] DATE ::= [0-9] [0-9] [0-9] [0-9] "-" [0-9] [0-9] "-" [0-9] [0-9]
[55] TIME ::= [0-9] [0-9] ":" [0-9] [0-9] ":" [0-9] [0-9] TIMEZONE?
[56] TIMEZONE ::= SIGN? [0-9] [0-9] ":" [0-9] [0-9]
[57] DATETIME ::= DATE "T" TIME
[58] SIGN ::= "+" | "-"
[48] STRING_LITERAL_QUOTE ::= '"' ([^"#xd#xa] | "'" | '"')* '"'
[49] ALNUM_PLUS ::= PN_CHARS_BASE ((PN_CHARS | ".")* PN_CHARS)?
[50] IRI ::= "<" (PN_CHARS | "." | ":" | "/" | "\" | "#" | "@" | "%" | "&" | UCHAR)* ">"
[51] PN_CHARS ::= PN_CHARS_U | [-0-9#xB7#x0300-#x036F#x203F-#x2040]
[52] PN_CHARS_U ::= PN_CHARS_BASE | "_"
[59] UCHAR ::= ("u" | "U" HEX HEX HEX HEX) HEX HEX HEX HEX
[60] PN_CHARS_BASE ::= [A-Za-z0-9#xC0-#xD6#xD8-#xF6#xF8-#x2FF#x370-#x37D#x37F-#x1FFF#x200C-#x200D#x2070-#x218F#x2C00-#x2FEF#x3001-#xD7FF#xF900-#xFDCF#xFDF0-#xFFFD]
[61] HEX ::= [0-9A-Fa-f]
[62] WS ::= ([#x20#x9#xa])+
-
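To make the lexical rules for dates, times, numbers and booleans more concrete, the sketch below translates rules [46], [47] and [54] to [58] into Python regular expressions. It is an illustration only, not the ANTLR4 lexer: the `classify` helper is hypothetical, and it tests a single lexeme in isolation instead of tokenizing a whole document.

```python
import re

# Regex translations of the lexer rules listed in the grammar table.
SIGN = r"[+-]"
DATE = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
TIMEZONE = SIGN + r"?[0-9]{2}:[0-9]{2}"
TIME = r"[0-9]{2}:[0-9]{2}:[0-9]{2}(?:" + TIMEZONE + r")?"
DATETIME = DATE + "T" + TIME
NUMBER = SIGN + r"?[0-9]+\.?[0-9]*"
BOOL = r"true|false"

# Patterns are tried in this order; the first full match wins.
TOKEN_PATTERNS = [
    ("DATETIME", re.compile(DATETIME)),
    ("DATE", re.compile(DATE)),
    ("TIME", re.compile(TIME)),
    ("NUMBER", re.compile(NUMBER)),
    ("BOOL", re.compile(BOOL)),
]


def classify(lexeme: str):
    """Return the name of the first rule that matches the whole lexeme, or None."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(lexeme):
            return name
    return None


for text in ["2024-01-15T10:30:00+01:00", "2024-01-15", "10:30:00", "-3.14", "true", "abc"]:
    print(text, classify(text))
```

Because the rules are purely lexical, a value such as `2024-13-45` still matches DATE; range checks on months and days would have to happen after tokenization.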
-
-

Parsers

-

Parsers for different languages for the current grammar version will be presented later.

-
- - - diff --git a/other-notations/YARSpg.ebnf b/other-notations/YARSpg.ebnf index 096b96d..6031a77 100644 --- a/other-notations/YARSpg.ebnf +++ b/other-notations/YARSpg.ebnf @@ -1,14 +1,12 @@ /* YARS-PG grammar in Extended Backus-Naur Form (EBNF) notation - (preview version) Based on YARS-PG grammar in ANTLR4 See more at: https://github.com/lszeremeta/antlr-yarspg [The "BSD licence"] - Copyright (c) 2018-2021, Łukasz Szeremeta (@ University of Bialystok, https://github.com/lszeremeta) - Copyright (c) 2018-2019, Dominik Tomaszuk (@ University of Bialystok, http://www.uwb.edu.pl/) + Copyright (c) 2018-2024, Łukasz Szeremeta (@ University of Bialystok, https://github.com/lszeremeta) All rights reserved. Special thanks to Gregg Kellogg (greggkellogg.net) for valuable diff --git a/yarspg/YARSpg.g4 b/yarspg/YARSpg.g4 index 7633af6..ab1ffd4 100644 --- a/yarspg/YARSpg.g4 +++ b/yarspg/YARSpg.g4 @@ -1,14 +1,8 @@ /* [The "BSD licence"] - Copyright (c) 2018-2021, Łukasz Szeremeta (@ University of Bialystok, https://github.com/lszeremeta) - Copyright (c) 2018, Dominik Tomaszuk (@ University of Bialystok, http://www.uwb.edu.pl/) - Copyright (c) 2018, Karol Litman (@ University of Bialystok, http://www.uwb.edu.pl/) + Copyright (c) 2018-2024, Łukasz Szeremeta (@ University of Bialystok, https://github.com/lszeremeta) All rights reserved. - Based on YARS grammar - (https://github.com/lszeremeta/antlr-yars/blob/master/yars/YARS.g4) - distributed under BSD licence. - Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: