JSON parsing support #140
Comments
Excellent work so far! Quick comments: Does JSON support Why wrap in
|
I actually started out with Since JSON is based on JavaScript, the only difference between int/double is whether there is a decimal point. In JavaScript these are both just the same "Number" type.
{ "int": 1 }
// vs
{ "float": 1.0 }
In other strongly typed languages it's common to provide the distinction, because if you only offered the "AsDouble" function, the developer would have to constantly parse/round the doubles to ints everywhere, which is annoying and requires two rounds of parsing.
Regarding JsonObject / JsonArray, I did this because most other JSON libraries do it this way, so we will have a better mapping between our types and the underlying languages - but also because we may eventually want to provide additional properties which only exist on the JsonArray / JsonObject classes. However, I don't mind either way. If you prefer we drop those derived classes it will be fine by me. |
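To make the int/double point concrete, here is a small Python sketch (Python being one of the targets discussed in this issue). Python's built-in json module already keeps the distinction, so an AsInt/AsDouble pair could map onto it directly. The helper name describe_number is made up for illustration only.

```python
import json


def describe_number(text: str) -> str:
    """Parse a JSON number and report which Python type the parser chose."""
    value = json.loads(text)
    # bool is a subclass of int in Python, so rule it out before the number check.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("not a JSON number")
    return type(value).__name__


print(describe_number("1"))    # int
print(describe_number("1.0"))  # float
```

In a target like this, the caller never has to round a double back to an int; the parser has already made that decision from the presence of a fraction or exponent.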
I have updated the issue with the remaining languages and incorporated your feedback about removing the derived classes. I have kept the Int/double representations, because all of our underlying implementations support this except for swift. I am not really sure how to go about implementing this (as I'm currently just struggling to add a ToLower/ToUpper function...) but if you have some suggestions on where I should start I could try. |
See the above commit. It defines the API in I chose naming closer to .NET's. I don't have a strong opinion on The definitions in |
In the event that it's helpful, I also took a stab at writing a JSON parser in Fusion. I wasn't sure how hard it would be or if it would be desirable over relying on each language's built-in parser.
public enum JsonNodeType
{
Null,
Bool,
Array,
Object,
Number,
String
}
enum JsonToken
{
None,
CurlyOpen,
CurlyClose,
SquareOpen,
SquareClose,
Colon,
Comma,
String,
Number,
Bool,
Null
}
public class JsonParseException : Exception
{ }
public class JsonNode
{
JsonNodeType Type = JsonNodeType.Null;
Dictionary<string(), JsonNode#>() ObjectValue;
List<JsonNode#>() ArrayValue;
string() StringValue;
double NumberValue;
bool BoolValue;
/// Get the type of this node, such as string, object, array, etc.
/// You should use this function and then call the corresponding
/// AsObject, AsArray, AsString, etc. functions to get the actual
/// parsed json information.
public JsonNodeType GetType()
{
return Type;
}
/// Check if the JSON value is null.
public bool IsNull()
{
return Type == JsonNodeType.Null;
}
/// Reinterpret a JSON value as an object. Throws exception if the value type was not an object.
public Dictionary<string(), JsonNode#> AsObject() throws Exception
{
if (Type != JsonNodeType.Object)
{
throw Exception("Cannot call AsObject on JsonNode which is not an object.");
}
return ObjectValue;
}
/// Reinterpret a JSON value as an array. Throws exception if the value type was not an array.
public List<JsonNode#> AsArray() throws Exception
{
if (Type != JsonNodeType.Array)
{
throw Exception("Cannot call AsArray on JsonNode which is not an array.");
}
return ArrayValue;
}
/// Reinterpret a JSON value as a number. Throws exception if the value type was not a number.
public double AsNumber() throws Exception
{
if (Type != JsonNodeType.Number)
{
throw Exception("Cannot call AsNumber on JsonNode which is not a number.");
}
return NumberValue;
}
/// Reinterpret a JSON value as a boolean. Throws exception if the value type was not a boolean.
public bool AsBool() throws Exception
{
if (Type != JsonNodeType.Bool)
{
throw Exception("Cannot call AsBool on JsonNode which is not a boolean.");
}
return BoolValue;
}
/// Reinterpret a JSON value as a string. Throws exception if the value type was not a string.
public string AsString() throws Exception
{
if (Type != JsonNodeType.String)
{
throw Exception("Cannot call AsString on JsonNode which is not a string.");
}
return StringValue;
}
public static JsonNode# Parse(string text) throws Exception, JsonParseException
{
JsonParser# parser = new JsonParser();
parser.Load(text);
return parser.ParseValue();
}
internal void InitBool!(bool value) throws JsonParseException
{
if (Type != JsonNodeType.Null)
{
throw JsonParseException("Cannot call InitBool on JsonNode which is not null.");
}
Type = JsonNodeType.Bool;
BoolValue = value;
}
internal void InitArray!() throws JsonParseException
{
if (Type != JsonNodeType.Null)
{
throw JsonParseException("Cannot call InitArray on JsonNode which is not null.");
}
Type = JsonNodeType.Array;
}
internal void AddArrayChild!(JsonNode# child) throws JsonParseException
{
if (Type != JsonNodeType.Array)
{
throw JsonParseException("Cannot call AddArrayChild on JsonNode which is not an array.");
}
ArrayValue.Add(child);
}
internal void InitObject!() throws JsonParseException
{
if (Type != JsonNodeType.Null)
{
throw JsonParseException("Cannot call InitObject on JsonNode which is not null.");
}
Type = JsonNodeType.Object;
}
internal void AddObjectChild!(string key, JsonNode# child) throws JsonParseException
{
if (Type != JsonNodeType.Object)
{
throw JsonParseException("Cannot call AddObjectChild on JsonNode which is not an object.");
}
ObjectValue[key] = child;
}
internal void InitNumber!(double value) throws JsonParseException
{
if (Type != JsonNodeType.Null)
{
throw JsonParseException("Cannot call InitNumber on JsonNode which is not null.");
}
Type = JsonNodeType.Number;
NumberValue = value;
}
internal void InitString!(string value) throws JsonParseException
{
if (Type != JsonNodeType.Null)
{
throw JsonParseException("Cannot call InitString on JsonNode which is not null.");
}
Type = JsonNodeType.String;
StringValue = value;
}
}
class StringAppendable
{
StringWriter() builder;
TextWriter! writer;
bool initialised;
public void Clear!()
{
builder.Clear();
}
public void WriteChar!(int c)
{
if (!initialised)
{
writer = builder;
initialised = true;
}
writer.WriteChar(c);
}
public string() ToString()
{
return builder.ToString();
}
}
// https://github.com/gering/Tiny-JSON/blob/master/Tiny-JSON/Tiny-JSON/JsonParser.cs
class JsonParser
{
string() text = "";
int position = 0;
StringAppendable() builder;
public void Load!(string text)
{
this.text = text;
this.position = 0;
}
public bool EndReached()
{
return position >= text.Length;
}
public string() ReadN!(int n) throws JsonParseException
{
if (position + n > text.Length)
{
throw JsonParseException("Unexpected end of input");
}
string() result = text.Substring(position, n);
position += n;
return result;
}
public int Read!()
{
if (position >= text.Length)
{
return -1;
}
int c = text[position];
position++;
return c;
}
public int Peek()
{
if (position >= text.Length)
{
return -1;
}
return text[position];
}
public bool PeekWhitespace()
{
int c = Peek();
return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}
public bool PeekWordbreak()
{
int c = Peek();
return c == ' ' || c == ',' || c == ':' || c == '\"' || c == '{' || c == '}'
|| c == '[' || c == ']' || c == '\t' || c == '\n' || c == '\r' || c == '/';
}
JsonToken PeekToken!() {
EatWhitespace();
if (EndReached()) return JsonToken.None;
switch (Peek()) {
case '{':
return JsonToken.CurlyOpen;
case '}':
return JsonToken.CurlyClose;
case '[':
return JsonToken.SquareOpen;
case ']':
return JsonToken.SquareClose;
case ',':
return JsonToken.Comma;
case '"':
return JsonToken.String;
case ':':
return JsonToken.Colon;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
case '-':
return JsonToken.Number;
case 't':
case 'f':
return JsonToken.Bool;
case 'n':
return JsonToken.Null;
case '/':
// ignore / skip past line and blockcomments
Read(); // skip the first /
if (Peek() == '/') { // line comment, read to next \n
while (!EndReached() && Peek() != '\n') {
Read();
}
return PeekToken();
} else if (Peek() == '*') { // block comment, read to */
Read(); // skip the *
while (!EndReached()) {
if (Read() == '*' && Peek() == '/') {
Read(); // skip the /
return PeekToken();
}
}
}
return JsonToken.None;
default:
return JsonToken.None;
}
}
public void EatWhitespace!()
{
while (!EndReached() && PeekWhitespace())
{
Read();
}
}
public string() ReadWord!()
{
builder.Clear();
while (!EndReached() && !PeekWordbreak())
{
builder.WriteChar(Read());
}
return builder.ToString();
}
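public JsonNode# ParseNull!() throws JsonParseException
{
// Reject words such as "nil" or "nope" that merely start with 'n'.
if (ReadWord() != "null") {
throw JsonParseException("Invalid null");
}
JsonNode# node = new JsonNode();
return node;
}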
public JsonNode# ParseBool!() throws JsonParseException
{
string() boolValue = ReadWord();
if (boolValue == "true") {
JsonNode# node = new JsonNode();
node.InitBool(true);
return node;
} else if (boolValue == "false") {
JsonNode# node = new JsonNode();
node.InitBool(false);
return node;
} else {
throw JsonParseException("Invalid boolean");
}
}
public JsonNode# ParseNumber!() throws JsonParseException
{
double d;
if (d.TryParse(ReadWord())) {
JsonNode# node = new JsonNode();
node.InitNumber(d);
return node;
}
throw JsonParseException("Invalid number");
}
public JsonNode# ParseString!() throws JsonParseException
{
builder.Clear();
Read(); // ditch opening quote
while (true) {
if (EndReached()) {
throw JsonParseException("Unterminated string");
}
int c = Read();
switch (c) {
case '"':
JsonNode# node = new JsonNode();
node.InitString(builder.ToString());
return node;
case '\\':
if (EndReached()) {
throw JsonParseException("Unterminated string");
}
c = Read();
switch (c) {
case '"':
case '\\':
case '/':
builder.WriteChar(c);
break;
case 'b':
builder.WriteChar(0x0008); // backspace
break;
case 'f':
builder.WriteChar(0x000C); // form feed
break;
case 'n':
builder.WriteChar('\n');
break;
case 'r':
builder.WriteChar('\r');
break;
case 't':
builder.WriteChar('\t');
break;
case 'u':
int i;
if (i.TryParse(ReadN(4), 16)) {
builder.WriteChar(i);
} else {
throw JsonParseException("Invalid unicode escape");
}
break;
}
break;
default:
builder.WriteChar(c);
break;
}
}
}
public JsonNode# ParseObject!() throws Exception, JsonParseException
{
Read(); // ditch opening brace
JsonNode# node = new JsonNode();
node.InitObject();
while (true) {
switch (PeekToken()) {
case JsonToken.None:
throw JsonParseException("Unterminated object");
case JsonToken.Comma:
Read(); // ditch comma
continue;
case JsonToken.CurlyClose:
Read(); // ditch closing brace
return node;
default:
JsonNode# name = ParseString();
if (PeekToken() != JsonToken.Colon) throw JsonParseException("Expected colon");
Read(); // ditch the colon
node.AddObjectChild(name.AsString(), ParseValue());
break;
}
}
}
public JsonNode# ParseArray!() throws Exception, JsonParseException
{
Read(); // ditch opening bracket
JsonNode# node = new JsonNode();
node.InitArray();
bool expectComma = false;
while (true) {
switch (PeekToken()) {
case JsonToken.None:
throw JsonParseException("Unterminated array");
case JsonToken.Comma:
if (!expectComma) {
throw JsonParseException("Unexpected comma in array");
}
expectComma = false;
Read(); // ditch comma
continue;
case JsonToken.SquareClose:
Read(); // ditch closing bracket
return node;
default:
if (expectComma) {
throw JsonParseException("Expected comma");
}
expectComma = true;
node.AddArrayChild(ParseValue());
break;
}
}
}
public JsonNode# ParseValue!() throws Exception, JsonParseException
{
switch (PeekToken()) {
case JsonToken.String:
return ParseString();
case JsonToken.Number:
return ParseNumber();
case JsonToken.Bool:
return ParseBool();
case JsonToken.Null:
return ParseNull();
case JsonToken.CurlyOpen:
return ParseObject();
case JsonToken.SquareOpen:
return ParseArray();
default:
throw JsonParseException("Invalid token");
}
}
} |
I don't mind. The only C/C++ parsers I have experience with are nlohmann and sheredom, but as long as it gets the job done. What are your thoughts in general about whether we should write stdlib libraries in Fusion (like above) vs. relying on language-specific ones? |
What Fusion emits needs to feel like it was written by an experienced programmer directly in the target language. |
Turns out JSON parsing is not that hard. Here's how I would do it:
public abstract class JsonElement
{
public virtual bool IsObject() => false;
public virtual bool IsArray() => false;
public virtual bool IsString() => false;
public virtual bool IsNumber() => false;
public virtual bool IsBoolean() => false;
public virtual bool IsTrue() => false;
public virtual bool IsNull() => false;
public virtual Dictionary<string(), JsonElement#> GetObject()
{
assert false;
}
public virtual List<JsonElement#> GetArray()
{
assert false;
}
public virtual string GetString()
{
assert false;
}
public virtual double GetNumber()
{
assert false;
}
public static JsonElement#? TryParse(string s)
{
JsonParser() parser;
return parser.TryParse(s);
}
}
class JsonObject : JsonElement
{
internal Dictionary<string(), JsonElement#>() Value;
public override bool IsObject() => true;
public override Dictionary<string(), JsonElement#> GetObject() => Value;
}
class JsonArray : JsonElement
{
internal List<JsonElement#>() Value;
public override bool IsArray() => true;
public override List<JsonElement#> GetArray() => Value;
}
class JsonString : JsonElement
{
internal string() Value;
public override bool IsString() => true;
public override string GetString() => Value;
}
class JsonNumber : JsonElement
{
internal double Value;
public override bool IsNumber() => true;
public override double GetNumber() => Value;
}
class JsonBoolean : JsonElement
{
public override bool IsBoolean() => true;
}
class JsonTrue : JsonBoolean
{
public override bool IsTrue() => true;
}
class JsonNull : JsonElement
{
public override bool IsNull() => true;
}
class JsonParser
{
string Input;
int Offset;
int InputLength;
bool SkipWhitespace!()
{
while (Offset < InputLength) {
switch (Input[Offset]) {
case '\t':
case '\n':
case '\r':
case ' ':
break;
default:
return true;
}
Offset++;
}
return false;
}
JsonObject#? ParseObject!()
{
Offset++;
if (!SkipWhitespace())
return null;
JsonObject# result = new JsonObject();
if (Input[Offset] == '}') {
Offset++;
return result;
}
while (Input[Offset] == '"') {
JsonString#? key = ParseString();
if (key == null || !SkipWhitespace() || Input[Offset] != ':')
return null;
Offset++;
JsonElement#? value = ParseWhitespaceAndElement();
if (value == null || !SkipWhitespace())
return null;
// Store the member before looking at the delimiter, so the last one before '}' is kept.
result.Value[key.Value] = value;
switch (Input[Offset]) {
case ',':
break;
case '}':
Offset++;
return result;
default:
return null;
}
Offset++;
if (!SkipWhitespace())
return null;
}
return null;
}
JsonArray#? ParseArray!()
{
Offset++;
if (!SkipWhitespace())
return null;
JsonArray# result = new JsonArray();
if (Input[Offset] == ']') {
Offset++;
return result;
}
for (;;) {
JsonElement#? element = ParseElement();
if (element == null || !SkipWhitespace())
return null;
// Add the element before looking at the delimiter, so the last one before ']' is kept.
result.Value.Add(element);
switch (Input[Offset]) {
case ',':
break;
case ']':
Offset++;
return result;
default:
return null;
}
Offset++;
if (!SkipWhitespace())
return null;
}
}
JsonString#? ParseString!()
{
Offset++;
StringWriter() result;
int startOffset = Offset; // first character after the opening quote
while (Offset < InputLength) {
switch (Input[Offset]) {
case 0:
case 1:
case 2:
case 3:
case 4:
case 5:
case 6:
case 7:
case 8:
case 9:
case 10:
case 11:
case 12:
case 13:
case 14:
case 15:
case 16:
case 17:
case 18:
case 19:
case 20:
case 21:
case 22:
case 23:
case 24:
case 25:
case 26:
case 27:
case 28:
case 29:
case 30:
case 31:
return null;
case '"':
result.Write(Input.Substring(startOffset, Offset++ - startOffset));
return new JsonString { Value = result.ToString() };
case '\\':
result.Write(Input.Substring(startOffset, Offset++ - startOffset));
if (Offset >= InputLength)
return null;
switch (Input[Offset]) {
case '"':
case '\\':
case '/':
startOffset = Offset++;
continue;
case 'b':
result.WriteChar(8);
break;
case 'f':
result.WriteChar(12);
break;
case 'n':
result.WriteChar('\n');
break;
case 'r':
result.WriteChar('\r');
break;
case 't':
result.WriteChar('\t');
break;
case 'u':
if (Offset + 5 >= InputLength)
return null;
int c;
if (!c.TryParse(Input.Substring(Offset + 1, 4), 16))
return null;
result.WriteCodePoint(c);
Offset += 4;
break;
default:
return null;
}
startOffset = ++Offset;
break;
default:
Offset++;
break;
}
}
return null;
}
bool SeeDigit() => Offset < InputLength && Input[Offset] >= '0' && Input[Offset] <= '9';
void ParseDigits!()
{
while (SeeDigit())
Offset++;
}
JsonNumber#? ParseNumber!()
{
int startOffset = Offset;
if (Input[Offset] == '-')
Offset++;
if (!SeeDigit())
return null;
if (Input[Offset++] > '0')
ParseDigits();
if (Offset < InputLength && Input[Offset] == '.') {
Offset++;
if (!SeeDigit())
return null;
ParseDigits();
}
if (Offset < InputLength && (Input[Offset] | 0x20) == 'e') {
if (++Offset < InputLength && (Input[Offset] == '+' || Input[Offset] == '-'))
Offset++;
if (!SeeDigit())
return null;
ParseDigits();
}
double d;
if (!d.TryParse(Input.Substring(startOffset, Offset - startOffset)))
return null;
return new JsonNumber { Value = d };
}
bool ParseKeyword!(string s)
{
foreach (int c in s) {
if (++Offset >= InputLength || Input[Offset] != c)
return false;
}
Offset++;
return true;
}
JsonElement#? ParseElement!()
{
switch (Input[Offset]) {
case '{':
return ParseObject();
case '[':
return ParseArray();
case '"':
return ParseString();
case '-':
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
return ParseNumber();
case 't':
return ParseKeyword("rue") ? new JsonTrue() : null;
case 'f':
return ParseKeyword("alse") ? new JsonBoolean() : null;
case 'n':
return ParseKeyword("ull") ? new JsonNull() : null;
default:
return null;
}
}
JsonElement#? ParseWhitespaceAndElement!() => SkipWhitespace() ? ParseElement() : null;
internal JsonElement#? TryParse!(string s)
{
Input = s;
Offset = 0;
InputLength = s.Length;
JsonElement#? result = ParseWhitespaceAndElement();
return SkipWhitespace() ? null : result;
}
}
(NOT TESTED YET!) I will change the API to the above, and probably emit this pure-Fusion implementation for C and C++ instead of adding an extra dependency and converting the results. |
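Since the parser above is marked as not tested yet, one cheap way to gain confidence in a single piece of it is to port that piece to a scripting language and poke at it. The following Python sketch mirrors the steps of ParseNumber only (optional minus, a lone zero or a run of digits, optional fraction, optional exponent). The name scan_number and the convention of returning the end offset, or -1 on failure, are assumptions made for this sketch and are not part of the Fusion code.

```python
def scan_number(s: str, offset: int = 0) -> int:
    """Return the offset just past a JSON number starting at `offset`, or -1."""
    n = len(s)

    def see_digit(i: int) -> bool:
        return i < n and "0" <= s[i] <= "9"

    def skip_digits(i: int) -> int:
        while see_digit(i):
            i += 1
        return i

    i = offset
    if i < n and s[i] == "-":
        i += 1
    if not see_digit(i):
        return -1
    # A leading zero must stand alone; any other digit may be followed by more digits.
    if s[i] > "0":
        i = skip_digits(i + 1)
    else:
        i += 1
    if i < n and s[i] == ".":
        i += 1
        if not see_digit(i):
            return -1  # "1." is not valid JSON
        i = skip_digits(i)
    if i < n and s[i] in "eE":
        i += 1
        if i < n and s[i] in "+-":
            i += 1
        if not see_digit(i):
            return -1  # "1e" and "1e+" are not valid JSON
        i = skip_digits(i)
    return i
```

Note that for an input like "01" the scanner stops after the "0"; it is then up to the caller to reject the leftover "1", which is what the check for trailing input in TryParse is for.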
I think it is a good idea if we can start expanding the standard library like this with pure Fusion code, rather than what we have via string building. Ideally it would be very easy to add new functionality written in Fusion - maybe a new folder full of Fusion code which, if referenced, automatically gets transpiled in. |
Current status:
Open topics:
Fusion started as a language for operating on raw bytes, much like assembly language. Strings, collections and regexes are all later additions. Can you give some examples on what could land in the pure-Fusion standard library? Perhaps it's better done as packages than a standard library? |
Re. packages that could be a good idea, but in this case, there needs to be a way for people to depend on / download / and update packages. This seems like a lot more work. |
I tried JsonElement in master. I guess the pure-Fusion version is not automatically emitted for C++ yet. Is this planned, or is it expected that I copy that file in when compiling for C++ only? |
Pure-Fusion implementations are not yet built into I'm going to make a release of |
Following the pattern of IntTryParse, I propose a new JsonNode class containing a TryParse(string) method. The instance/non-static members will have information about the type of node that was parsed, and any children.
I do not propose adding JSON writing at this time. That is usually more useful when you have support for serializing/deserializing classes. For manual JSON writing, I think string interpolation will provide an acceptable, if less-than-ideal, solution.
I also suggest we allow conversions to more specific derived types.
I have followed the sheredom/json.h API as a baseline so that we can fully support C/C++. For example:
Some of our underlying JSON APIs provide a "parse or throw" function, and some provide a "try parse or null" function, which is why I propose we expose both Parse and TryParse functions. In the case where the underlying implementation throws, our Parse method will call it directly and our TryParse method will wrap it in exception handling. In the case where the underlying implementation doesn't throw but provides an error message, we will directly call it with TryParse but check for the presence of the error message and explicitly throw in Parse - thereby providing a consistent API across every underlying implementation.
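As a rough illustration of the two wrapping directions described above, here is a Python sketch. Case 1 uses the real json.loads, which throws. Case 2 uses backend_parse, a made-up stand-in for a backend that reports an error message instead of throwing (in the spirit of the error field mentioned for C/C++). JsonParseError and all function names here are hypothetical, and try_parse returns a (success, value) pair so that a successfully parsed JSON null is not confused with failure.

```python
import json
from typing import Any, Optional, Tuple


class JsonParseError(Exception):
    """Hypothetical error type used only in this sketch."""


# Case 1: the backend throws (json.loads raises json.JSONDecodeError).
def parse(text: str) -> Any:
    """Parse or throw: translate the backend's exception into our own type."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise JsonParseError(str(exc)) from exc


def try_parse(text: str) -> Tuple[bool, Any]:
    """Return (True, value) on success and (False, None) on failure."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError:
        return False, None


# Case 2: a stand-in backend that returns (value, error_message) and never throws.
def backend_parse(text: str) -> Tuple[Any, Optional[str]]:
    ok, value = try_parse(text)
    return (value, None) if ok else (None, "syntax error")


def parse_via_error_message(text: str) -> Any:
    """Parse or throw, built on top of a backend that only reports messages."""
    value, error = backend_parse(text)
    if error is not None:
        raise JsonParseError(error)
    return value
```

Either way the caller sees the same pair of entry points, regardless of how the underlying library signals failure.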
For C/C++
We would need to emit sheredom/json.h ahead of our fut implementation in the output header.
We would use the following function to parse a json string:
json_parse_flags_e::json_parse_flags_allow_json5
In the returned json_parse_result_s, there is an error property - so we could also expose a "Parse" and "TryParse" variant, where the former throws if we wish. In json_value_s there is a type property which we can map to JsonNodeType.
The following functions can be used to cast the json_value_s to a more specific type (in the same way we exposed our api).
For C#
There is built-in JSON parsing in net5.0 and greater (System.Text.Json). For other TFMs (e.g. net48), you can add this as a NuGet package.
The JsonDocument.RootElement here becomes our first JsonNode, and it provides all the functions we need to map to our type.
For Js/Ts
Whether running within a browser or Node.js, JSON.parse() will turn a string into a dynamic object.
The AsString, AsInt, etc. methods will likely be a no-op and pass through the underlying JavaScript object, but we will need to provide some utility on top of this object to check the type. We may want to add a guard/throw to our AsType methods if there is a type mismatch, to help prevent hard-to-diagnose runtime errors.
For D
There is native support for json in the D standard library: https://dlang.org/phobos/std_json.html
The JSONValue.type() property (https://dlang.org/phobos/std_json.html#.JSONType) will map nicely to JsonNodeType, and there are a variety of fields which provide the parsed results, such as:
Conveniently these properties will throw by default if the type does not match, which aligns with our proposed API.
OpenCL
I don't think we can / should support this platform. Using JsonNode should result in a compiler error.
Java
There are libraries we could pull in for Java, but I'd rather actually pull in this single file lib here to keep external dependencies to a minimum: https://github.com/mitchhentges/json-parse/blob/master/src/main/java/ca/fuzzlesoft/JsonParse.java - probably as an internal/private member of JsonNode so it's not directly exposed in fut or available to call directly in the resulting java code.
There is a static parse method returning Object.
Object will be of type Map<String, Object> for an object, List<Object> for an array, or the value type (int, boolean, etc.).
Since parse() throws, we can wrap it in a try/catch for our TryParse method.
We will need to do something similar to JavaScript for GetType(), and our AsType methods will just be explicit casts (e.g. (String)obj).
Python
There is a built in json library for python.
As with the other dynamically typed solutions, our AsType method will simply forward the underlying python object (perhaps with a guard/throw in front for type mismatches).
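A minimal sketch of what such a guarded wrapper could look like in Python follows. It is built on the standard json module; the class layout, method names and the choice of TypeError are assumptions for illustration, not the emitted code. The one subtlety worth showing is that bool is a subclass of int in Python, so the boolean check has to come before the number check.

```python
import json
from enum import Enum
from typing import Any, Dict, List


class JsonNodeType(Enum):
    NULL = 0
    BOOL = 1
    ARRAY = 2
    OBJECT = 3
    NUMBER = 4
    STRING = 5


class JsonNode:
    """Thin wrapper that forwards the parsed Python object behind type guards."""

    def __init__(self, value: Any) -> None:
        self._value = value

    @staticmethod
    def parse(text: str) -> "JsonNode":
        return JsonNode(json.loads(text))

    def get_type(self) -> JsonNodeType:
        v = self._value
        if v is None:
            return JsonNodeType.NULL
        if isinstance(v, bool):  # must precede the int/float check
            return JsonNodeType.BOOL
        if isinstance(v, (int, float)):
            return JsonNodeType.NUMBER
        if isinstance(v, str):
            return JsonNodeType.STRING
        if isinstance(v, list):
            return JsonNodeType.ARRAY
        if isinstance(v, dict):
            return JsonNodeType.OBJECT
        raise TypeError(f"unsupported value: {type(v).__name__}")

    def _expect(self, expected: JsonNodeType) -> Any:
        actual = self.get_type()
        if actual is not expected:
            raise TypeError(f"expected {expected.name}, got {actual.name}")
        return self._value

    def as_string(self) -> str:
        return self._expect(JsonNodeType.STRING)

    def as_bool(self) -> bool:
        return self._expect(JsonNodeType.BOOL)

    def as_number(self) -> float:
        return float(self._expect(JsonNodeType.NUMBER))

    def as_array(self) -> List["JsonNode"]:
        return [JsonNode(item) for item in self._expect(JsonNodeType.ARRAY)]

    def as_object(self) -> Dict[str, "JsonNode"]:
        return {k: JsonNode(v) for k, v in self._expect(JsonNodeType.OBJECT).items()}


node = JsonNode.parse('{"name": "fut", "tags": [1, true]}')
print(node.as_object()["name"].as_string())  # fut
```

Without the guard, calling as_number() on the second tag would quietly return 1.0 because of the bool/int relationship; with it, the mismatch surfaces as an error at the call site.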
Swift
There is built in support using the NSJSONSerialization obj-c class (JSONSerialization in swift). I don't really know obj-c / swift at all, but I found the following which may be helpful.
This is an obj-c example:
And a swift example:
The possible types are NSString, NSNumber, NSArray, NSDictionary, or NSNull.
We will probably do the same thing as python / java etc:
One issue is that there is no native type for boolean in obj-c (it's just an NSNumber). The best answer I could find regarding detecting whether the value is an integer, double, or bool is this Swift-specific Stack Overflow answer: https://stackoverflow.com/a/49641305/184746