JSON parsing support #140

Open
caesay opened this issue Feb 16, 2024 · 14 comments

@caesay
Contributor

caesay commented Feb 16, 2024

Following the pattern of IntTryParse, I propose a new JsonNode class containing a TryParse(string) method.

The instance/non-static members will have information about the type of node that was parsed, and any children.

I do not propose to add JSON writing at this time. That is usually more useful when you have serialization/deserialization to class support; for manual JSON writing, string interpolation provides an acceptable, if less-than-ideal, solution.

I also suggest we allow conversions to more specific derived types.
I have followed the sheredom/json.h API as a baseline so that we can fully support C/C++.

For example:

public enum JsonNodeType
{
    Null,
    Bool,
    Array,
    Object,
    Int,
    Double,
    String
}

public class JsonNode
{
    /// Get the type of this node, such as string, object, array, etc.
    /// You should use this function and then call the corresponding
    /// AsObject, AsArray, AsString, etc. functions to get the actual
    /// parsed json information.
    public JsonNodeType GetType();

    /// Reinterpret a JSON value as an object. Throws exception if the value type was not an object.
    public Dictionary<string(), JsonNode#> AsObject();
    /// Reinterpret a JSON value as an array. Throws exception if the value type was not an array.
    public List<JsonNode#> AsArray();
    /// Reinterpret a JSON value as an integer. Throws exception if the value type was not an integer.
    public int AsInt();
    /// Reinterpret a JSON value as a double. Throws exception if the value type was not a double.
    public double AsDouble();
    /// Reinterpret a JSON value as a boolean. Throws exception if the value type was not a boolean.
    public bool AsBool();
    /// Reinterpret a JSON value as a string. Throws exception if the value type was not a string.
    public string AsString();

    /// Try to parse a json string to a JsonNode object model. If the parsing fails,
    /// the JsonNode.Type will be null and this method will return false.
    public bool TryParse(string jsonText);
    /// Parses the json text into a JsonNode object model, or throws an exception if the parsing fails.
    public void Parse(string jsonText);
}

Some of our underlying JSON APIs provide a "parse or throw" function, and some provide a "try parse or null" function, which is why I propose we expose both Parse and TryParse functions. Where the underlying implementation throws, our Parse method will call it directly and our TryParse method will wrap it in exception handling. Where the underlying implementation doesn't throw but provides an error message, TryParse will call it directly, while Parse will check for the error message and explicitly throw. This gives a consistent API across every underlying implementation.
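As a rough sketch of that wrapping strategy, here is how the two backend styles could be adapted to each other in Python. All function names below (`parse`, `try_parse`, `backend_parse_with_error`, and so on) are hypothetical and only illustrate the idea; `json.loads` stands in for a throwing backend, and the error-returning backend is simulated.

```python
import json


# Case 1: the backend throws (json.loads raises json.JSONDecodeError).
def parse(text):
    """Parse-or-throw: call the throwing backend directly."""
    return json.loads(text)


def try_parse(text):
    """Try-parse: wrap the throwing backend in exception handling.

    Returns (ok, value); value is None when parsing fails.
    """
    try:
        return True, json.loads(text)
    except json.JSONDecodeError:
        return False, None


# Case 2: the backend never throws but reports an error message.
# This function only simulates such a backend for the sketch.
def backend_parse_with_error(text):
    ok, value = try_parse(text)
    return (value, None) if ok else (None, "invalid JSON")


def try_parse_from_error_api(text):
    """Try-parse: call the non-throwing backend directly."""
    value, error = backend_parse_with_error(text)
    return error is None, value


def parse_from_error_api(text):
    """Parse-or-throw: check the error message and throw explicitly."""
    value, error = backend_parse_with_error(text)
    if error is not None:
        raise ValueError(error)
    return value
```

Either way, callers see the same pair of entry points regardless of which style the backend uses.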

For C/C++

We would need to emit sheredom/json.h ahead of our fut implementation in the output header.

We would use the following function to parse a json string:

struct json_value_s *json_parse_ex(
    const void *src,
    size_t src_size,
    size_t flags_bitset,
    void*(*alloc_func_ptr)(void *, size_t),
    void *user_data,
    struct json_parse_result_s *result);
  • src - a utf-8 json string to parse.
  • src_size - the size of src in bytes.
  • flags_bitset - this should probably be json_parse_flags_e::json_parse_flags_allow_json5
  • alloc_func_ptr - a callback function to use for doing the single allocation. If NULL, malloc() is used. We should leave this NULL.
  • user_data - we should leave this NULL.
  • result - the result of the parsing. If a parsing error occurred, this will contain the type of error and where in the source it occurred. Can be NULL.

In the returned json_parse_result_s, there is an error property, so we could also expose "Parse" and "TryParse" variants, where the former throws, if we wish. In json_value_s there is a type property which we can map to JsonNodeType.

The following functions can be used to cast the json_value_s to a more specific type (in the same way we exposed our api).

  • json_value_as_string - returns a value as a string, or null if it wasn't a string.
  • json_value_as_number - returns a value as a number, or null if it wasn't a number.
  • json_value_as_object - returns a value as an object, or null if it wasn't an object.
  • json_value_as_array - returns a value as an array, or null if it wasn't an array.
  • json_value_is_true - returns non-zero if a value was true, zero otherwise.
  • json_value_is_false - returns non-zero if a value was false, zero otherwise.
  • json_value_is_null - returns non-zero if a value was null, zero otherwise.

For C#

There is built-in JSON parsing in net5.0 and greater (System.Text.Json). For other TFMs (e.g. net48), you can add this as a NuGet package.

// from https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-utf8jsonreader
ReadOnlySpan<byte> jsonReadOnlySpan = Encoding.UTF8.GetBytes(json);
// Read past the UTF-8 BOM bytes if a BOM exists.
var Utf8Bom = new byte[] { 0xEF, 0xBB, 0xBF };
if (jsonReadOnlySpan.StartsWith(Utf8Bom)) {
    jsonReadOnlySpan = jsonReadOnlySpan.Slice(Utf8Bom.Length);
}
var reader = new Utf8JsonReader(jsonReadOnlySpan, new JsonReaderOptions {
    AllowTrailingCommas = true,
    CommentHandling = JsonCommentHandling.Skip,
});
if (JsonDocument.TryParseValue(ref reader, out JsonDocument? doc)) {
    var rootEl = doc.RootElement; // type JsonElement is comparable to our JsonNode
    var type = rootEl.ValueKind; // maps to JsonNodeType
    rootEl.EnumerateObject().ToDictionary(x => x.Name, x => x.Value); // AsObject
    rootEl.EnumerateArray().Cast<JsonElement>().ToArray(); // AsArray
    rootEl.GetString();
    rootEl.GetInt32();
    rootEl.GetDouble();
    rootEl.GetBoolean();
}

The JsonDocument.RootElement here becomes our first JsonNode, and it provides all the functions we need to map to our type.

For Js/Ts

Whether running within browser or nodejs, JSON.parse() will turn a string into a dynamic object.

The AsString, AsInt, etc methods will likely be a no-op, and pass through the underlying javascript object, but we will need to provide some utility on top of this object to check the type. We may want to add a guard/throw to our AsType methods, if there is a type mismatch - to help prevent hard to diagnose runtime errors.

public JsonNodeType GetType()
{
    if (obj === null || obj === undefined) return JsonNodeType.Null;
    if (obj === true || obj === false) return JsonNodeType.Bool;
    if (Array.isArray(obj)) return JsonNodeType.Array;
    if (typeof obj === 'string' || obj instanceof String) return JsonNodeType.String;
    // https://stackoverflow.com/a/3885844/184746
    if (obj === +obj && obj !== (obj|0)) return JsonNodeType.Double;
    if (obj === +obj && obj === (obj|0)) return JsonNodeType.Int;
    return JsonNodeType.Object;
}

For D

There is native support for json in the D standard library: https://dlang.org/phobos/std_json.html

import std.conv : to;
import std.json;
import std.stdio : writeln;

void main()
{
    // parse a file or string of json into a usable structure
    string s = `{ "language": "D", "rating": 3.5, "code": "42" }`;
    JSONValue j = parseJSON(s);
    // j and j["language"] return JSONValue,
    // j["language"].str returns a string
    writeln(j["language"].str); // "D"
    writeln(j["rating"].floating); // 3.5
}

The JSONValue.type() property (https://dlang.org/phobos/std_json.html#.JSONType) will map nicely to JsonNodeType, and there are a variety of fields which provide the parsed results, such as:

  • JSONValue.str
  • JSONValue.floating
  • JSONValue.integer
  • JSONValue.boolean
  • JSONValue.object() or objectNoRef()
  • JSONValue.array() or arrayNoRef()

Conveniently these properties will throw by default if the type does not match, which aligns with our proposed API.

OpenCL

I don't think we can / should support this platform. Using JsonNode should result in a compiler error.

Java

There are libraries we could pull in for Java, but I'd rather actually pull in this single file lib here to keep external dependencies to a minimum: https://github.com/mitchhentges/json-parse/blob/master/src/main/java/ca/fuzzlesoft/JsonParse.java - probably as an internal/private member of JsonNode so it's not directly exposed in fut or available to call directly in the resulting java code.

There is a static parse method returning Object.

Object will be of type Map<String, Object> for an object, List<Object> for an array, or the value type (int, boolean, etc).

Object obj = JsonParse.parse(jsonText);

Since parse() throws, we can wrap it in a try/catch for our TryParse method.

We will need to do something similar to JavaScript for GetType(), and our AsType methods will just be explicit casts (eg. (String)obj).

public JsonNodeType GetType()
{
    if (obj == null) return JsonNodeType.Null;
    if (obj instanceof Long) return JsonNodeType.Int;
    if (obj instanceof Double) return JsonNodeType.Double;
    if (obj instanceof String) return JsonNodeType.String;
    if (obj instanceof Boolean) return JsonNodeType.Bool;
    if (obj instanceof Map) return JsonNodeType.Object;
    if (obj instanceof List) return JsonNodeType.Array;
    // every path must return or throw, otherwise this does not compile
    throw new IllegalStateException("Unexpected JSON value type: " + obj.getClass());
}

Python

There is a built-in json library for Python.

import json
obj = json.loads('{"one" : "1", "two" : "2", "three" : "3"}')
print(obj['two']) 

As with the other dynamically typed solutions, our AsType method will simply forward the underlying python object (perhaps with a guard/throw in front for type mismatches).

public JsonNodeType GetType()
{
    if (obj is None) return JsonNodeType.Null;
    // bool must be checked before int, because bool is a subclass of int in Python
    if (isinstance(obj, bool)) return JsonNodeType.Bool;
    if (isinstance(obj, int)) return JsonNodeType.Int;
    if (isinstance(obj, float)) return JsonNodeType.Double;
    if (isinstance(obj, str)) return JsonNodeType.String;
    if (isinstance(obj, list)) return JsonNodeType.Array;
    if (isinstance(obj, dict)) return JsonNodeType.Object;
}
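For reference, a runnable Python sketch of the same idea, including the optional guard in front of the AsType methods, could look like the following. The class layout and the snake_case method names are assumptions made for the example, not part of the proposal.

```python
import json
from enum import Enum


class JsonNodeType(Enum):
    NULL = 0
    BOOL = 1
    ARRAY = 2
    OBJECT = 3
    INT = 4
    DOUBLE = 5
    STRING = 6


class JsonNode:
    """Thin wrapper around whatever json.loads returned."""

    def __init__(self, obj):
        self._obj = obj

    def get_type(self):
        obj = self._obj
        if obj is None:
            return JsonNodeType.NULL
        # bool is tested before int: isinstance(True, int) is True in Python.
        if isinstance(obj, bool):
            return JsonNodeType.BOOL
        if isinstance(obj, int):
            return JsonNodeType.INT
        if isinstance(obj, float):
            return JsonNodeType.DOUBLE
        if isinstance(obj, str):
            return JsonNodeType.STRING
        if isinstance(obj, list):
            return JsonNodeType.ARRAY
        if isinstance(obj, dict):
            return JsonNodeType.OBJECT
        raise TypeError("unexpected type: " + type(obj).__name__)

    def _expect(self, expected):
        # Guard: fail early on a mismatch instead of passing a wrong type along.
        actual = self.get_type()
        if actual is not expected:
            raise TypeError("expected " + expected.name + ", got " + actual.name)
        return self._obj

    def as_int(self):
        return self._expect(JsonNodeType.INT)

    def as_bool(self):
        return self._expect(JsonNodeType.BOOL)

    def as_string(self):
        return self._expect(JsonNodeType.STRING)


node = JsonNode(json.loads("true"))
print(node.get_type().name)
```

The accessors simply forward the underlying Python object once the guard has passed, so they cost one type check per call.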

Swift

There is built-in support using the NSJSONSerialization obj-c class (JSONSerialization in Swift). I don't really know obj-c / Swift at all, but I found the following which may be helpful.

This is an obj-c example:

NSError *jsonError = nil;
id jsonObject = [NSJSONSerialization JSONObjectWithData:jsonData options:kNilOptions error:&jsonError];

if ([jsonObject isKindOfClass:[NSArray class]]) {
    NSLog(@"its an array!");
    NSArray *jsonArray = (NSArray *)jsonObject;
    NSLog(@"jsonArray - %@",jsonArray);
}

And a swift example:

let jsonString = "{\"boolean_key\" : true}"
let jsonData = jsonString.data(using: .utf8)!
let json = try! JSONSerialization.jsonObject(with: jsonData, options: .mutableContainers) as! [String:Any]

The possible types are NSString, NSNumber, NSArray, NSDictionary, or NSNull.

We will probably do the same thing as python / java etc:

if ([obj isKindOfClass:[NSString class]]) return JsonNodeType.String

One issue is that there is no native boolean type in obj-c (it's just an NSNumber). The best answer I could find regarding detecting whether the value is an integer, double, or bool is this Swift-specific Stack Overflow answer: https://stackoverflow.com/a/49641305/184746

@pfusik
Collaborator

pfusik commented Feb 16, 2024

Excellent work so far!

Quick comments:

Does JSON support int? I thought it's only double.

Why wrap in JsonObject and JsonArray? The methods could return the Dictionary and List directly.

I don't know what the implications of Dictionary<> are for C

Dictionary transpiles to GHashTable while List transpiles to GArray.

@pfusik pfusik added the enhancement New feature or request label Feb 16, 2024
@caesay
Contributor Author

caesay commented Feb 16, 2024

I actually started out with JsonNodeType.Number instead of Float/Int.

Since JSON is based on Javascript, the only difference between int/double is whether there is a decimal. In Javascript these are both just the same "Number" type.

{ "int": 1 }
// vs
{ "float": 1.0 }

In other strongly typed languages it's common to provide the distinction, because if you only offered the "AsDouble" function, the developer would have to constantly parse/round the doubles to ints everywhere, which is annoying and requires two rounds of parsing.
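Python's built-in json module makes the same distinction, which gives a quick way to see it in practice. This is only an illustration; the variable names are arbitrary.

```python
import json

doc = json.loads('{ "int": 1, "float": 1.0 }')

# A number literal without a fraction or exponent comes back as int,
# anything else comes back as float.
int_value = doc["int"]
float_value = doc["float"]
print(type(int_value).__name__, type(float_value).__name__)

# If only a double accessor existed, every integer read would need
# a manual conversion step like this one:
as_double = float(doc["int"])
converted = int(as_double)
```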

Regarding JsonObject / JsonArray, I did this because most other Json libraries do it this way, so we will have a better mapping between our types and the underlying languages - but also because we may eventually want to provide additional properties which only exist on the JsonArray / JsonObject classes. However, I don't mind either way. If you prefer we drop those derived classes it will be fine by me.

@caesay
Contributor Author

caesay commented Feb 16, 2024

I have updated the issue with the remaining languages and incorporated your feedback about removing the derived classes. I have kept the Int/double representations, because all of our underlying implementations support this except for swift.

I am not really sure how to go about implementing this (as I'm currently just struggling to add a ToLower/ToUpper function...) but if you have some suggestions on where I should start I could try.

pfusik added a commit that referenced this issue Feb 16, 2024
@pfusik
Collaborator

pfusik commented Feb 16, 2024

See the above commit. It defines the API in AST.fu, adds a test and handles the C# generation.

I chose naming closer to .NET's. I don't have a strong opinion on int and True/False yet; it can change as I add more backends.

The definitions in AST.fu are very verbose and the documentation is written manually. Eventually I'd like them to be parsed from Fusion code.

pfusik added a commit that referenced this issue Feb 16, 2024
pfusik added a commit that referenced this issue Feb 19, 2024
pfusik added a commit that referenced this issue Feb 19, 2024
@caesay
Contributor Author

caesay commented Feb 19, 2024

In the event that it's helpful, I also took a stab at writing a Json parser in Fusion. I wasn't sure how hard it would be or if it would be desirable over relying on each language's built-in parser.

public enum JsonNodeType
{
    Null,
    Bool,
    Array,
    Object,
    Number,
    String
}

enum JsonToken 
{ 
    None, 
    CurlyOpen, 
    CurlyClose, 
    SquareOpen, 
    SquareClose, 
    Colon, 
    Comma, 
    String, 
    Number, 
    Bool, 
    Null
}

public class JsonParseException : Exception 
{ }

public class JsonNode
{
    JsonNodeType Type = JsonNodeType.Null;
    Dictionary<string(), JsonNode#>() ObjectValue;
    List<JsonNode#>() ArrayValue;
    string() StringValue;
    double NumberValue;
    bool BoolValue;

    /// Get the type of this node, such as string, object, array, etc.
    /// You should use this function and then call the corresponding
    /// AsObject, AsArray, AsString, etc. functions to get the actual
    /// parsed json information.
    public JsonNodeType GetType()
    {
        return Type;
    }

    /// Check if the JSON value is null.
    public bool IsNull()
    {
        return Type == JsonNodeType.Null;
    }

    /// Reinterpret a JSON value as an object. Throws exception if the value type was not an object.
    public Dictionary<string(), JsonNode#> AsObject() throws Exception
    {
        if (Type != JsonNodeType.Object)
        {
            throw Exception("Cannot call AsObject on JsonNode which is not an object.");
        }
        return ObjectValue;
    }
    /// Reinterpret a JSON value as an array. Throws exception if the value type was not an array.
    public List<JsonNode#> AsArray() throws Exception
    {
        if (Type != JsonNodeType.Array)
        {
            throw Exception("Cannot call AsArray on JsonNode which is not an array.");
        }
        return ArrayValue;
    }

    /// Reinterpret a JSON value as a number. Throws exception if the value type was not a double.
    public double AsNumber() throws Exception
    {
        if (Type != JsonNodeType.Number)
        {
            throw Exception("Cannot call AsNumber on JsonNode which is not a number.");
        }
        return NumberValue;
    }

    /// Reinterpret a JSON value as a boolean. Throws exception if the value type was not a boolean.
    public bool AsBool() throws Exception
    {
        if (Type != JsonNodeType.Bool)
        {
            throw Exception("Cannot call AsBool on JsonNode which is not a boolean.");
        }
        return BoolValue;
    }

    /// Reinterpret a JSON value as a string. Throws exception if the value type was not a string.
    public string AsString() throws Exception
    {
        if (Type != JsonNodeType.String)
        {
            throw Exception("Cannot call AsString on JsonNode which is not a string.");
        }
        return StringValue;
    }

    public static JsonNode# Parse(string text) throws Exception, JsonParseException
    {
        JsonParser# parser = new JsonParser();
        parser.Load(text);
        return parser.ParseValue();
    }

    internal void InitBool!(bool value) throws JsonParseException
    {
        if (Type != JsonNodeType.Null)
        {
            throw JsonParseException("Cannot call InitBool on JsonNode which is not null.");
        }
        Type = JsonNodeType.Bool;
        BoolValue = value;
    }

    internal void InitArray!() throws JsonParseException
    {
        if (Type != JsonNodeType.Null)
        {
            throw JsonParseException("Cannot call InitArray on JsonNode which is not null.");
        }
        Type = JsonNodeType.Array;
    }

    internal void AddArrayChild!(JsonNode# child) throws JsonParseException
    {
        if (Type != JsonNodeType.Array)
        {
            throw JsonParseException("Cannot call AddArrayChild on JsonNode which is not an array.");
        }
        ArrayValue.Add(child);
    }

    internal void InitObject!() throws JsonParseException
    {
        if (Type != JsonNodeType.Null)
        {
            throw JsonParseException("Cannot call InitObject on JsonNode which is not null.");
        }
        Type = JsonNodeType.Object;
    }

    internal void AddObjectChild!(string key, JsonNode# child) throws JsonParseException
    {
        if (Type != JsonNodeType.Object)
        {
            throw JsonParseException("Cannot call AddObjectChild on JsonNode which is not an object.");
        }
        ObjectValue[key] = child;
    }

    internal void InitNumber!(double value) throws JsonParseException
    {
        if (Type != JsonNodeType.Null)
        {
            throw JsonParseException("Cannot call InitNumber on JsonNode which is not null.");
        }
        Type = JsonNodeType.Number;
        NumberValue = value;
    }

    internal void InitString!(string value) throws JsonParseException
    {
        if (Type != JsonNodeType.Null)
        {
            throw JsonParseException("Cannot call InitString on JsonNode which is not null.");
        }
        Type = JsonNodeType.String;
        StringValue = value;
    }
}

class StringAppendable
{
    StringWriter() builder;
    TextWriter! writer;
    bool initialised;

    public void Clear!()
    {
        builder.Clear();
    }

    public void WriteChar!(int c)
    {
        if (!initialised)
        {
            writer = builder;
            initialised = true;
        }
        writer.WriteChar(c);
    }

    public string() ToString()
    {
        return builder.ToString();
    }
}

// https://github.com/gering/Tiny-JSON/blob/master/Tiny-JSON/Tiny-JSON/JsonParser.cs
class JsonParser
{
    string() text = "";
    int position = 0;
    StringAppendable() builder;

    public void Load!(string text)
    {
        this.text = text;
        this.position = 0;
    }

    public bool EndReached()
    {
        return position >= text.Length;
    }

    public string() ReadN!(int n) throws JsonParseException
    {
        if (position + n > text.Length)
        {
            throw JsonParseException("Unexpected end of input");
        }
        string() result = text.Substring(position, n);
        position += n;
        return result;
    }

    public int Read!()
    {
        if (position >= text.Length)
        {
            return -1;
        }
        int c = text[position];
        position++;
        return c;
    }

    public int Peek()
    {
        if (position >= text.Length)
        {
            return -1;
        }
        return text[position];
    }

    public bool PeekWhitespace()
    {
        int c = Peek();
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    public bool PeekWordbreak()
    {
        int c = Peek();
        return c == ' ' || c == ',' || c == ':' || c == '\"' || c == '{' || c == '}' 
            || c == '[' || c == ']' || c == '\t' || c == '\n' || c == '\r' || c == '/';
    }

    JsonToken PeekToken!() {
        EatWhitespace();
        if (EndReached()) return JsonToken.None;
        switch (Peek()) {
            case '{':
                return JsonToken.CurlyOpen;
            case '}':
                return JsonToken.CurlyClose;
            case '[':
                return JsonToken.SquareOpen;
            case ']':
                return JsonToken.SquareClose;
            case ',':
                return JsonToken.Comma;
            case '"':
                return JsonToken.String;
            case ':':
                return JsonToken.Colon;
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
            case '-':
                return JsonToken.Number;
            case 't':
            case 'f':
                return JsonToken.Bool;
            case 'n':
                return JsonToken.Null;
            case '/':
                // ignore / skip past line and blockcomments
                Read(); // skip the first /
                if (Peek() == '/') { // line comment, read to next \n
                    while (!EndReached() && Peek() != '\n') {
                        Read();
                    }
                    return PeekToken();
                } else if (Peek() == '*') { // block comment, read to */
                    Read(); // skip the *
                    while (!EndReached()) {
                        if (Read() == '*' && Peek() == '/') {
                            Read(); // skip the /
                            return PeekToken();
                        }
                    }
                }
                return JsonToken.None;
            default:
                return JsonToken.None;
        }
    }

    public void EatWhitespace!()
    {
        while (!EndReached() && PeekWhitespace())
        {
            Read();
        }
    }

    public string() ReadWord!()
    {
        builder.Clear();
        while (!EndReached() && !PeekWordbreak())
        {
            builder.WriteChar(Read());
        }
        return builder.ToString();
    }

    public JsonNode# ParseNull!() throws JsonParseException 
    {
        ReadWord();
        JsonNode# node = new JsonNode();
        return node;
    }

    public JsonNode# ParseBool!() throws JsonParseException 
    {
        string() boolValue = ReadWord();
        if (boolValue == "true") {
            JsonNode# node = new JsonNode();
            node.InitBool(true);
            return node;
        } else if (boolValue == "false") {
            JsonNode# node = new JsonNode();
            node.InitBool(false);
            return node;
        } else {
            throw JsonParseException("Invalid boolean");
        }
    }

    public JsonNode# ParseNumber!() throws JsonParseException 
    {
        double d;
        if (d.TryParse(ReadWord())) {
            JsonNode# node = new JsonNode();
            node.InitNumber(d);
            return node;
        }

        throw JsonParseException("Invalid number");
    }

    public JsonNode# ParseString!() throws JsonParseException
    {
        builder.Clear();
        Read(); // ditch opening quote
      
        while (true) {
            if (EndReached()) {
                throw JsonParseException("Unterminated string");
            }
            int c = Read();
            switch (c) {
                case '"':
                    JsonNode# node = new JsonNode();
                    node.InitString(builder.ToString());
                    return node;
                case '\\':
                    if (EndReached()) {
                        throw JsonParseException("Unterminated string");
                    }
                    
                    c = Read();
                    switch (c) {
                        case '"':
                        case '\\':
                        case '/':
                            builder.WriteChar(c);
                            break;
                        case 'b':
                            builder.WriteChar(0x0008); // backspace
                            break;
                        case 'f':
                            builder.WriteChar(0x000C); // form feed
                            break;
                        case 'n':
                            builder.WriteChar('\n');
                            break;
                        case 'r':
                            builder.WriteChar('\r');
                            break;
                        case 't':
                            builder.WriteChar('\t');
                            break;
                        case 'u':
                            int i;
                            if (i.TryParse(ReadN(4), 16)) {
                                builder.WriteChar(i);
                            } else {
                                throw JsonParseException("Invalid unicode escape");
                            }
                            break;
                    }
                    break;
                default:
                    builder.WriteChar(c);
                    break;
            }
        }
    }

    public JsonNode# ParseObject!() throws Exception, JsonParseException
    {
        Read(); // ditch opening brace
        JsonNode# node = new JsonNode();
        node.InitObject();

        while (true) {
            switch (PeekToken()) {
            case JsonToken.None:
                throw JsonParseException("Unterminated object");
            case JsonToken.Comma:
                Read(); // ditch comma
                continue;
            case JsonToken.CurlyClose:
                Read(); // ditch closing brace
                return node;
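            case JsonToken.String:
                JsonNode# name = ParseString();

                if (PeekToken() != JsonToken.Colon) throw JsonParseException("Expected colon");
                Read(); // ditch the colon

                node.AddObjectChild(name.AsString(), ParseValue());
                break;
            default:
                // ParseString() skips one character unconditionally, so reject non-string keys here
                throw JsonParseException("Expected string key");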
            }
        }
    }

    public JsonNode# ParseArray!() throws Exception, JsonParseException
    {
        Read(); // ditch opening bracket
        JsonNode# node = new JsonNode();
        node.InitArray();

        bool expectComma = false;
        while (true) {
            switch (PeekToken()) {
            case JsonToken.None:
                throw JsonParseException("Unterminated array");		
            case JsonToken.Comma:
                if (!expectComma) {
                    throw JsonParseException("Unexpected comma in array");
                }
                expectComma = false;
                Read(); // ditch comma
                continue;						
            case JsonToken.SquareClose:	
                Read(); // ditch closing brace
                return node;
            default:
                if (expectComma) {
                    throw JsonParseException("Expected comma");
                }
                expectComma = true;
                node.AddArrayChild(ParseValue());
                break;
            }
        }
    }

    public JsonNode# ParseValue!() throws Exception, JsonParseException
    {
        switch (PeekToken()) {
        case JsonToken.String:		
            return ParseString();
        case JsonToken.Number:		
            return ParseNumber();
        case JsonToken.Bool:		
            return ParseBool();
        case JsonToken.Null:		
            return ParseNull();
        case JsonToken.CurlyOpen:	
            return ParseObject();
        case JsonToken.SquareOpen:	
            return ParseArray();
        default:
            throw JsonParseException("Invalid token");
        }
    }
}
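To make the escape handling in ParseString easier to follow, here is a rough Python sketch of the same loop. The function name `parse_json_string` is made up for this example. Unlike the Fusion code above, it rejects unknown escapes instead of silently dropping them, and like the Fusion code it does not combine UTF-16 surrogate pairs.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

# Single-character escapes and the characters they stand for.
SIMPLE_ESCAPES = {
    '"': '"', '\\': '\\', '/': '/',
    'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t',
}


def parse_json_string(text, pos=0):
    """Decode a JSON string literal that starts at text[pos].

    Returns (decoded_string, position_after_closing_quote).
    """
    if pos >= len(text) or text[pos] != '"':
        raise ValueError("Expected opening quote")
    pos += 1  # ditch opening quote
    out = []
    while True:
        if pos >= len(text):
            raise ValueError("Unterminated string")
        c = text[pos]
        pos += 1
        if c == '"':
            return "".join(out), pos
        if c != '\\':
            out.append(c)
            continue
        if pos >= len(text):
            raise ValueError("Unterminated string")
        esc = text[pos]
        pos += 1
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
        elif esc == 'u':
            digits = text[pos:pos + 4]
            if len(digits) != 4 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("Invalid unicode escape")
            out.append(chr(int(digits, 16)))
            pos += 4
        else:
            raise ValueError("Invalid escape")


source = r'"a\tb\u0041" tail'
value, end = parse_json_string(source)
print(repr(value), repr(source[end:]))
```

Returning the end position alongside the value plays the role that the shared `position` field plays in the Fusion parser.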

pfusik added a commit that referenced this issue Feb 19, 2024
@pfusik
Collaborator

pfusik commented Feb 19, 2024

@caesay
Contributor Author

caesay commented Feb 19, 2024

I don't mind. The only C/C++ parsers I have experience with are nlohmann and sheredom, but as long as it gets the job done. What are your thoughts in general about whether we should write stdlib libraries in Fusion (like above) vs. relying on language-specific ones?

@pfusik
Collaborator

pfusik commented Feb 19, 2024

What Fusion emits needs to feel like written by an experienced programmer directly in the target language.
Therefore it doesn't implement e.g. List, but translates it into whatever is available, even though implementing List in Fusion once would be much easier to do.
I don't have a strong opinion on what to do for features not available in the target language standard library. On one hand we should not add dependencies for trivial code such as https://www.npmjs.com/package/is-even, on the other hand, don't reinvent the wheel when we have glib and ICU.
JSON is neither trivial nor rocket science and even when we plug in an existing parser, we need to translate the different representations of arrays etc. It might turn out that a pure Fusion parser is more straightforward than a native parser + glue.
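To give a feel for the "glue" part, here is a small Python sketch that translates the native result of `json.loads` into one uniform tree of `(kind, payload)` tuples. The tuple layout and the name `to_node` are assumptions made for the example; each backend would need its own version of this translation.

```python
import json


def to_node(native):
    """Translate json.loads output into a uniform (kind, payload) tree."""
    if native is None:
        return ("null", None)
    # bool is checked before numbers because bool is a subclass of int.
    if isinstance(native, bool):
        return ("bool", native)
    if isinstance(native, (int, float)):
        return ("number", float(native))
    if isinstance(native, str):
        return ("string", native)
    if isinstance(native, list):
        return ("array", [to_node(item) for item in native])
    if isinstance(native, dict):
        return ("object", {key: to_node(value) for key, value in native.items()})
    raise TypeError("unexpected type: " + type(native).__name__)


tree = to_node(json.loads('{"a": [1, true, null]}'))
print(tree)
```

The translation walks the whole document a second time, which is part of the cost being weighed against a parser that builds the target representation directly.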

@pfusik
Collaborator

pfusik commented Feb 26, 2024

if (json.ValueKind == JsonValueKind.String) is verbose and I feel if (json.IsString()) both reads better and avoids the awkward translation of runtime types to an enum.

Turns out JSON parsing is not that hard. Here's how I would do it:

public abstract class JsonElement
{
	public virtual bool IsObject() => false;
	public virtual bool IsArray() => false;
	public virtual bool IsString() => false;
	public virtual bool IsNumber() => false;
	public virtual bool IsBoolean() => false;
	public virtual bool IsTrue() => false;
	public virtual bool IsNull() => false;

	public virtual Dictionary<string(), JsonElement#> GetObject()
	{
		assert false;
	}

	public virtual List<JsonElement#> GetArray()
	{
		assert false;
	}

	public virtual string GetString()
	{
		assert false;
	}

	public virtual double GetNumber()
	{
		assert false;
	}

	public static JsonElement#? TryParse(string s)
	{
		JsonParser() parser;
		return parser.TryParse(s);
	}
}

class JsonObject : JsonElement
{
	internal Dictionary<string(), JsonElement#>() Value;
	public override bool IsObject() => true;
	public override Dictionary<string(), JsonElement#> GetObject() => Value;
}

class JsonArray : JsonElement
{
	internal List<JsonElement#>() Value;
	public override bool IsArray() => true;
	public override List<JsonElement#> GetArray() => Value;
}

class JsonString : JsonElement
{
	internal string() Value;
	public override bool IsString() => true;
	public override string GetString() => Value;
}

class JsonNumber : JsonElement
{
	internal double Value;
	public override bool IsNumber() => true;
	public override double GetNumber() => Value;
}

class JsonBoolean : JsonElement
{
	public override bool IsBoolean() => true;
}

class JsonTrue : JsonBoolean
{
	public override bool IsTrue() => true;
}

class JsonNull : JsonElement
{
	public override bool IsNull() => true;
}

class JsonParser
{
	string Input;
	int Offset;
	int InputLength;

	bool SkipWhitespace!()
	{
		while (Offset < InputLength) {
			switch (Input[Offset]) {
			case '\t':
			case '\n':
			case '\r':
			case ' ':
				break;
			default:
				return true;
			}
			Offset++;
		}
		return false;
	}

	JsonObject#? ParseObject!()
	{
		Offset++;
		if (!SkipWhitespace())
			return null;
		JsonObject# result = new JsonObject();
		if (Input[Offset] == '}') {
			Offset++;
			return result;
		}
		while (Input[Offset] == '"') {
			JsonString#? key = ParseString();
			if (key == null || !SkipWhitespace() || Input[Offset] != ':')
				return null;
			Offset++;
			JsonElement#? value = ParseWhitespaceAndElement();
			if (value == null || !SkipWhitespace())
				return null;
			// Store the pair before looking for '}', so that the last pair is not dropped.
			result.Value[key.Value] = value;
			switch (Input[Offset]) {
			case ',':
				break;
			case '}':
				Offset++;
				return result;
			default:
				return null;
			}
			Offset++;
			if (!SkipWhitespace())
				return null;
		}
		return null;
	}

	JsonArray#? ParseArray!()
	{
		Offset++;
		if (!SkipWhitespace())
			return null;
		JsonArray# result = new JsonArray();
		if (Input[Offset] == ']') {
			Offset++;
			return result;
		}
		for (;;) {
			JsonElement#? element = ParseElement();
			if (element == null || !SkipWhitespace())
				return null;
			// Add the element before looking for ']', so that the last element is not dropped.
			result.Value.Add(element);
			switch (Input[Offset]) {
			case ',':
				break;
			case ']':
				Offset++;
				return result;
			default:
				return null;
			}
			Offset++;
			if (!SkipWhitespace())
				return null;
		}
	}

	JsonString#? ParseString!()
	{
		Offset++;
		StringWriter() result;
		int startOffset = Offset; // first character after the opening quote
		while (Offset < InputLength) {
			switch (Input[Offset]) {
			case 0:
			case 1:
			case 2:
			case 3:
			case 4:
			case 5:
			case 6:
			case 7:
			case 8:
			case 9:
			case 10:
			case 11:
			case 12:
			case 13:
			case 14:
			case 15:
			case 16:
			case 17:
			case 18:
			case 19:
			case 20:
			case 21:
			case 22:
			case 23:
			case 24:
			case 25:
			case 26:
			case 27:
			case 28:
			case 29:
			case 30:
			case 31:
				return null;
			case '"':
				result.Write(Input.Substring(startOffset, Offset++ - startOffset));
				return new JsonString { Value = result.ToString() };
			case '\\':
				result.Write(Input.Substring(startOffset, Offset++ - startOffset));
				if (Offset >= InputLength)
					return null;
				switch (Input[Offset]) {
				case '"':
				case '\\':
				case '/':
					startOffset = Offset++;
					continue;
				case 'b':
					result.WriteChar(8);
					break;
				case 'f':
					result.WriteChar(12);
					break;
				case 'n':
					result.WriteChar('\n');
					break;
				case 'r':
					result.WriteChar('\r');
					break;
				case 't':
					result.WriteChar('\t');
					break;
				case 'u':
					if (Offset + 5 >= InputLength)
						return null;
					int c;
					if (!c.TryParse(Input.Substring(Offset + 1, 4), 16))
						return null;
					result.WriteCodePoint(c);
					Offset += 4;
					break;
				default:
					return null;
				}
				startOffset = ++Offset;
				break;
			default:
				Offset++;
				break;
			}
		}
		return null;
	}

	bool SeeDigit() => Offset < InputLength && Input[Offset] >= '0' && Input[Offset] <= '9';

	void ParseDigits!()
	{
		while (SeeDigit())
			Offset++;
	}

	JsonNumber#? ParseNumber!()
	{
		int startOffset = Offset;
		if (Input[Offset] == '-')
			Offset++;
		if (!SeeDigit())
			return null;
		if (Input[Offset++] > '0')
			ParseDigits();
		if (Offset < InputLength && Input[Offset] == '.') {
			Offset++;
			if (!SeeDigit())
				return null;
			ParseDigits();
		}
		if (Offset < InputLength && (Input[Offset] | 0x20) == 'e') {
			if (++Offset < InputLength && (Input[Offset] == '+' || Input[Offset] == '-'))
				Offset++;
			if (!SeeDigit())
				return null;
			ParseDigits();
		}
		double d;
		if (!d.TryParse(Input.Substring(startOffset, Offset - startOffset)))
			return null;
		return new JsonNumber { Value = d };
	}

	bool ParseKeyword!(string s)
	{
		foreach (int c in s) {
			if (++Offset >= InputLength || Input[Offset] != c)
				return false;
		}
		Offset++;
		return true;
	}

	JsonElement#? ParseElement!()
	{
		switch (Input[Offset]) {
		case '{':
			return ParseObject();
		case '[':
			return ParseArray();
		case '"':
			return ParseString();
		case '-':
		case '0':
		case '1':
		case '2':
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
		case '8':
		case '9':
			return ParseNumber();
		case 't':
			return ParseKeyword("rue") ? new JsonTrue() : null;
		case 'f':
			return ParseKeyword("alse") ? new JsonBoolean() : null;
		case 'n':
			return ParseKeyword("ull") ? new JsonNull() : null;
		default:
			return null;
		}
	}

	JsonElement#? ParseWhitespaceAndElement!() => SkipWhitespace() ? ParseElement() : null;

	internal JsonElement#? TryParse!(string s)
	{
		Input = s;
		Offset = 0;
		InputLength = s.Length;
		JsonElement#? result = ParseWhitespaceAndElement();
		return SkipWhitespace() ? null : result;
	}
}

(NOT TESTED YET!)

I will change the API to the above, and probably emit this pure-Fusion implementation for C and C++ instead of adding an extra dependency and converting the results.

@caesay
Contributor Author

caesay commented Feb 26, 2024

I think it is a good idea if we can start expanding the standard library like this with pure Fusion code, rather than what we have via string building. Ideally it would be very easy to add new functionality written in Fusion - maybe a new folder full of Fusion code which, if referenced, automatically gets transpiled in.

pfusik added a commit that referenced this issue Feb 27, 2024
pfusik added a commit that referenced this issue Feb 27, 2024
pfusik added a commit that referenced this issue Feb 27, 2024
@pfusik pfusik self-assigned this Feb 28, 2024
pfusik added a commit that referenced this issue Feb 28, 2024
pfusik added a commit that referenced this issue Feb 28, 2024
pfusik added a commit that referenced this issue Feb 28, 2024
pfusik added a commit that referenced this issue Feb 28, 2024
@pfusik
Collaborator

pfusik commented Feb 29, 2024

Current status:

  • D, JavaScript/TypeScript, Python, Swift transpile to their built-in parsers
  • C++ and Java work as pure Fusion, although I am still considering javax.json or org.json for Java
  • C is planned to work as pure Fusion, but has bugs at the moment

Open topics:

  • I only implemented GetDouble for now. I did not check if the targets are consistent in handling int and long: do they round (how?) or fail on fractions / out-of-range? Some targets seem to only implement double.
  • Error handling. Most implementations seem to have just a "parse" method that throws an exception. The exceptions are target-specific and I don't see how they could be handled in Fusion code. The pure-Fusion solution returns null on error, which could be suitable for TryParse. Parse could be built on top of that with assert result != null.
  • Not sure if all targets should do runtime checks in the GetXXX methods, especially since the user can do if (json.IsXXX()) or assert json.IsXXX().
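
To illustrate the error-handling point for a target whose built-in parser throws, here is a minimal Python sketch, not the code that fut emits. The names try_parse, parse and FAILED are hypothetical; the sentinel is only needed because Python maps JSON null to None, so None cannot also mean "parse error".

```python
import json

# Hypothetical sentinel: distinguishes a parse error from a valid JSON null.
FAILED = object()

def try_parse(s):
    """Return the parsed value, or FAILED if s is not valid JSON."""
    try:
        return json.loads(s)
    except json.JSONDecodeError:
        # The target-specific exception is swallowed here,
        # so calling code never needs to know about it.
        return FAILED

def parse(s):
    """Parse built on top of try_parse, mirroring `assert result != null`."""
    result = try_parse(s)
    assert result is not FAILED, "invalid JSON"
    return result
```

With this shape, only the thin try_parse wrapper is target-specific; parse stays the same everywhere.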

> I think it is a good idea if we can start expanding the standard library like this with pure Fusion code, rather than what we have via string building. Ideally it would be very easy to add new functionality written in Fusion - maybe a new folder full of Fusion code which, if referenced, automatically gets transpiled in.

Fusion started as a language for operating on raw bytes, much like assembly language. Strings, collections and regexes are all later additions.

Can you give some examples of what could land in the pure-Fusion standard library?

Perhaps it's better done as packages than a standard library?

@caesay
Contributor Author

caesay commented Feb 29, 2024

  • I think only GetDouble is fine, the Fusion truncate/round functions are easy to use.
  • The benefit of a "Parse" method comes from getting some meaningful data back from the error message (e.g. Expected } at position 302). If we can't provide this, I don't think there's any point in adding it.
  • I think the GetXXX functions throwing is a good idea because it helps prevent undefined behavior. We would expect developers writing Fusion code to do if (node.IsString()) { do something with string }, but if they skip that check and simply {do something with string}, this will be undefined behavior in dynamic languages.
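
As a rough illustration of the last point in a dynamic target, here is a Python sketch of a checked getter. get_double is a hypothetical helper, not what fut generates; it shows the kind of runtime check that turns a skipped IsNumber() test into an immediate error instead of a wrongly-typed value flowing on.

```python
import json

def get_double(value):
    """Hypothetical checked getter: return value as a float or raise TypeError."""
    # bool is a subclass of int in Python, so it must be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"JSON value is not a number: {value!r}")
    # json.loads yields int for "3" and float for "3.5"; normalize to float.
    return float(value)

doc = json.loads('{"n": 3, "s": "3", "b": true}')
print(get_double(doc["n"]))  # prints 3.0
```

Without the check, doc["s"] would silently be passed on as a string and doc["b"] would behave like the number 1.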

Re. packages: that could be a good idea, but then there needs to be a way for people to depend on, download, and update packages. This seems like a lot more work.

@caesay
Contributor Author

caesay commented Mar 3, 2024

I tried JsonElement in master. I guess the pure Fusion version is not automatically emitted for C++ yet. Is this planned, or am I expected to copy that file in only when compiling for C++?

@pfusik
Collaborator

pfusik commented Mar 4, 2024

Pure-Fusion implementations are not yet built into fut, see https://github.com/fusionlanguage/fut/blob/4a3977e9a8a097e31933e06dd54e592f2d3a6295/test/JsonElement.fu
The C output leaks memory due to #26. While I could easily work around it, my priority is to fix this problem, which will take some time.

I'm going to make a release of fut today.

pfusik added a commit that referenced this issue Mar 12, 2024