Skip json comments #1448

Merged
merged 5 commits into from
Jun 11, 2022
4 changes: 3 additions & 1 deletion .gitattributes
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Auto detect text files and perform LF normalization
* text=auto
* text=auto eol=lf

# Custom for Visual Studio
*.cs text diff=csharp
Expand All @@ -22,6 +22,8 @@
*.RTF diff=astextplain

*.sh text eol=lf
*.cmd text eol=crlf
*.bat text eol=crlf

*.png binary
*.exe binary
Expand Down
10 changes: 7 additions & 3 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ Type providers consist of two components:
(that are mapped to runtime components by the compiler).

We need a _runtime component_ for .NET Standard 2.0 (netstandard2.0). We also need a _design time_
component for each, to be able to to host the type provider in .NET Core-based tooling.
component for each, to be able to host the type provider in .NET Core-based tooling.

The _runtime_ components are in the following project:

Expand Down Expand Up @@ -68,7 +68,7 @@ of files, typically like this:
the common API in `StructureInference.fs`.

* `JsonGenerator.fs` - implements code that generates provided types, adds properties
and methods etc. This uses the information infered by inference and it generates
and methods etc. This uses the information inferred by inference and it generates
calls to the runtime components.

* `JsonProvider.fs` - entry point that defines static properties of the type provider,
Expand All @@ -79,7 +79,11 @@ between _runtime_ and _design-time_ components, so you'll find at least two file

### Debugging

To debug the type generation, the best way is to change `FSharp.Data.DesignTime` project to a Console application, rename `Test.fsx` to `Test.fs` and hit the Run command in the IDE, setting the breakpoints where you need them. This will invoke all the type providers manually without locking the files in Visual Studio / Xamarin Studio. You'll also see in the console output the complete dump of the generated types and expressions. This is also the process used for the signature tests.
To debug the type generation, the best way is to change `FSharp.Data.DesignTime` project to a Console application,
rename `Test.fsx` to `Test.fs` and hit the Run command in the IDE, setting the breakpoints where you need them.
This will invoke all the type providers manually without locking the files in Visual Studio / Xamarin Studio.
You'll also see in the console output the complete dump of the generated types and expressions.
This is also the process used for the signature tests.

## Documentation

Expand Down
2 changes: 1 addition & 1 deletion src/CommonProviderImplementation/Helpers.fs
Original file line number Diff line number Diff line change
Expand Up @@ -338,7 +338,7 @@ module internal ProviderHelpers =
let private providedTypesCache = createInMemoryCache (TimeSpan.FromSeconds 30.0)
let private activeDisposeActions = HashSet<_>()

// Cache generated types for a short time, since VS invokes the TP multiple tiems
// Cache generated types for a short time, since VS invokes the TP multiple times
// Also cache temporarily during partial invalidation since the invalidation of one TP always causes invalidation of all TPs
let internal getOrCreateProvidedType
(cfg: TypeProviderConfig)
Expand Down
2 changes: 1 addition & 1 deletion src/CommonRuntime/StructuralInference.fs
Original file line number Diff line number Diff line change
Expand Up @@ -190,7 +190,7 @@ let rec subtypeInfered allowEmptyValues ot1 ot2 =
| InferedType.Heterogeneous h, other
| other, InferedType.Heterogeneous h ->
// Add the other type as another option. We should never add
// heterogenous type as an option of other heterogeneous type.
// heterogeneous type as an option of other heterogeneous type.
assert (typeTag other <> InferedTypeTag.Heterogeneous)
InferedType.Heterogeneous(unionHeterogeneousTypes allowEmptyValues h (Map.ofSeq [ typeTag other, other ]))

Expand Down
1 change: 1 addition & 0 deletions src/CommonRuntime/StructuralTypes.fs
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ type InferedTypeTag =
| Number
| Boolean
| String
/// Allow for support of embedded json in e.g. xml documents
| Json
| DateTime
| TimeSpan
Expand Down
10 changes: 5 additions & 5 deletions src/CommonRuntime/TextConversions.fs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// --------------------------------------------------------------------------------------
// --------------------------------------------------------------------------------------
// Helper operations for converting string values to other types
// --------------------------------------------------------------------------------------

Expand Down Expand Up @@ -86,8 +86,8 @@ type TextConversions private () =
static member val private DefaultRemovableAdornerCharacters =
Set.union TextConversions.DefaultNonCurrencyAdorners TextConversions.DefaultCurrencyAdorners

//This removes any adorners that might otherwise casue the inference to infer string. A notable a change is
//Currency Symbols are now treated as an Adorner like a '%' sign thus are now independant
//This removes any adorners that might otherwise cause the inference to infer string. A notable a change is
//Currency Symbols are now treated as an Adorner like a '%' sign thus are now independent
//of the culture. Which is probably better since we have lots of scenarios where we want to
//consume values prefixed with € or $ but in a different culture.
static member private RemoveAdorners(value: string) =
Expand Down Expand Up @@ -157,7 +157,7 @@ type TextConversions private () =
| x -> x

static member AsDateTimeOffset cultureInfo (text: string) =
// get TimeSpan presentation from 4-digt integers like 0000 or -0600
// get TimeSpan presentation from 4-digit integers like 0000 or -0600
let getTimeSpanFromHourMin (hourMin: int) =
let hr = (hourMin / 100) |> float |> TimeSpan.FromHours
let min = (hourMin % 100) |> float |> TimeSpan.FromMinutes
Expand Down Expand Up @@ -202,7 +202,7 @@ module internal UnicodeHelper =
// used http://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B010000_to_U.2B10FFFF as a guide below
let getUnicodeSurrogatePair num =
// only code points U+010000 to U+10FFFF supported
// for coversion to UTF16 surrogate pair
// for conversion to UTF16 surrogate pair
let codePoint = num - 0x010000u
let HIGH_TEN_BIT_MASK = 0xFFC00u // 1111|1111|1100|0000|0000
let LOW_TEN_BIT_MASK = 0x003FFu // 0000|0000|0011|1111|1111
Expand Down
4 changes: 2 additions & 2 deletions src/Csv/CsvInference.fs
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,7 @@ let private parseTypeAndUnit unitsOfMeasureProvider str =

/// Parse schema specification for column. This can either be a name
/// with type or just type: name (typeInfo)|typeInfo.
/// If forSchemaOverride is set to true, only Full or Name is returne
/// If forSchemaOverride is set to true, only Full or Name is returned
/// (if we succeed we override the inferred schema, otherwise, we just
/// override the header name)
let private parseSchemaItem unitsOfMeasureProvider str forSchemaOverride =
Expand Down Expand Up @@ -424,7 +424,7 @@ type CsvFile with
/// Infers the types of the columns of a CSV file
/// </summary>
/// <param name="inferRows"> - Number of rows to use for inference. If this is zero, all rows are used</param>
/// <param name="missingValues"> - The set of strings recogized as missing values</param>
/// <param name="missingValues"> - The set of strings recognized as missing values</param>
/// <param name="cultureInfo"> - The culture used for parsing numbers and dates</param>
/// <param name="schema"> - Optional column types, in a comma separated list. Valid types are "int", "int64", "bool", "float", "decimal", "date", "timespan", "guid", "string", "int?", "int64?", "bool?", "float?", "decimal?", "date?", "guid?", "int option", "int64 option", "bool option", "float option", "decimal option", "date option", "guid option" and "string option". You can also specify a unit and the name of the column like this: Name (type&lt;unit&gt;). You can also override only the name. If you don't want to specify all the columns, you can specify by name like this: 'ColumnName=type'</param>
/// <param name="assumeMissingValues"> - Assumes all columns can have missing values</param>
Expand Down
4 changes: 2 additions & 2 deletions src/Csv/CsvProvider.fs
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ type public CsvProvider(cfg: TypeProviderConfig) as this =

let value =
if sample = "" then
// synthetize sample from the schema
// synthesize sample from the schema
use reader = new StringReader(value)

let schemaStr =
Expand Down Expand Up @@ -235,7 +235,7 @@ type public CsvProvider(cfg: TypeProviderConfig) as this =
<param name='AssumeMissingValues'>When set to true, the type provider will assume all columns can have missing values, even if in the provided sample all values are present. Defaults to false.</param>
<param name='PreferOptionals'>When set to true, inference will prefer to use the option type instead of nullable types, <c>double.NaN</c> or <c>""</c> for missing values. Defaults to false.</param>
<param name='Quote'>The quotation mark (for surrounding values containing the delimiter). Defaults to <c>"</c>.</param>
<param name='MissingValues'>The set of strings recogized as missing values specified as a comma-separated string (e.g., "NA,N/A"). Defaults to <c>"""
<param name='MissingValues'>The set of strings recognized as missing values specified as a comma-separated string (e.g., "NA,N/A"). Defaults to <c>"""
+ String.Join(",", TextConversions.DefaultMissingValues)
+ """</c>.</param>
<param name='CacheRows'>Whether the rows should be cached so they can be iterated multiple times. Defaults to true. Disable for large datasets.</param>
Expand Down
2 changes: 1 addition & 1 deletion src/Html/HtmlProvider.fs
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ type public HtmlProvider(cfg: TypeProviderConfig) as this =
<param name='Sample'>Location of an HTML sample file or a string containing a sample HTML document.</param>
<param name='PreferOptionals'>When set to true, inference will prefer to use the option type instead of nullable types, <c>double.NaN</c> or <c>""</c> for missing values. Defaults to false.</param>
<param name='IncludeLayoutTables'>Includes tables that are potentially layout tables (with cellpadding=0 and cellspacing=0 attributes)</param>
<param name='MissingValues'>The set of strings recogized as missing values. Defaults to <c>"""
<param name='MissingValues'>The set of strings recognized as missing values. Defaults to <c>"""
+ String.Join(",", TextConversions.DefaultMissingValues)
+ """</c>.</param>
<param name='Culture'>The culture used for parsing numbers and dates. Defaults to the invariant culture.</param>
Expand Down
4 changes: 2 additions & 2 deletions src/Json/JsonInference.fs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ open FSharp.Data
open FSharp.Data.Runtime
open FSharp.Data.Runtime.StructuralTypes

/// Infer type of a JSON value - this is simple function because most of the
/// Infer type of a JSON value - this is a simple function because most of the
/// functionality is handled in `StructureInference` (most notably, by
/// `inferCollectionType` and various functions to find common subtype), so
/// here we just need to infer types of primitive JSON values.
Expand All @@ -20,7 +20,7 @@ let rec inferType inferTypesFromValues cultureInfo parentName json =
let inline isIntegerFloat (v: float) : bool = Math.Round v = v

match json with
// Null and primitives without subtyping hiearchies
// Null and primitives without subtyping hierarchies
| JsonValue.Null -> InferedType.Null
| JsonValue.Boolean _ -> InferedType.Primitive(typeof<bool>, None, false)
| JsonValue.String s when inferTypesFromValues -> StructuralInference.getInferedTypeFromString cultureInfo s None
Expand Down
2 changes: 1 addition & 1 deletion src/Json/JsonProvider.fs
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,7 @@ type public JsonProvider(cfg: TypeProviderConfig) as this =
(e.g. 'MyCompany.MyAssembly, resource_name.json'). This is useful when exposing types generated by the type provider.</param>
<param name='InferTypesFromValues'>If true, turns on additional type inference from values.
(e.g. type inference infers string values such as "123" as ints and values constrained to 0 and 1 as booleans.)</param>
<param name='PreferDictionaries'>If true, json record is considered as a dictionary, if the names of all the its fields are infered (by type inference rules) into the same non-string primitive type.</param>"""
<param name='PreferDictionaries'>If true, json records are interpreted as dictionaries when the names of all the fields are infered (by type inference rules) into the same non-string primitive type.</param>"""

do jsonProvTy.AddXmlDoc helpText
do jsonProvTy.DefineStaticParameters(parameters, buildTypes)
Expand Down
70 changes: 54 additions & 16 deletions src/Json/JsonValue.fs
Original file line number Diff line number Diff line change
Expand Up @@ -164,9 +164,6 @@ type private JsonParser(jsonText: string) =
let buf = StringBuilder() // pre-allocate buffers for strings

// Helper functions
let skipWhitespace () =
while i < s.Length && Char.IsWhiteSpace s.[i] do
i <- i + 1

let isNumChar c =
Char.IsDigit c
Expand All @@ -191,9 +188,50 @@ type private JsonParser(jsonText: string) =

let ensure cond = if not cond then throw ()


let rec skipCommentsAndWhitespace () =
let skipComment () =
// Supported comment syntax:
// - // ...{newLine}
// - /* ... */
if i < s.Length && s.[i] = '/' then
i <- i + 1

if i < s.Length && s.[i] = '/' then
i <- i + 1

while i < s.Length && (s.[i] <> '\r' && s.[i] <> '\n') do
i <- i + 1
else if i < s.Length && s.[i] = '*' then
i <- i + 1

while i + 1 < s.Length
&& s.[i] <> '*'
&& s.[i + 1] <> '/' do
i <- i + 1

ensure (i + 1 < s.Length && s.[i] = '*' && s.[i + 1] = '/')
i <- i + 2

true

else
false

let skipWhitespace () =
let initialI = i

while i < s.Length && Char.IsWhiteSpace s.[i] do
i <- i + 1

initialI <> i // return true if some whitespace was skipped

if skipWhitespace () || skipComment () then
skipCommentsAndWhitespace ()

// Recursive descent parser for JSON that uses global mutable index
let rec parseValue cont =
skipWhitespace ()
skipCommentsAndWhitespace ()
ensure (i < s.Length)

match s.[i] with
Expand Down Expand Up @@ -291,16 +329,16 @@ type private JsonParser(jsonText: string) =

and parsePair cont =
let key = parseString ()
skipWhitespace ()
skipCommentsAndWhitespace ()
ensure (i < s.Length && s.[i] = ':')
i <- i + 1
skipWhitespace ()
skipCommentsAndWhitespace ()
parseValue (fun v -> cont (key, v))

and parseObject cont =
ensure (i < s.Length && s.[i] = '{')
i <- i + 1
skipWhitespace ()
skipCommentsAndWhitespace ()
let pairs = ResizeArray<_>()

let parseObjectEnd () =
Expand All @@ -312,16 +350,16 @@ type private JsonParser(jsonText: string) =
if i < s.Length && s.[i] = '"' then
parsePair (fun p ->
pairs.Add(p)
skipWhitespace ()
skipCommentsAndWhitespace ()

let rec parsePairItem () =
if i < s.Length && s.[i] = ',' then
i <- i + 1
skipWhitespace ()
skipCommentsAndWhitespace ()

parsePair (fun p ->
pairs.Add(p)
skipWhitespace ()
skipCommentsAndWhitespace ()
parsePairItem ())
else
parseObjectEnd ()
Expand All @@ -333,7 +371,7 @@ type private JsonParser(jsonText: string) =
and parseArray cont =
ensure (i < s.Length && s.[i] = '[')
i <- i + 1
skipWhitespace ()
skipCommentsAndWhitespace ()
let vals = ResizeArray<_>()

let parseArrayEnd () =
Expand All @@ -345,16 +383,16 @@ type private JsonParser(jsonText: string) =
if i < s.Length && s.[i] <> ']' then
parseValue (fun v ->
vals.Add(v)
skipWhitespace ()
skipCommentsAndWhitespace ()

let rec parseArrayItem () =
if i < s.Length && s.[i] = ',' then
i <- i + 1
skipWhitespace ()
skipCommentsAndWhitespace ()

parseValue (fun v ->
vals.Add(v)
skipWhitespace ()
skipCommentsAndWhitespace ()
parseArrayItem ())
else
parseArrayEnd ()
Expand All @@ -375,15 +413,15 @@ type private JsonParser(jsonText: string) =
// Start by parsing the top-level value
member x.Parse() =
let value = parseValue id
skipWhitespace ()
skipCommentsAndWhitespace ()
if i <> s.Length then throw ()
value

member x.ParseMultiple() =
seq {
while i <> s.Length do
yield parseValue id
skipWhitespace ()
skipCommentsAndWhitespace ()
}

type JsonValue with
Expand Down
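A note on the block-comment scan in `skipCommentsAndWhitespace` above: the loop condition `s.[i] <> '*' && s.[i + 1] <> '/'` stops as soon as either character matches, so a comment body containing a lone `/` or `*` (for example `/* a/b */`) appears to end the scan early and trip the following `ensure`. The standalone sketch below shows one way to write the same skipping logic so that it stops only at the two-character terminator `*/`. It is an illustration, not the merged code: the function takes the string and a start index and returns the new index instead of mutating the parser's shared `i`, it uses a loop rather than recursion, it raises via `failwith` rather than the parser's `throw`, and unlike the diff it leaves a lone `/` unconsumed.

```fsharp
open System

/// Hypothetical standalone helper (not part of FSharp.Data): returns the index of
/// the first character at or after `start` that is neither whitespace nor part of
/// a `//` line comment or a `/* ... */` block comment.
let skipCommentsAndWhitespace (s: string) (start: int) : int =
    let mutable i = start
    let mutable progressed = true

    while progressed do
        let before = i

        // Skip a run of whitespace.
        while i < s.Length && Char.IsWhiteSpace s.[i] do
            i <- i + 1

        if i + 1 < s.Length && s.[i] = '/' && s.[i + 1] = '/' then
            // Line comment: runs up to (but not including) the next line break.
            i <- i + 2

            while i < s.Length && s.[i] <> '\r' && s.[i] <> '\n' do
                i <- i + 1
        elif i + 1 < s.Length && s.[i] = '/' && s.[i + 1] = '*' then
            // Block comment: advance until both characters of the terminator match.
            i <- i + 2

            while i + 1 < s.Length && not (s.[i] = '*' && s.[i + 1] = '/') do
                i <- i + 1

            if not (i + 1 < s.Length && s.[i] = '*' && s.[i + 1] = '/') then
                failwith "Unterminated block comment"

            i <- i + 2

        // Repeat while the last pass consumed whitespace or a comment.
        progressed <- i <> before

    i

// Usage: the comment bodies contain '/' and '*' on purpose.
let input = "  // line\n /* a/b * c */ {\"x\": 1}"
let idx = skipCommentsAndWhitespace input 0
let next = input.[idx]
printfn "next significant character: %c" next
```

A regression test along these lines (a block comment whose body contains `/` or `*`) would be a cheap addition to the parser tests.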
2 changes: 1 addition & 1 deletion src/Xml/XmlInference.fs
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ let getInferedTypeFromValue inferTypesFromValues cultureInfo (element: XElement)
InferedType.Primitive(typeof<string>, None, false)

/// Infers type for the element, unifying nodes of the same name
/// accross the entire document (we first get information based
/// across the entire document (we first get information based
/// on just attributes and then use a fixed point)
let inferGlobalType inferTypesFromValues cultureInfo allowEmptyValues (elements: XElement[]) =

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
</Compile>
<Compile Include="TypeProviderInstantiation.fs" />
<Compile Include="InferenceTests.fs" />
<None Include="expected\**\*" />
<None Include="SignatureTestCases.config" />
<Compile Include="SignatureTests.fs" />
<Compile Include="Program.fs" />
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Csv,SmallTest.csv,,,true,false,false,,,
Csv,MSFT.csv,,,true,false,false,,,
Csv,AirQuality.csv,;,,true,false,false,,,
Csv,DnbHistoriskeKurser.csv,,,true,false,false,,fr-FR,
Csv,DnbHistoriskeKurser.csv,,,true,false,false,,nb-NO,
Csv,file with spaces.csv,,,true,false,false,,,
Csv,LastFM.tsv,,,false,false,false,,,
Csv,Titanic.csv,,,true,false,false,,,
Expand Down