Inline schemas docs #1456

Merged
merged 2 commits into from
Aug 14, 2022
4 changes: 2 additions & 2 deletions docs/library/HtmlProvider.fsx
@@ -53,7 +53,7 @@ The Elexon - BM Reports website provides market data about the U.K's current pow
2014-01-14,4,-61.000,52576.000,-53454.500,18.000,-24.158,0.000,0.000,18.000,-24.158


Usually with HTML files headers are demarked by using the <th> tag, however in this file this is not the case, so the provider assumes that the
Usually with HTML files headers are demarked by using the `<th>` tag, however in this file this is not the case, so the provider assumes that the
first row is headers. (This behaviour is likely to get smarter in later releases). But it highlights a general problem about HTML's strictness.
*)

@@ -65,7 +65,7 @@ type F1_2017 = HtmlProvider<"../data/2017_F1.htm", ResolutionFolder=ResolutionFo
(**
The generated type provides a type space of tables that it has managed to parse out of the given HTML Document.
Each type's name is derived from either the id, title, name, summary or caption attributes/tags provided. If none of these
entities exist then the table will simply be named `Tablexx` where xx is the position in the HTML document if all of the tables were flatterned out into a list.
entities exist then the table will simply be named `Tablexx` where xx is the position in the HTML document if all of the tables were flattened out into a list.
The `Load` method allows reading the data from a file or web resource. We could also have used a web URL instead of a local file in the sample parameter of the type provider.
The following sample calls the `Load` method with a URL that points to a live version of the same page on Wikipedia.
*)
97 changes: 92 additions & 5 deletions docs/library/JsonProvider.fsx
@@ -31,9 +31,10 @@ demonstrate the provider by parsing data from WorldBank and Twitter.

The JSON Type Provider provides statically typed access to JSON documents.
It takes a sample document as an input (or a document containing a JSON array of samples).
The generated type can then be used to read files with the same structure. If the
loaded file does not match the structure of the sample, a runtime error may occur
(but only when accessing e.g. non-existing element).
The generated type can then be used to read files with the same structure.

If the loaded file does not match the structure of the sample, a runtime error may occur
(but only when explicitly accessing an element incompatible with the original sample — e.g. if it is no longer present).
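
The following is a minimal sketch of that behaviour, not an excerpt from the library. The `Person` type and both documents are hypothetical, and the script assumes the FSharp.Data package can be referenced from NuGet. The exact exception type and message are not shown because they may vary between versions.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical sample: both properties are present, so both are inferred as required.
type Person = JsonProvider<""" { "name":"John", "age":94 } """>

// Parsing succeeds even though "age" is missing from this document.
let incomplete = Person.Parse(""" { "name":"Tomas" } """)

// Accessing a property that is present works as usual.
printfn "%s" incomplete.Name

// Only the access to the missing property is expected to raise the runtime error.
try
    printfn "%d" incomplete.Age
with ex ->
    printfn "Accessing Age failed: %s" ex.Message
```

If a property is known to be missing from some documents, including a sample without it makes the provider infer it as optional instead.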

## Introducing the provider

@@ -168,7 +169,7 @@ object, we would have a `GetSample` method instead.
#### More complex object type on root level

If you want the root type to be an object type, not an array, but
you need more samples at root level, you can use the SampleIsList parameter.
you need more samples at root level, you can use the `SampleIsList` parameter.
Applied to the previous example this would be:

*)
@@ -181,6 +182,92 @@ let person = People2.Parse("""{ "name":"Gustavo" }""")

(*** include-fsi-merged-output ***)

(**
Note that starting with version 4.2.9 of this package, JSON comments are supported:
single-line comments start with `//`, and multi-line comments are wrapped in `/*` and `*/`.
This is not a standard feature of JSON, but it can be really convenient,
e.g. to annotate each sample when using several of them.
*)

(**
## Type inference hints / inline schemas

Starting with version 4.2.10 of this package, it's possible to enable basic type annotations
directly in the sample used by the provider, to complete or to override type inference.
(Only basic types are supported; see the reference documentation of the provider for the full list.)

This feature is disabled by default and has to be explicitly enabled with the `InferenceMode`
static parameter.

Let's consider an example where this can be useful:

*)

type AmbiguousEntity =
    JsonProvider<Sample = """
        { "code":"000", "length":"0" }
        { "code":"123", "length":"42" }
        { "code":"4E5", "length":"1.83" }
        """,
        SampleIsList = true>
let code = (AmbiguousEntity.GetSamples()[1]).Code
let length = (AmbiguousEntity.GetSamples()[1]).Length

(*** include-fsi-merged-output ***)

(**
In the previous example, `Code` is inferred as a `float`,
even though it looks more like it should be a `string`.
(`4E5` is interpreted as an exponential float notation instead of a string)
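
To see why `4E5` is ambiguous, here is a small sketch using only the .NET base library. It is not the provider's actual inference code (that is an assumption-free stand-in), and `looksLikeFloat` is a hypothetical helper. It only shows that standard float parsing accepts exponent notation.

```fsharp
open System
open System.Globalization

// Hypothetical helper: tries to read the text as a float, culture-independently.
// NumberStyles.Float allows exponent notation such as "4E5".
let looksLikeFloat (text: string) =
    Double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture)

for text in [ "000"; "123"; "4E5"; "ABC" ] do
    match looksLikeFloat text with
    | true, value -> printfn "%s parses as the float %g" text value
    | false, _ -> printfn "%s is not a number" text
```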

Now let's enable inline schemas:
*)

open FSharp.Data.Runtime.StructuralInference

type AmbiguousEntity2 =
    JsonProvider<Sample = """
        { "code":"typeof<string>", "length":"typeof<float<metre>>" }
        { "code":"123", "length":"42" }
        { "code":"4E5", "length":"1.83" }
        """,
        SampleIsList = true,
        InferenceMode = InferenceMode.ValuesAndInlineSchemasOverrides>
let code2 = (AmbiguousEntity2.GetSamples()[1]).Code
let length2 = (AmbiguousEntity2.GetSamples()[1]).Length

(*** include-fsi-merged-output ***)

(**
With the `ValuesAndInlineSchemasOverrides` inference mode, the `typeof<string>` inline schema
takes priority over the type inferred from other values.
`Code` is now a `string`, as we wanted it to be!

Note that an alternative way to obtain the same result would have been to replace all the `Code` values
in the samples with unambiguous string values, but this can be very cumbersome, especially with big samples.

If we had used the `ValuesAndInlineSchemasHints` inference mode instead, our inline schema
would have had the same precedence as the types inferred from other values, and `Code`
would have been inferred as a choice between either a number or a string,
exactly as if we had added another sample with an unambiguous string value for `Code`.

You can use either angle brackets `<>` or curly brackets `{}` when defining inline schemas.
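
As a sketch of the curly-bracket form, the samples from `AmbiguousEntity2` can be rewritten as follows. The type name `AmbiguousEntityCurly` is hypothetical, and the script assumes a version of FSharp.Data that supports inline schemas (4.2.10 or later).

```fsharp
open FSharp.Data
open FSharp.Data.Runtime.StructuralInference

// Hypothetical type name; same samples as AmbiguousEntity2,
// but written with typeof{...} instead of typeof<...>.
type AmbiguousEntityCurly =
    JsonProvider<Sample = """
        { "code":"typeof{string}", "length":"typeof{float{metre}}" }
        { "code":"123", "length":"42" }
        { "code":"4E5", "length":"1.83" }
        """,
        SampleIsList = true,
        InferenceMode = InferenceMode.ValuesAndInlineSchemasOverrides>

// The annotation documents the expected type: the compiler rejects it if Code is not a string.
let codeCurly: string = (AmbiguousEntityCurly.GetSamples()[1]).Code
```

The curly form can be easier to embed in formats where angle brackets need escaping.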

### Units of measure

Inline schemas also enable support for units of measure.

In the previous example, the `Length` property is now inferred as a `float`
with the `metre` unit of measure (from the default SI units).
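
The sketch below shows what a `float<metre>` value allows at compile time. It uses plain F# only: `sampleLength` is a hypothetical stand-in for a property generated by the provider, not a value read from the samples.

```fsharp
open Microsoft.FSharp.Data.UnitSystems.SI.UnitNames

// Hypothetical stand-in for a provided property of type float<metre>.
let sampleLength: float<metre> = 1.83<metre>

// Scaling by a plain float keeps the unit.
let doubled: float<metre> = sampleLength * 2.0

// Multiplying two lengths yields an area.
let area: float<metre^2> = sampleLength * sampleLength

// Converting with `float` drops the unit when a raw number is needed.
let raw: float = float sampleLength
```

Adding `sampleLength` to a value with a different unit (for example `second`) would be rejected by the compiler, which is the main benefit of keeping the unit on the inferred type.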

Warning: units of measure are discarded when merged with types without a unit or with a different unit.
As mentioned previously, with the `ValuesAndInlineSchemasHints` inference mode,
inline schema types are merged with other inferred types with the same precedence.
Since types inferred from values never have units, types inferred from inline schemas will lose their
unit if the sample contains other values.

*)

(**

## Loading WorldBank data
@@ -278,7 +365,7 @@ The `RetweetCount` and `Text` properties may be also missing, so we also access
## Getting and creating GitHub issues

In this example we will now also create JSON in addition to consuming it.
Let's start by listing the 5 most recently updated open issues in the FSharp.Data repo.
Let's start by listing the 5 most recently updated open issues in the FSharp.Data repository.

*)

94 changes: 89 additions & 5 deletions docs/library/XmlProvider.fsx
@@ -27,13 +27,16 @@ Formatter.Register(fun (x:obj) (writer: TextWriter) -> fprintfn writer "%120A" x

This article demonstrates how to use the XML Type Provider to access XML documents
in a statically typed way. We first look at how the structure is inferred and then
demonstrate the provider by parsing a RSS feed.
demonstrate the provider by parsing an RSS feed.

The XML Type Provider provides statically typed access to XML documents.
It takes a sample document as an input (or document containing a root XML node with
multiple child nodes that are used as samples). The generated type can then be used
to read files with the same structure. If the loaded file does not match the structure
of the sample, a runtime error may occur (but only when accessing e.g. non-existing element).
to read files with the same structure.

If the loaded file does not match the structure of the sample, a runtime error may occur
(but only when explicitly accessing an element incompatible with the original sample, e.g. if it is no longer present).

Starting from version 3.0.0 there is also the option of using a schema (XSD) instead of
relying on samples.

@@ -125,7 +128,86 @@ for v in Test.GetSample().Values do
The type provider generates a property `Values` that returns an array with the
values - as the `<value>` nodes do not contain any attributes or children, they
are turned into `int` values and so the `Values` property returns just `int[]`!
*)

(**
## Type inference hints / inline schemas

Starting with version 4.2.10 of this package, it's possible to enable basic type annotations
directly in the sample used by the provider, to complete or to override type inference.
(Only basic types are supported; see the reference documentation of the provider for the full list.)

This feature is disabled by default and has to be explicitly enabled with the `InferenceMode`
static parameter.

Let's consider an example where this can be useful:

*)

type AmbiguousEntity =
    XmlProvider<Sample = """
        <Entity Code="000" Length="0"/>
        <Entity Code="123" Length="42"/>
        <Entity Code="4E5" Length="1.83"/>
        """,
        SampleIsList = true>
let code = (AmbiguousEntity.GetSamples()[1]).Code
let length = (AmbiguousEntity.GetSamples()[1]).Length

(*** include-fsi-merged-output ***)

(**
In the previous example, `Code` is inferred as a `float`,
even though it looks more like it should be a `string`.
(`4E5` is interpreted as an exponential float notation instead of a string)

Now let's enable inline schemas:
*)

open FSharp.Data.Runtime.StructuralInference

type AmbiguousEntity2 =
    XmlProvider<Sample = """
        <Entity Code="typeof{string}" Length="typeof{float{metre}}"/>
        <Entity Code="123" Length="42"/>
        <Entity Code="4E5" Length="1.83"/>
        """,
        SampleIsList = true,
        InferenceMode = InferenceMode.ValuesAndInlineSchemasOverrides>
let code2 = (AmbiguousEntity2.GetSamples()[1]).Code
let length2 = (AmbiguousEntity2.GetSamples()[1]).Length

(*** include-fsi-merged-output ***)

(**
With the `ValuesAndInlineSchemasOverrides` inference mode, the `typeof{string}` inline schema
takes priority over the type inferred from other values.
`Code` is now a `string`, as we wanted it to be!

Note that an alternative way to obtain the same result would have been to replace all the `Code` values
in the samples with unambiguous string values, but this can be very cumbersome, especially with big samples.

If we had used the `ValuesAndInlineSchemasHints` inference mode instead, our inline schema
would have had the same precedence as the types inferred from other values, and `Code`
would have been inferred as a choice between either a number or a string,
exactly as if we had added another sample with an unambiguous string value for `Code`.

### Units of measure

Inline schemas also enable support for units of measure.

In the previous example, the `Length` property is now inferred as a `float`
with the `metre` unit of measure (from the default SI units).

Warning: units of measure are discarded when merged with types without a unit or with a different unit.
As mentioned previously, with the `ValuesAndInlineSchemasHints` inference mode,
inline schema types are merged with other inferred types with the same precedence.
Since types inferred from values never have units, types inferred from inline schemas will lose their
unit if the sample contains other values.

*)

(**
## Processing philosophers

In this section we look at an example that demonstrates how the type provider works
@@ -287,7 +369,7 @@ the lower level APIs.
(**
## Bringing in Some Async Action

Let's go one step further and assume here a sligthly contrived but certainly plausible example where
Let's go one step further and assume here a slightly contrived but certainly plausible example where
we cache the Census URLs and refresh once in a while. Perhaps we want to load this in the background
and then post each link over (for example) a message queue.

@@ -310,7 +392,7 @@ let cacheJanitor() = async {
(**
## Reading RSS feeds

To conclude this introduction with a more interesting example, let's look how to parse a
To conclude this introduction with a more interesting example, let's look at how to parse an
RSS feed. As discussed earlier, we can use relative paths or web addresses when calling
the type provider:
*)
@@ -645,6 +727,8 @@ Focusing on element shapes let us generate a type that should be essentially the
inferred from a significant set of valid samples. This allows a smooth transition (replacing `Sample` with `Schema`)
when a schema becomes available.

Note that inline schemas (values of the form `typeof{...}`) are not supported inside XSD documents.

## Related articles

* [Using JSON provider in a library](JsonProvider.html#jsonlib) also applies to XML type provider