[Haskell] Rewrite Syntax #2679

Merged: 180 commits into sublimehq:master from pr/haskell/rewrite-syntax on Feb 2, 2022

Conversation

deathaxe
Collaborator

@deathaxe deathaxe commented Dec 30, 2020

Fixes #1321
Fixes #2670
Fixes #2672
Fixes #2918

Supersedes #2628, #2662, #2671

This PR is the result of some spare time caused by pandemic stay-at-home orders in our country.

With a bird's-eye view of existing Haskell syntax definitions in other popular editors and a quick study of the Haskell 2010 Report, the proposed syntax definition evolved from the existing package. It includes all commits of #2671 and the changes from #2628.

The major goal is to provide a syntax definition that is as simple and robust as possible.

Layouts

Haskell supports two kinds of layout schemes:
a) a C-style block layout with braces and `;` statement terminators.
b) a Python-like indentation-based layout.

The latter one makes it hard to find proper statement and expression boundaries and seems to be the root cause of broken syntax highlighting with existing implementations.

This is resolved by simplifying contexts and scope names to a minimum.

Patterns

Another issue with existing syntaxes is pattern matching itself. It appears none of them has proper support for Unicode identifiers or Unicode operators in various situations. That's addressed by this PR as well.

Disclaimer

I am not an active Haskell developer. Thus this PR should probably be thoroughly reviewed by a Haskell expert to make sure all basic language constructs work as expected.

Performance

| File | Lines | Before | After | Diff |
|---|---|---|---|---|
| syntax_test_haskell.hs | 4096 | 10ms | 14ms | +40% |
| LaTeX.hs x 6 | 12070 | 53ms | 130ms | +145% |
| Parsing.hs x 6 | 9444 | 44ms | 83ms | +88% |

The new syntax is about 2x slower depending on the use case, which is not surprising: the old one has only 407 lines of code, while the new one consists of 1388 lines and considers a lot more. It uses branching to distinguish groups from tuples and to scope type contexts so that declared type identifiers are highlighted correctly. The list of scoped builtins is larger, too.

@deathaxe
Collaborator Author

That's true in general. Variables are the way to go with it. They are needed to distinguish builtins from user defined tokens though. (e.g. constant.language vs. constant.other). Already thought about scoping "unknown" directives and values this way, so most users might not see a difference.

On the other hand such lists provide some low-level way to quickly see whether a valid builtin token is entered.

@jpe90

jpe90 commented Jul 21, 2021

Are escaped quotes in String literals highlighted correctly?

Here is a screenshot from this PR:
subl_highlighting
And the same code with treesitter in neovim:
nvim_ts_highlighting

@rwols
Contributor

rwols commented Jul 21, 2021

That looks correct to me.

@deathaxe deathaxe force-pushed the pr/haskell/rewrite-syntax branch from 82d9274 to c4c0f53 Compare August 19, 2021 09:15
@deathaxe
Collaborator Author

Resolved a conflict with #2760.

Collaborator

@FichteFoll FichteFoll left a comment

We've been through this quite a while ago already. For verification, I just tested this out on a couple Haskell files I wrote by myself a few years ago and everything looked good or at least not worse than before, so this gets a pass from me.

@deathaxe
Collaborator Author

The most significant changes in highlighting should be caused by moving from `keyword.` scopes to `punctuation.` scopes for most brackets.

@FichteFoll
Collaborator

FichteFoll commented Oct 31, 2021

That, plus some function scoping and imports were the most obvious to me.

@pennychase

pennychase commented Feb 1, 2022

@jrappen asked me to look at the PR when I asked a question about Haskell and LSP on Discord. I don't consider myself a Haskell expert, but I have been programming in the language for a couple of years. Overall, I really like the syntax, but there are some things that could be tweaked.

In the module declaration, all of the elements of the qualified module name should be highlighted the same way. In this capture the first element is white and the last one is yellow, but they should probably all be yellow:
Screen Shot 2022-01-31 at 10 34 25 PM

And in a qualified import, the elements of the qualified name should also all be highlighted in the same color:
Screen Shot 2022-01-31 at 10 45 09 PM

In type declarations, it looks like some types are considered "keywords" and are blue, while others are purple.
Screen Shot 2022-01-31 at 10 52 40 PM
This is the way Github does the highlighting, although VSCode treats them uniformly.

Similarly in a record definition, the type annotations have different highlighting for keywords and others:
Screen Shot 2022-02-01 at 7 50 29 AM
Again, this is GitHub's approach, while VSCode treats them uniformly.

I could live with either, although I lean towards the uniformity.

In this code, we see that Just and Nothing, the value constructors of the Maybe type are treated differently; Nothing is treated as a value constructor (highlighted purple), but Just seems to be treated as a keyword (highlighted blue). And the value constructors of the Either type, both seem to be treated as keywords (highlighted blue).
Screen Shot 2022-01-31 at 10 55 32 PM
Just and Nothing should be highlighted the same way, since they are the value constructors of Maybe. Again, Left and Right could be highlighted blue as keywords, or purple.

As I said, overall I like the highlighting distinctions (e.g., types are clearly differentiated from constructors in newtype and data declarations; it's neat that turning a function to infix with backticks is highlighted the same way as an infix operator, and treating escaped characters in strings differently than a string is very helpful). Thanks for doing this.

@BenjaminSchaaf BenjaminSchaaf merged commit c356171 into sublimehq:master Feb 2, 2022
jfcherng added a commit that referenced this pull request Feb 2, 2022
@deathaxe
Collaborator Author

deathaxe commented Feb 2, 2022

> And in a qualified import, the elements of the qualified name should also all be highlighted in the same color

It's up to the color scheme to choose a color for the given types.

The syntax definition scopes qualifiers as variable.namespace. Only the final token is the real variable or entity being imported. That's the scope naming scheme chosen in all syntaxes.

> In type declarations, it looks like some types are considered "keywords" and are blue, while others are purple.

Basic types like String, Int64 are scoped as builtin data type (support.type), while all user defined types are scoped storage.type.

> ... we see that Just and Nothing ...

Nothing as well as True and False are scoped support.constant, while Just, Left and Right are scoped support.type. Should we scope all of them support.type to denote them being type constructors?

@deathaxe deathaxe deleted the pr/haskell/rewrite-syntax branch February 2, 2022 17:00
@pennychase

pennychase commented Feb 4, 2022

I discussed the issue with Nothing, Just, etc. in Issue #3227. I think they all need to be scoped as constructors (they're actually the value constructors for the type). There is the question of differentiating them as built-ins. GitHub appears to treat Left/Right as built-ins and Just/Nothing as ordinary constructors. One could go either way.

Thanks for the explanation about the namespaces, although in a qualified import you're not importing anything, rather you're giving a short name that will be used in code to qualify entities used in code. I'll add this discussion to issue #3227.

mitranim pushed a commit to mitranim/Packages that referenced this pull request Mar 25, 2022
* [Haskell] Fix comments

Fixes sublimehq#2670

According to https://www.haskell.org/onlinereport/haskell2010/haskellch10.html
a comment starts with at least 2 dashes followed by `any` character except
symbols, with some exceptions. The relevant rules are:

  comment   -> -- {-} [ any⟨symbol⟩ {any} ] newline
  symbol    -> ascSymbol | uniSymbol⟨special|_|"|'⟩
  special   -> ( | ) | , | ; | [ | ] | ` | { | }
  ascSymbol -> ! | # | $ | % | & | ⋆ | + | . | / | < | = | > | ? | @ | \
             | ^ | | | - | ~ | :
  uniSymbol -> any Unicode symbol or punctuation

With:

  any       -> graphic | space | tab
  graphic   -> small | large | symbol | digit | special | " | '

we can say:

  comment   -> -- {-} [
               space | tab | small | large | digit | special | " | '
               {any} ] newline

This simplified pattern is implemented in this commit.

Additionally `(--)` and `(---)` are excluded from infix operators.
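
The simplified comment rule can be tried out with a small Python sketch. This is not the pattern used in the syntax definition: Python's `re` module stands in for Sublime Text's regex engine, only ASCII symbols are covered, string literals are ignored, and the names `ASC_SYMBOL`, `COMMENT_START` and `starts_comment` are made up for illustration.

```python
import re

# ASCII symbol characters (ascSymbol); Unicode symbols are left out of this sketch.
ASC_SYMBOL = r"!#$%&*+./<=>?@\\^|~:\-"

# Two or more dashes that are neither preceded nor followed by a symbol
# character; otherwise the dashes are part of an operator such as `-->`.
COMMENT_START = re.compile(rf"(?<![{ASC_SYMBOL}])-{{2,}}(?![{ASC_SYMBOL}])")


def starts_comment(line: str) -> bool:
    """Return True if the line contains a line comment (strings are ignored)."""
    return COMMENT_START.search(line) is not None


print(starts_comment("-- a comment"))   # dashes followed by a space
print(starts_comment("x --> y"))        # `-->` is an operator
print(starts_comment("--| not a doc"))  # `--|` is an operator, too
```

Under these assumptions, `-- a comment` and `---- rule` count as comments, while `-->`, `--|` and `|--` are treated as operators.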

* [Haskell] Fix comments in strings

A comment terminates string and character literals.

* [Haskell] Add meta.string

This commit adds `meta.string` scope to align with other syntax definitions.

* [Haskell] Scope character literals

This commit scopes single quoted literal characters
`constant.character.literal` to align with recent changes with C#, ... .

That's not directly related with the issue to fix in the PR but test
cases would cause merge conflicts otherwise.

* [Haskell] Fix comments in imports

* [Haskell] Fix illegal infix operator highlighting

This commit addresses one issue of sublimehq#2672.

* [Haskell] Allow space in infix operators

It appears the Haskell compiler allows whitespace between the backticks and
the enclosed identifier. It's not very obvious, but it looks like the
specification allows that as well.
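
As a rough illustration, a pattern that tolerates whitespace inside the backticks could look like the Python sketch below. The pattern and the names `INFIX` and `infix_functions` are assumptions for this example, not the pattern from the syntax definition, and only ASCII letters are considered as leading characters.

```python
import re

# A backtick, optional whitespace, an optionally qualified identifier,
# optional whitespace, and the closing backtick.
INFIX = re.compile(r"`\s*((?:[A-Z][\w']*\.)*[A-Za-z_][\w']*)\s*`")


def infix_functions(line: str) -> list[str]:
    """Return the identifiers used as infix operators in the line."""
    return INFIX.findall(line)


print(infix_functions("a `div` b"))
print(infix_functions("a ` div ` b"))
```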

* [Haskell] Fix identifier patterns

This commit...

1. addresses one issue of sublimehq#2672:
   -> Lines 8,9: the ' not highlighted as part of A

   According to https://www.haskell.org/onlinereport/haskell2010/haskellch10.html

   an identifier may contain any kind of ascii or unicode word character
   including the `'`, which was not implemented before this commit.

   varid →	(small {small | large | digit | ' })⟨reservedid⟩
   conid →	large {small | large | digit | ' }

   This also means `\b` must not be used to terminate patterns as this
   prevents a trailing `'` from being matched as part of the identifier.

2. scopes the `.` between module names `punctuation.accessor`
3. scopes the module name itself `variable.namespace`

Note: Fully qualified identifiers are not scoped `meta.path` at this point.
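
The identifier rules and the namespace/leaf split can be sketched in Python as follows. This is a simplified approximation: `small` and `large` are reduced to ASCII leading characters, reserved identifiers are not excluded, and `QUALIFIED` and `split_qualified` are hypothetical names.

```python
import re

# Zero or more `Module.` prefixes followed by a final identifier.
# `'` is accepted inside identifiers, as in varid and conid.
QUALIFIED = re.compile(r"((?:[A-Z][\w']*\.)*)([A-Za-z_][\w']*)")


def split_qualified(name: str):
    """Split a name into (module path elements, final identifier)."""
    match = QUALIFIED.fullmatch(name)
    if match is None:
        return None
    namespaces = [part for part in match.group(1).split(".") if part]
    return namespaces, match.group(2)


# The module parts would be scoped `variable.namespace`, the dots
# `punctuation.accessor`, and the last element is the identifier itself.
print(split_qualified("Data.Map.lookup'"))
```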

* [Haskell] Fix arrow operator in class definition

This commit addresses one issue of sublimehq#2672:
-> Line 10: the => context operator not highlighted

* [Haskell] Create ident contexts

This commit is to reduce duplicated patterns by moving those candidates
into contexts which can be included wherever needed.

* [Haskell] Highlight C-style directive keywords

This commit ...

1. addresses one issue of sublimehq#2672:
   -> Lines 3,4: directives #if and #endif not highlighted
   
   It adds a capture group to scope keywords of C-style preprocessor
   directives.

   Note: This commit doesn't modify the quite simplistic approach to
   match those. So don't expect much. Maybe it can be addressed once the
   C family has been upgraded.

2. The `keyword.other.preprocessor` scope is renamed to
   `keyword.directive.other`, which is the scope used by Erlang and some
   other recently changed syntaxes for such things.

* [Haskell] Avoid capture groups

* [Haskell] Fix word break pattern

This commit replaces `\b` by `(?![\w'])` as `'` is a valid identifier
character, which would otherwise be matched illegally.

E.g.: `class'` is not a keyword followed by `'` but a normal identifier.
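
The difference between the two boundary checks can be reproduced with Python's `re` module, used here as a stand-in for Sublime Text's regex engine. The names `KEYWORD_WORD_BOUNDARY`, `KEYWORD_FIXED` and `is_keyword` are made up for this sketch.

```python
import re

# `\b` sees a word boundary between `s` and `'`, so `class'` looks like a keyword.
KEYWORD_WORD_BOUNDARY = re.compile(r"class\b")

# The lookahead also rejects a following `'`, which is a valid identifier character.
KEYWORD_FIXED = re.compile(r"class(?![\w'])")


def is_keyword(pattern: re.Pattern, source: str) -> bool:
    """Return True if the pattern matches at the start of the source."""
    return pattern.match(source) is not None


print(is_keyword(KEYWORD_WORD_BOUNDARY, "class' = 1"))  # matched by mistake
print(is_keyword(KEYWORD_FIXED, "class' = 1"))          # correctly rejected
print(is_keyword(KEYWORD_FIXED, "class Eq a where"))    # still a keyword
```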

* [Haskell] Tweak C-style directive keyword scope

Include `#` into keyword scope.

* [Haskell] Fix unit scope consistency

This commit makes sure to use the same scope for the same thing with
regards to units.

* [Haskell] Fix test case

Fixes a regression which broke some highlighting of following
statements silently.

* [Haskell] Add missing {{break}}

* [Haskell] Reorganize Comments

* [Haskell] Reorganize preprocessor contexts

This commit...

1. creates a PREPROCESSOR section
2. moves all relevant contexts into it
3. creates named contexts for all parts
4. moves predefined pragma keys into a variable
5. removes `pragma` context from `type_signature` because preprocessor
   contexts are already included via the prototype.

* [Haskell] Reorganize declaration contexts

This commit...

1. creates sections for each kind of declaration
2. moves relevant contexts into those sections
3. splits the `declaration` context

* [Haskell] Tweak import identifier scopes

This commit scopes the last part of a qualified import module `entity`.

* [Haskell] Reorganize export/import symbols

This commit...

1. renames `module_exports` into `symbols`
2. moves it into a dedicated section
3. creates named context for the body part
4. scopes it `meta.sequence.symbols` as it feels weird to have a
   meta.declaration.export in a meta.import statement.

* [Haskell] Improve module declaration bailouts

This commit ensures that highlighting is maintained for incomplete module
declaration statements.

* [Haskell] Reorganize type signatures context

* [Haskell] Reorganize groups and lists

This commit ...

1. creates a dedicated section for groups and lists
2. renames groups and lists contexts to plural to express non-popping
   behavior
3. creates named contexts for body parts

* [Haskell] Reorganize identifiers

This commit...

1. creates a dedicated section for identifier contexts
2. moves predefined function pattern into a variable
3. renames contexts to use `-` and plural.

* [Haskell] Reorganize literals

This commit...

1. creates a dedicated LITERALS section
2. moves contexts for chars, numbers, strings and language constants into
   the new section
3. renames the contexts to be prepended with `literal-` and use plural.

* [Haskell] Reorganize keywords and operators

* [Haskell] Tweak reserved_id formatting

* [Haskell] Reorganize escape_chars variables

Moves them right after identifiers as this is the order of rules in the
language specification and tweaks pattern formatting to match the rest.

* [Haskell] Rename statement and expression contexts

Use plural to express non-popping behavior.

* [Haskell] Add statement terminators

A semicolon may be used to terminate statements depending on coding style.
It can be omitted if certain layout rules are respected.

* [Haskell] Tweak import declarations

Import declaration statements may span multiple lines. Hence `$` is
removed from bailouts to support that.

Note: fully qualified identifiers still need some tweaks to properly
      scope the leaf.

* [Haskell] Move import declaration tests

Move them right after module declaration tests as this is the order of
contexts in the definition file.

* [Haskell] Scope import keywords `keyword.declaration.import`

* [Haskell] Opt-in to sublime-syntax version 2

There are no incompatible contexts so far. Hence it seems safe to
opt-in to sublime-syntax version 2 now.

* [Haskell] Replace pop: true by pop: 1

Remove legacy stuff from version 2 syntax definitions.

* [Haskell] Add support for code blocks

Depending on code style curly braces may be used to denote block
boundaries.

* [Haskell] Introduce else-pop context

* [Haskell] Fix floating point number scopes

Addresses sublimehq#2630

This commit applies `constant.numeric.value` to the whole value part of
floating point numbers without interrupting by decimal point and moves
the pattern to the `number` context.
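
To illustrate what "the whole value part" means, here is a Python sketch based on the Haskell 2010 float grammar (`decimal . decimal [exponent] | decimal exponent`). It is an approximation for illustration only; `FLOAT` and `float_value` are hypothetical names, and the actual pattern in the syntax definition may differ.

```python
import re

# One match covers the digits, the decimal point and the optional exponent,
# so a single scope can be applied to the whole value.
FLOAT = re.compile(r"\d+\.\d+(?:[eE][-+]?\d+)?|\d+[eE][-+]?\d+")


def float_value(token: str):
    """Return the token if it is a floating point literal, else None."""
    match = FLOAT.fullmatch(token)
    return match.group(0) if match else None


print(float_value("3.14"))
print(float_value("2.5E-3"))
```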

* [Haskell] Fix derived statements

This commit ...

1. Creates a dedicated context for inherited entities
2. fixes sequence scopes

* [Haskell] Add preprocessor punctuation scopes

* [Haskell] Refactor type signature contexts

Declarations may contain so called contexts `[context =>]`. That's what
this commit starts to implement in a simplistic way by adding support
for context tuples in general. They are applied to class declarations
as a first step.

* [Haskell] Rename symbols sequence scope to tuple

Now that we have learned that Haskell knows the concept of tuples, let's scope
those types of sequences as such.

* [Haskell] Remove empty list special pattern

As tuples are scoped `meta.sequence punctuation` only, do the same for
lists as well. It feels odd to scope empty lists `constant.language`.

* [Haskell] Add type and newtype declaration statements

This commit makes sure to correctly match class data types and
variables after `type` and `newtype` keywords, as they are after `class`.

see: https://www.haskell.org/onlinereport/haskell2010/haskellch4.html

* [Haskell] Fix literal chars vs. operators

This commit removes invalid highlighting from character literals, as this
is not how the Haskell compiler works. It caused syntax highlighting
to break in various situations where a leading `'` is used as an operator.

Several test cases for char literals are added as well as those to
illustrate how `'` applies as operator or part of an identifier.

* [Haskell] Tweak test case section headers

* [Haskell] Fix variable identifier

* [Haskell] Add some specification references

* [Haskell] Fix list and tuple constructor highlighting

This commit...

1. removes `,` from operators as `(,)` is a tuple constructor, just as
   the unit expression `()` is.
2. sorts block/group/list/tuple contexts by docs headline numbers.
3. uses branching to distinguish groups and tuples.
4. adds several test cases to verify the changes.

Note: Original syntax of ST and VS Code scopes such constructors
      `constant...`, while this commit handles those as "normal" lists
      and tuples.
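
The decision that branching has to make can be pictured with a small Python sketch: a parenthesized expression is a tuple if it contains a comma at its own nesting level, and a group otherwise. This only illustrates the distinction, not the branching mechanism of the syntax definition. The function name `classify_parens` is made up, and string and character literals are not handled.

```python
def classify_parens(source: str) -> str:
    """Classify a parenthesized expression as 'unit', 'tuple' or 'group'."""
    if not (source.startswith("(") and source.endswith(")")):
        raise ValueError("expected a parenthesized expression")
    inner = source[1:-1]
    depth = 0
    for char in inner:
        if char in "([{":
            depth += 1
        elif char in ")]}":
            depth -= 1
        elif char == "," and depth == 0:
            # A separator at the outermost level makes it a tuple.
            return "tuple"
    return "unit" if inner.strip() == "" else "group"


print(classify_parens("(a, b)"))
print(classify_parens("(a + b)"))
print(classify_parens("(,)"))
```

Nested commas do not count: `((a, b))` is a group that contains a tuple.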

* [Haskell] Highlight fully qualified identifiers

* [Haskell] Fix exported module symbol

A module declaration's export symbol list may contain another module
declaration, which forwards its content.

* [Haskell] Add expression type signatures

According to Chapter 3 'Expressions', any expression may be followed by
an expression type signature, which is denoted by `::`.

Also scope arrows `keyword.operator`, as this is what the syntax specification calls them.

* [Haskell] Fix test case indentation

All tested lines are indented 4 chars.

* [Haskell] Fix default declaration statements

* [Haskel] Fix deriving declaration statements

A `deriving` statement may contain a single constructor identifier or a
tuple of constructors.

* [Haskell] Add some keyword boundary tests

* [Haskell] Add data declaration statements

* [Haskell] Fix instance keyword scope

* [Haskell] Reorganize identifier section

* [Haskell] Fix scope of where keyword

The `where` keyword has the same meaning no matter where it is used.

* [Haskell] Reorganize keywords

This commit...

1. removes duplicated keywords which are already matched in `statements`
2. sorts them logically (by scope)

* [Haskell] Fix symbol and operator patterns

This commit implements operators by strictly following syntax
specification in order to fix various mismatches.

The chapters in question are:

  2.2 Lexical Program Structure
  2.4 Identifiers and Operators

  https://www.haskell.org/onlinereport/haskell2010/haskellch2.html
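
A rough Python sketch of the operator rules from those chapters: reserved operators are a fixed set, constructor operators start with `:`, and runs of dashes start a comment instead. Only ASCII symbols are covered, and the names `SYMBOL`, `RESERVED_OPS` and `classify_operator` are assumptions for illustration.

```python
import re

# ASCII symbol characters; Unicode symbols are omitted in this sketch.
SYMBOL = r"[!#$%&*+./<=>?@\\^|~:\-]"
OPERATOR = re.compile(rf"{SYMBOL}+")

# reservedop as listed in section 2.4 of the Haskell 2010 Report.
RESERVED_OPS = {"..", ":", "::", "=", "\\", "|", "<-", "->", "@", "~", "=>"}


def classify_operator(token: str):
    """Classify a symbol token, or return None if it is not built from symbols."""
    if not OPERATOR.fullmatch(token):
        return None
    if token in RESERVED_OPS:
        return "reserved"
    if re.fullmatch(r"-{2,}", token):
        return "comment"  # two or more dashes only: not an operator
    return "constructor" if token.startswith(":") else "variable"


print(classify_operator("::"))
print(classify_operator(":+:"))
print(classify_operator("<$>"))
```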

* [Haskell] Fix empty and unicode character literals

It turned out that the old pattern only supported some ASCII characters.
This commit strictly implements specifications to fix that.

* [Haskell] Reorganize newtype test cases

* [Haskell] Improve class declaration

* [Haskell] Split import/export symbol lists

As `module` and qualified identifiers are not permitted in import lists
this commit splits `symbols` context to be able to correctly handle that.

* [Haskell] Remove where keyword from declaration scope

* [Haskell] Tweak deriving contexts

* [Haskell] Tweak block comment contexts

* [Haskell] Scope builtin constants

* [Haskell] Improve preprocessor statements

* [Haskell] Add file extensions

* [Haskell] Tweak function declaration

Bail out from the type signature as soon as the function name is found on
the next line.

* [Haskell] Add instance declaration tests

* [Haskell] Fix type signatures

Sequence separators are supported in type tuples/lists only, but not in
top-level type signature content.

* [Haskell] Improve default statement

Ensures the context pops right after the tuple.

* [Haskell] Add forall keyword highlighting

* [Haskell] Rework Haskell Literate

This commit ...

1. Derives Haskell Literate.sublime-syntax from LaTeX.sublime-syntax
   -> Embedded Haskell code blocks are now supported everywhere.
2. Embeds Haskell.sublime-syntax rather than importing it to let ST
   re-use existing definitions and support lazy loading.
3. Adjusts scope names to align with LaTeX.sublime-syntax
4. Add a test file.

* [Haskell] Improve type/newtype declarations

The right hand side after `=` is a type signature. Split type and
newtype declarations as they contain slightly different syntax.

* [Haskell] Add comment test case

Added from another PR.

* [Haskell] Distinguish top-level declarations and statements

Most declarations, such as classes, data, imports, functions, ... may
only appear as top-level statements, which are not supported in nested
code blocks.

This commit therefore avoids matching those top-level declarations in
nested code blocks to reduce false positives and improve performance.

* [Haskell] Add signature statements support

see: https://wiki.haskell.org/Module_signature

* [Haskell] Fix nested record field declarations

Braces within groups and lists are record fields.

* [Haskell] Remove ident-constants

It appears Haskell only knows about variables vs. constructors.

Constructors are data types, either classes or user defined types.
Scoping those constant.other therefore feels odd.

There are some constructors such as `True`, `False` or `Nothing` which
are used as constants though.

This commit therefore scopes all constructors as storage.type except the
ones known to have constant character. It helps keeping syntax
highlighting consistent in ambiguous situations.

* [Haskell] Rename meta.name to variable.other

This commit scopes arbitrary variable identifiers `variable.other`
instead of `meta.name` as meta scopes should not be used to scope
single tokens/identifiers.

Identifiers can be variables or functions, we can't distinguish.

* [Haskell] Simplify `variable.other.generic-type` scope

This commit removes `generic-type` from type variables as they are
handled like any other variable in context of type signatures.

Primary goal is to reduce syntax complexity and maintain scope and
highlighting consistency in ambiguous statements/expressions.

* [Haskell] Fix comments in quoted strings

It turns out dashes in quoted strings (e.g.: "--string") don't start
comments. The assumption was wrong.
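
One way to picture the corrected behavior is a scanner that consumes string literals before it looks for comment dashes. The Python sketch below assumes ASCII symbols only and ignores character literals and block comments; `TOKEN` and `find_comment` are hypothetical names.

```python
import re

SYMBOL = r"[!#$%&*+./<=>?@\\^|~:\-]"

# Alternation order matters: a string literal is consumed as a whole, so
# dashes inside it are never tested against the comment pattern.
TOKEN = re.compile(
    rf'"(?:[^"\\]|\\.)*"|(?P<comment>(?<!{SYMBOL})-{{2,}}(?!{SYMBOL}))'
)


def find_comment(line: str):
    """Return the column where a line comment starts, or None."""
    for match in TOKEN.finditer(line):
        if match.group("comment") is not None:
            return match.start()
    return None


print(find_comment('x = "--string"'))   # no comment inside the string
print(find_comment('x = "a" -- note'))  # comment after the string
```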

* [Haskell] Improve prelude types and variables

This commit...

1. renames prelude/builtin variables
2. adds various builtin classes and types
3. adds missing (optional) keywords
4. reorganizes the identifiers section
5. scopes all prelude classes/constants/types `support....`

Note: `otherwise` and `return` are prelude/builtin functions, not
      reserved keywords, hence they are removed from the `keywords` context.

* [Haskell] Common module/import scopes

This commit simplifies identifier patterns in module declarations and
import statements. All path elements use `ident-namespaces` context and
the last element is scoped `entity.name` no matter whether it is aliased
via `as` keyword or followed by import filters `(...)`.

* [Haskell] Add data family modifiers

* [Haskell] Add foreign import/export statements

https://wiki.haskell.org/Keywords#foreign

* [Haskell] Restrict snippet scopes

* [Haskell] Add class-block

* [Haskell] Ensure reserved unicode operator scopes

* [Haskell] Rework function declarations

This commit ...

1. uses branching to detect function declaration names.

   Any variable or infix operator followed by `::` is a function
   definition in top-level and class-level statements.

2. removes type signature from function declarations as it is hard and
   error prone to find their end. It's also no longer required as type
   variables are scoped `variable.other` like everything else.

This fixes highlighting of function declarations ...
a) whose type signature starts at the next line
b) with a list of function identifiers followed by `::`
c) which don't start at the beginning of a line due to block layout.
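
The rule "any variable or infix operator followed by `::`" can be sketched as a single-line Python check. The syntax definition uses branching and also handles signatures that continue on the next line, which this sketch does not attempt. It also does not exclude reserved words, and the names `DECLARATION` and `declared_names` are made up.

```python
import re

SYMBOL = r"[!#$%&*+./<=>?@\\^|~:\-]"
# A variable identifier or an operator wrapped in parentheses.
NAME = rf"(?:[a-z_][\w']*|\(\s*{SYMBOL}+\s*\))"
# Optional indentation, a comma separated list of names, then `::`
# which must not continue as a longer operator such as `:::`.
DECLARATION = re.compile(rf"\s*({NAME}(?:\s*,\s*{NAME})*)\s*::(?!{SYMBOL})")


def declared_names(line: str) -> list[str]:
    """Return the names declared by a type signature line, if any."""
    match = DECLARATION.match(line)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",")]


print(declared_names("foo, bar :: Int -> Int"))
print(declared_names("    (<+>) :: a -> a -> a"))
```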

* [Haskell] Tweak type-content

* [Haskell] Add extra bailouts to lists,groups,tuples

* [Haskell] Simplify class declarations

It is no longer needed to distinguish context and signature.

* [Haskell] Simplify data declarations

* [Haskell] Simplify derived() and via() statements

Replace `entity.name.inerhited-class` by `storage.type` as those tokens
may be present at other places, too. The `entity.` scope is of no use in
this syntax definition, while scoping all data types the same way makes
it easier to maintain consistent highlighting in hashed color schemes.

* [Haskell] Reorganize records contexts

* [Haskell] Simplify type declarations

* [Haskell] Add tests to verify instance methods

* [Haskell] Distinguish type groups and type tuples

This commit intends to create consistent group vs. tuple highlighting in
both normal and type expressions.

* [Haskell] Simplify type signatures

* [Haskell] Fix infix declarations

`infix` is a declaration keyword, not an operator.

* [Haskell] Tweak quoted infix operators

* [Haskell] Add a sophisticated group test

* [Haskell] Tweak module and import statements

* [Haskell] Simplify quasi quotes

* [Haskell] Rename type-content

* [Haskell] Rename variable-prefix from builtin_ to prelude_

* [Haskell] Merge sequence separator patterns

* [Haskell] Add a note upon forall .

* [Haskell] Update Symbol Lists

This commit...
1. Removes the normal Symbol List.tmPreferences as function declarations
   are already covered by ST's Default package. That's possible because
   function declarations are scoped `meta.function entity.name.function`.
2. Adds a symbol list for module declarations.

* [Haskell] Tidy up tmPreferences

This commit...
1. renames Indent Patterns to `Indentation Rules.tmPreferences`
2. removes name parts from tmPreferences files.

* [Haskell] Remove word separator settings

As `'` is used to denote character literals it should probably not be
removed from word_separators, even though it is a legal identifier
character as well.

* [Haskell] Fix module scope in export list

* [Haskell] Tweak module declaration statement

Align context usage strategy and scope boundaries with class declarations.

* [Haskell] Scope Just/Left/Right as prelude type

Those are builtin constructors of special meaning, not constants.
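
The resulting scoping of identifiers, as described in this thread, can be summarized with a small Python lookup. The scope names are abbreviated to the prefixes mentioned in the discussion, the sets contain only the examples named there, and `scope_for` is a hypothetical helper rather than part of the syntax definition.

```python
# Examples named in the discussion; the real lists are longer.
PRELUDE_CONSTANTS = {"True", "False", "Nothing"}
PRELUDE_TYPES = {"Just", "Left", "Right", "String", "Int64"}


def scope_for(identifier: str) -> str:
    """Return the scope prefix an identifier would receive in this sketch."""
    if identifier in PRELUDE_CONSTANTS:
        return "support.constant"
    if identifier in PRELUDE_TYPES:
        return "support.type"
    if identifier[:1].isupper():
        # Any other constructor or user defined type.
        return "storage.type"
    return "variable.other"


for name in ("Nothing", "Just", "MyType", "lookup'"):
    print(name, "->", scope_for(name))
```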

* [Haskell] Add some pattern/function binding tests

This commit adds some missing tests for bindings within classes.

A statement like `variable { patterns } | { guards } =` may be a pattern
binding or a function binding. A pattern or guard may consist of
arbitrary expressions. It is therefore hard to distinguish them without
adding sophisticated context switches, which tend to be error-prone due
to the lack of reliable boundary detection.

Thus both use `variable.other` only for now.

see:
4.4.3 Function and Pattern Bindings
https://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-800004.4
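
For illustration, a minimal sketch of the two binding forms from section 4.4.3 of the report. All names (`classify`, `quotient`, `remainder`, `limit`) are hypothetical and not taken from the syntax tests. Note how the guarded pattern binding `limit` and the guarded function binding `classify` start the same way, which is the ambiguity described above.

```haskell
-- Function binding: `classify` takes an argument and selects a result by guards.
classify :: Int -> String
classify n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | otherwise = "positive"

-- Pattern binding: the left-hand side is a tuple pattern, not a function head.
quotient, remainder :: Int
(quotient, remainder) = 17 `divMod` 5

-- Simple pattern binding with guards: a variable followed directly by `|`.
limit :: Int
limit
  | quotient > 2 = 100
  | otherwise    = 10
```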

* [Haskell] Add exported type test

Add a test case to ensure scope of exported data types.

Exports are considered references and thus don't use `entity.name`.

* [Haskell] Limit constant scope in pragmas to OPTIONS

* [Haskell] Fix infix operators in parentheses

Operators may be wrapped into parentheses to use them in uncommon
positions, but this does not turn the whole parenthesized expression
into a function. It remains a normal group that contains just a single
operator.
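
A short sketch of the positions in which a parenthesized operator can appear in Haskell source. The names `total`, `difference` and the operator `|>` are hypothetical examples, not taken from the test files.

```haskell
-- A parenthesized operator passed as an argument.
total :: Int
total = foldr (+) 0 [1, 2, 3]

-- A parenthesized operator applied in prefix position.
difference :: Int
difference = (-) 10 4

-- A parenthesized operator as the head of a signature and a declaration.
(|>) :: a -> (a -> b) -> b
(|>) x f = f x
```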

* [Haskell] Add tests for `let` expressions

This commit adds some tests to illustrate some sophisticated situations
which would need to be handled in a more complex implementation.

* [Haskell] Add some real world function declaration tests

This is to illustrate what complex constructs need to be handled well.

* [Haskell] Highlight SPECIALIZE pragma values

* [Haskell] Scope exported symbols `entity.name.export`

This commit scopes all exported entities the same way so that we can
create a "Symbol List - Exports". The kind of entity an exported symbol
refers to is not taken into account.

Note: ST doesn't provide kind info in Goto Symbol Quick Panels so far.

* [Haskell] Rework import statements

This commit...

1. Adds detailed `meta.import.[module|alias|filter]` scopes to each
   import term.
2. Scopes the whole import module identifier `variable.namespace`
   because `entity.name.namespace` caused all imports to be globally
   indexed, which is neither useful nor correct.
3. Scopes all kinds of imported symbols `entity.name.import`
   regardless of their possible type.

* [Haskell] Add support for MagicHash

see: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/magic_hash.html
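
As an illustration of what the extension allows, a minimal sketch that uses `#`-suffixed identifiers and literals. It assumes GHC with the `GHC.Exts` module from `base`; the function names `addUnboxed` and `three` are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, (+#))

-- With MagicHash, names such as `I#`, `x#` and `+#` are valid identifiers.
addUnboxed :: Int -> Int -> Int
addUnboxed (I# x#) (I# y#) = I# (x# +# y#)

-- `3#` is an unboxed integer literal of type `Int#`.
three :: Int
three = I# 3#
```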

* [Haskell] Add operators to indexed reference list

* [Haskell] Add operator declarations to indexed symbol list

* [Haskell] Tweak infix operators

This commit...

1. scopes the whole backticked region (e.g.: ` opid `) as `meta.infix`
2. restricts `keyword.operator` to `opid` only
3. adapts punctuation scopes.
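
A minimal sketch of the Haskell constructs involved: backticked identifiers used as infix operators, a fixity declaration for one of them, and a qualified backticked name. The names `plus`, `half`, `sumThenScale` and `startsWith` and the alias `L` are hypothetical.

```haskell
import qualified Data.List as L

-- Fixity declaration for a backticked identifier.
infixl 6 `plus`

plus :: Int -> Int -> Int
plus a b = a + b

-- An ordinary function used as an infix operator.
half :: Int
half = 7 `div` 2

-- `*` (infixl 7) binds tighter than `plus` (infixl 6): 1 `plus` (2 * 3).
sumThenScale :: Int
sumThenScale = 1 `plus` 2 * 3

-- A qualified identifier inside backticks.
startsWith :: Bool
startsWith = "ab" `L.isPrefixOf` "abc"
```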

* [Haskell] Scope forall unicode keyword the same as its ascii counterpart

* [Haskell] Scope forall-expression terminator as punctuation

* [Haskell] Add exported operators to index

* [Haskell] Add missing builtin unboxed types

* [Haskell] Add sql like list comprehension keywords

* [Haskell] Add binary integer and hexadecimal float literals

* [Haskell] Add support for underscore in numeric literals

see: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/numeric_underscores.html
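
For reference, a small sketch of the literal forms covered by this commit and the previous one. It assumes a GHC version that supports the `BinaryLiterals`, `HexFloatLiterals` and `NumericUnderscores` extensions; the names are hypothetical.

```haskell
{-# LANGUAGE BinaryLiterals, HexFloatLiterals, NumericUnderscores #-}

-- Binary integer literal with an underscore separator (decimal 170).
mask :: Int
mask = 0b1010_1010

-- Decimal literal with digit-group underscores.
million :: Int
million = 1_000_000

-- Hexadecimal float literal: (1 + 8/16) * 2^1.
threeAsDouble :: Double
threeAsDouble = 0x1.8p1
```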

* [Haskell] Add "Safe" pragma constant

see: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/safe_haskell.html

* [Haskell] Add safe imports

* [Haskell] Remove obsolete context

* [Haskell] Update pragma value highlighting

* [Haskell] Add SPECIALIZE INLINE pragma support

* [Haskell] Add SPECIALIZE instance pragma support

* [Haskell] Add COMPLETE pragma

* [Haskell] Add COLUMN pragma

* [Haskell] Tweak symbol list and index definitions

This commit...

1. moves all index definitions to:
   a) Indexed Symbol List
   b) Indexed Reference List

2. uses "Symbol List - ...tmPreferences" for symbol list definitions,
   only.
3. removes "Symbol List - Modules" as `entity.name.namespace` is
   already defined as symbol/index by Default.sublime-package.
4. Adds some entries to syntax tests to verify goto definition.

* [Haskell] Tweak Pragma Directives

1. Sort directives by definition in GHC 6.20
2. Remove old misspelled SPECIALISE directive
3. rename pragma_keys to pragma_directives

* [Haskell] Improve FFI capabilities

* [Haskell] Tweak specification comments

Mark all links to Haskell's Gitlab User Guide as `GHC`.

* [Haskell] Tweak prefix operators

This commit tries to create parity with backticked infix operators.

As those are scoped `meta.infix` this commit turns operators in prefix
position into `meta.prefix` rather than scoping them as an ordinary
group.

This enables color schemes to apply special highlighting if desired.

* [Haskell] Tweak type definition identifier scopes

This commit...

1. scopes declared type/newtype identifiers `entity.name.type`
2. adds those to symbol list and indexed symbol list.

* [Haskell] Tweak data definition identifier scopes

This commit scopes the first constructor identifier after `data` keyword
`entity.name.type`.

* [Haskell] Fix data definition identifier context

Now that we scope declared data types `entity.name` we need to make sure
the `context =>` part is parsed correctly. Otherwise the wrong token may
be scoped as `entity.name`.

* [Haskell] Fix type definition identifier context

Now that we scope declared data types `entity.name` we need to make sure
the `context =>` part is parsed correctly. Otherwise the wrong token may
be scoped as `entity.name`.

* [Haskell] Scope class/instance declaration identifiers

This commit scopes declared class and instance types `entity.name.class`.

* [Haskell] Move tests to dedicated directory

* [Haskell] Move `otherwise` to constants

sublimehq#2679 (comment)

* [Haskell] Scope fully qualified infix operators

Resolves sublimehq#2679 (comment)

* [Haskell] Scope list and tuple constructors as constants

This commit scopes `()`, `(,)` and `[]` as language constants.

Satisfies sublimehq#2679 (comment)

* [Haskell] Tweak scopes for empty tuples and lists

This commit...

1. partly reverts the former commit
2. scopes all empty
   * tuples `()` meta.sequence.tuple.empty
   * lists `[]` meta.sequence.list.empty

   This way those are scoped as any other sequence including punctuation
   by default, while also enabling color schemes to highlight those in
   special ways without stacking too many scopes onto each other which
   might cause conflicts with regards to `constant` vs. `punctuation`.

   Such strategies also failed in other situations such as function
   declarations and may not ease color scheme development.

   Note:
   The `empty` sub-scope is used for such tokens in Python and PHP too.

related discussion:
 - sublimehq#2679 (comment)

* [Haskell] Scope :: as punctuation.separator.type

The new scope is also used by C#, TypeScript, PHP and Rust for tokens
of comparable meaning.

Related discussion:
sublimehq#2679 (comment)

* [Haskell] Scope => punctuation.separator.type.context

The "big arrow" `=>` is used to terminate a type's context expression.
This commit interprets it as separator between the context and the rest
of the type expression.

Related discussion:
sublimehq#2679 (comment)

* [Haskell] Fix anonymous variables

Underscore has special meaning and may not be suffixed by the unboxed
modifier.

fixes: sublimehq#2679 (comment)

* [Haskell] Fix lists vs. quasi-quotations

Fixes sublimehq#2679 (comment)

* [Haskell] Add "type family" support

Fixes sublimehq#2679 (comment)

* [Haskell] Fix identifiers directly after numbers

Resolves sublimehq#2679 (comment)

* [Haskell] Add GHC 9 LANGUAGE pragma values

Resolves sublimehq#2679 (comment)

* [Haskell] Add profiling inline directives

* [Haskell] Add more LANGUAGE flags

* [Haskell] Add overloaded and typed quotations

Added language feature of GHC 9

see: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0246-overloaded-bracket.rst

* [Haskell] Add first_line_match

* [Haskell] Fix multi-line strings

Fixes sublimehq#2918

* [Haskell] Improve multi-line strings fix
@moodmosaic

Great work, all! Was this released with BUILD 4142?

@jrappen

jrappen commented Nov 15, 2022

as far as I can tell 4130+
