
C# support #156

Closed
patrickt opened this issue Jun 14, 2019 · 36 comments
Labels
enhancement (New feature or request), help wanted (Up for grabs), language-support (Language support in general, e.g. new languages, etc.)

Comments

@patrickt
Contributor

A lot of people asked for this on Twitter.

@patrickt added the enhancement and language-support labels Jun 14, 2019
@robrix added the help wanted label Jun 14, 2019
@KvanTTT

KvanTTT commented Jul 20, 2019

Maybe it makes sense to use Roslyn code analyzer internally.

@warrenbuckley

What can we, the community, do to help move this forward, @patrickt & @robrix?

@robrix
Contributor

robrix commented Aug 19, 2019

Thanks for asking! Improvements to the C# parser are probably the first step; see https://github.com/github/semantic/blob/master/docs/adding-new-languages.md for more details.

There’s still more work to be done on the semantic side to actually generate tags for precise ASTs (vs. our existing support for à la carte ASTs), so that would be another step, albeit a more specialist one; to be clear, we intend to do this ourselves as well, but it’s not at the head of the queue just yet 😊

@jongalloway
Contributor

I saw there's a partially complete tree-sitter for C# here: https://github.com/tree-sitter/tree-sitter-c-sharp

@petrroll

petrroll commented Aug 30, 2019

Speaking of grammars, there's an official, up-to-date generated grammar for C# here: https://github.com/dotnet/roslyn/blob/master/src/Compilers/CSharp/Portable/Generated/CSharp.Generated.g4

@patrickt
Contributor Author

patrickt commented Nov 4, 2019

Converting the above grammar to tree-sitter would be straightforward, if uninteresting, to do manually, and could possibly be automated. Note that we can generate tags for precise ASTs now, so semantic could do useful work with C# given a sufficiently-complete grammar.

@damieng

damieng commented Nov 11, 2019

@patrickt I actually tried converting the ANTLR C# grammar to tree-sitter a couple of years back, but doing so causes a whole bunch of issues, which are documented in the tree-sitter docs here: https://tree-sitter.github.io/tree-sitter/creating-parsers#writing-the-grammar

@damieng

damieng commented Nov 11, 2019

I'm going through the C# tree-sitter grammar this week to document the missing pieces so they can be put 'up for grabs'. There's not much left to do, but some of the actual statement-level stuff requires some thought around order of precedence, which isn't part of the official C# grammar and so needs to be figured out.

@Danielku15

The ANTLR Grammar in the Roslyn repo is generated with this tool.

I think the goal should be to use the Roslyn capabilities to auto-generate as much as possible, to keep everything up to date with the evolution of C#. Maybe it is even worth opening a feature request on their side so that they really maintain the grammar. I mean, GitHub belongs to Microsoft, I bet they're happy to invest these resources to support C# and VB properly in the code navigation of GitHub 😄

@petrroll

petrroll commented Nov 15, 2019

I think the the grammar is automatically updated but @CyrusNajmabadi should know more.

@CyrusNajmabadi

"I the the grammar"?

@CyrusNajmabadi

Let me know what info I can assist with. As @petrroll mentioned, I wrote the tool that takes the official syntactic model for the Roslyn C# system and produces the .g4 grammar for it. I'm also well versed in the actual official language-spec grammar, the de facto parser impl, as well as tons of the quirky corner cases the language has that make it interesting for parsing tools.

Cheers!

@CyrusNajmabadi

CyrusNajmabadi commented Nov 15, 2019

Maybe it is even worth opening a feature request on their side so that they really maintain the grammar

The grammar is maintained. And it's done so in an automated fashion**. In fact, it's generated from the actual impl that is used by Roslyn to parse and produce a syntactic model used by the compiler and IDE tooling. So, in effect, this grammar is the most real representation of what C# actually is, i.e. the spec might be incorrect or out of date. This grammar is truly what Roslyn is saying its syntactic model can represent.

** Specifically, when any changes happen to the syntactic-model in any Roslyn branch, this tool will take that and regen the grammar. That means you can go into any in-development branch and even see the grammar changes that are happening. For example, in the https://github.com/dotnet/roslyn/tree/features/local-function-attributes branch, you can see that the grammar/syntactic-model now supports all these changes:

dotnet/roslyn@master...features/local-function-attributes#diff-2b6f98b9a71d0ace7649354169675100

Because this is generated automatically, and from the official syntactic model, it's always maintained and always correct for whatever git view of the world you're looking at.

I hope this helps :)

@CyrusNajmabadi

Looking at tree-sitter, my grammar is definitely not applicable. For example, it's not LR(1). I despise systems that enforce these sorts of arbitrary restrictions on your grammar. It's not hard to parse arbitrary CFGs. We're not in the 60s, where it was absolutely critical to save every byte and things like lookahead would kill your performance. We have the perf to generate parsers from arbitrary CFGs that can still operate on the largest reasonable files you find in practice in milliseconds.

Anyways, rant over. I just wish people would focus on tools that fit the natural grammars that languages want to be written in, rather than forcing people to contort their grammars into the limited ones that tools have support for :)

@petrroll

I the grammar need more sleep otherwise I start dropping random words in sentences.

@damieng

damieng commented Nov 15, 2019

Looking at tree-sitter, my grammar is definitely not applicable. For example, it's not LR(1).

That's what #156 (comment) was getting at.

I'm going to go through tree-sitter-c-sharp and break out the remaining pieces. If we want C# syntax parsing it's the quickest way forward. It is probably 75-85% done.

@CyrusNajmabadi if you look into the tree-sitter docs you'll see why this approach was taken. Tree-sitter is an incremental parser originally designed for Atom etc. and needs to recover from invalid state as well as support snippet formatting. It's not supposed to be a 100% map to a formal grammar the compiler uses.

@damieng

damieng commented Dec 1, 2019

The C# tree-sitter grammar is now parsing 99.7% of the 1756 C# source files that make up two popular, good-sized .NET libraries (JSON.NET and NUnit).

@damieng

damieng commented Feb 12, 2020

@robrix Is there anything else you need from the tree-sitter C# grammar to help push this forward?

@patrickt
Contributor Author

patrickt commented Apr 8, 2020

@damieng Glad to see that the tree-sitter C# grammar is doing well! The next piece would be to create a new tree-sitter-c-sharp Haskell project within https://github.com/tree-sitter/haskell-tree-sitter/ - this is pretty straightforward, and can be largely copy-pasted from the other such projects. (All those packages need to expose is a getNodeTypesFile and the bridged FFI grammar.)

(Apologies for the late response; Rob is on leave.)

@damieng

damieng commented Apr 9, 2020

In-progress PR at tree-sitter/haskell-tree-sitter#276

@rygwdn

rygwdn commented May 20, 2020

I see that tree-sitter/haskell-tree-sitter#276 was merged. What's next for this issue?

@patrickt
Contributor Author

@rygwdn We need a semantic-c-sharp package in github/semantic, along the lines of #551.

@damieng

damieng commented May 23, 2020

I've been working on this but the tree-sitter-c-sharp dependency actually fails.

Does the tree-sitter-c-sharp package need to be published somewhere for this to work?

@patrickt
Contributor Author

@damieng Use Cabal’s source-repository-package feature and pin it to a Git hash: https://cabal.readthedocs.io/en/latest/cabal-project.html#specifying-packages-from-remote-version-control-locations
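
A minimal sketch of what such a stanza in cabal.project could look like. It assumes the package lives in a tree-sitter-c-sharp subdirectory of the haskell-tree-sitter repository; the tag value is a placeholder rather than a real commit, and should be replaced with the full hash being pinned:

```cabal
-- Hypothetical stanza: fetch tree-sitter-c-sharp straight from Git
-- instead of Hackage, pinned to one specific commit.
source-repository-package
  type: git
  location: https://github.com/tree-sitter/haskell-tree-sitter.git
  -- Placeholder: substitute the full 40-character commit hash.
  tag: <commit-hash>
  -- Assumed subdirectory name within the repository.
  subdir: tree-sitter-c-sharp
```

Pinning to a hash rather than a branch name keeps builds reproducible, since the dependency cannot change underneath the project until the hash is updated.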

@warrenbuckley

warrenbuckley commented Aug 4, 2020

With the blog post on improved language support, is there anything the community can do to help move the C# language feature along in any way, or is the GitHub team actively working on C# language support, @patrickt?

https://github.blog/2020-08-04-codegen-semantics-improved-language-support-system/

Also, with Microsoft's IntelliCode Rich Code Navigation Private Preview, would it be something that could analyse/index the C# files and potentially help with this?

https://visualstudio.microsoft.com/services/rich-code-navigation/

https://github.com/microsoft/RichCodeNavIndexer

/cc @jepetty perhaps you could advise/help here?

@mdentremont

Is there any chance progress has been made? (don't mean to pester, we're in the middle of switching to GitHub and this will help a lot)
Thanks!

@patrickt
Contributor Author

@mdentremont Apologies, but we’re still working out issues on our backend—storing this information takes a great deal of space. I will update this issue when we’re ready! For the time being, those interested should take a look at the tree-sitter query, as our inexact (fuzzy) code navigation has been ported to use tree-sitter queries. However, future work will depend on semantic integration, so this issue is still worth pursuing.

@bas

bas commented Sep 10, 2020

Great to see there is some progress, as an Azure DevOps customer (Vestas) requested this feature in discussions around moving to GitHub.

@am11

am11 commented Sep 22, 2020

Code Navigation option is now enabled for C# sources on GitHub, but the language is not yet listed in the docs.

@slang25
Contributor

slang25 commented Sep 22, 2020

It was also enabled last week under https://github.com/dotnet/aspnetcore but then disabled again the same day. I suspect there is testing happening on a handful of repos?

@bas

bas commented Sep 23, 2020

@am11 that is great, I can see it is enabled on the roslyn repo. What is the status of this? Will it become available to all repos any time soon?

@patrickt
Contributor Author

Stay tuned… 🤫

@patrickt
Contributor Author

patrickt commented Oct 19, 2020

Check it out, y’all: https://github.blog/changelog/2020-10-19-code-navigation-for-c-repositories/

@rygwdn

rygwdn commented Oct 21, 2020

@patrickt does anything need to be done to enable it? It doesn't appear to work yet in my organization's main repo.

@dcreager
Contributor

@rygwdn Have you pushed recently to the repos in question? Right now our indexing pipeline is only triggered when a new push arrives. If that doesn't work you can open a support ticket with your org / repo names and we can trigger an index run manually.

@damieng

damieng commented Jun 17, 2021

I think this issue can be closed now?
