Discussion: Additional sources of truth for wrapper generation #21

Open
mikolasstuchlik opened this issue Aug 2, 2021 · 3 comments
mikolasstuchlik commented Aug 2, 2021

Abstract

This issue opens a discussion about adopting a second source of truth for wrapper generation: the C and Swift API itself. The goal of introducing it is to eliminate all post-processing.

The current state

Currently, gir2swift generates wrappers from two inputs: the input files in the Swift wrapper repository (excluding .awk and .sed files) and the .gir files located in the filesystem. The input files in the repository are common to all platforms, whereas the .gir files are expected to reflect the API present on a given platform.

Unfortunately, .gir files do not always correctly reflect the API present on a given platform[1]. We therefore need post-processing and other patches to make the generated Swift code compile.

Interacting with the C API

The Clang compiler, which is used as a part of the Swift compiler[2], provides a small API called libclang[3]. libclang could be used to "parse" the C APIs for us, but we would need to mimic the Clang Importer[4]. I could not find any comparable API for the Swift compiler[5].

One of the most important jobs of an IDE is to perform semantic analysis of the program being written and to give the user additional information, such as suggestions, warnings, and whether the code is correct. For that purpose, the Swift developers provide a tool called sourcekit-lsp[6], which can read the C and Swift API and provide an IDE with these services.

Taking advantage of the sourcekit-lsp

It would be unreasonable to work with sourcekit-lsp in the same manner as an IDE does (i.e., using JSON-RPC to query the LSP daemon for information). Instead, I have focused on the libraries that serve as the source of truth for the LSP daemon itself.

One such library is indexstore-db[7]. indexstore-db can interpret the index that the Swift compiler creates during compilation[8]. I have been able to modify the SwiftGLib repository and create a demo project that prints the list of symbols: https://github.com/mikolasstuchlik/GLibISDemo

I have only scratched the surface of the information sourcekit-lsp provides (at the moment, I am only able to get the list of symbols and their kinds), but it should be possible to get things like the argument types of a function, the fields of a structure, etc.

Incorporating into the gir2swift

Incorporating indexstore-db itself (or any other tool used by sourcekit-lsp) into gir2swift as a dependency may be tricky. indexstore-db may depend on a specific version of the Swift compiler and is a rather big dependency.

Alternatively, we could develop the C API reading part of gir2swift in a separate repository (let's call it c-api-index) that would run before gir2swift. When finished, it would store a metadata file containing all the information we need to verify the validity of a .gir file, such as a list of existing symbols and their kinds.
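
The verification step could look roughly like the sketch below. It is written in Python only for brevity. The metadata format (a JSON object mapping symbol names to kinds), the sample symbols, and the function names load_index and filter_gir_symbols are all assumptions made for illustration; none of them is an existing gir2swift or c-api-index interface.

```python
import json

# Hypothetical metadata as a c-api-index run might emit it:
# symbol name -> kind, for symbols actually present on this platform.
SAMPLE_INDEX = json.dumps({
    "g_object_ref": "function",
    "g_object_unref": "function",
    "GObject": "struct",
})


def load_index(text):
    """Parse the metadata text into a {name: kind} dictionary."""
    index = json.loads(text)
    if not isinstance(index, dict):
        raise ValueError("expected a JSON object mapping symbol names to kinds")
    return index


def filter_gir_symbols(gir_symbols, index):
    """Split (name, kind) pairs claimed by a .gir file into confirmed and rejected names."""
    confirmed, rejected = [], []
    for name, kind in gir_symbols:
        # A symbol is confirmed only if the index knows it under the same kind.
        if index.get(name) == kind:
            confirmed.append(name)
        else:
            rejected.append(name)
    return confirmed, rejected


# Symbols a .gir file claims to exist (made-up example input).
gir_symbols = [
    ("g_object_ref", "function"),
    ("g_unix_only_call", "function"),  # absent from the index above
    ("GObject", "struct"),
]
confirmed, rejected = filter_gir_symbols(gir_symbols, load_index(SAMPLE_INDEX))
print("confirmed:", confirmed)
print("rejected:", rejected)
```

With a filter like this, gir2swift would skip the rejected symbols during generation instead of patching the generated code afterwards.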

Summary

If the proposed tool is implemented, it will effectively replace post-processing with pre-processing. The promise of such a tool, however, is that there would be less need for manual configuration (and thus maintenance). What other features should such a tool have?

[1] https://discord.com/channels/791751777598570606/791751778092711941/870776594329915505
[2] The Clang compiler is distributed with the Swift compiler as part of the Swift toolchain.
[3] https://clang.llvm.org/doxygen/group__CINDEX.html
[4] https://swift.org/swift-compiler/
[5] I have found libswift (https://github.com/apple/swift/tree/286d22b2e6e282ef08dc810580c1ea94aac9ab08/libswift), but this module is a part of the Swift compiler that is written in Swift.
[6] https://github.com/apple/sourcekit-lsp
[7] https://github.com/apple/indexstore-db
[8] Therefore we need to run the compiler before the wrapper generation process even starts.

mikolasstuchlik commented Aug 4, 2021

This issue was discussed on Discord with @kabiroberai.

First message: https://discord.com/channels/791751777598570606/791751778092711941/872087233811853312
Last message: https://discord.com/channels/791751777598570606/791751778092711941/872404886833229824

Summary

Kabir suggested using SourceKit via SourceKitten (https://github.com/jpsim/SourceKitten), a commonly used tool. Kabir managed to create an example using SwiftGLib:

# macOS
$ sourcekitten module-info --module CGLib -- -I"${PWD}/Sources" $(printf " -Xcc %s" $(pkg-config --cflags gio-unix-2.0)) -sdk "$(xcrun --show-sdk-platform-path)/Developer/SDKs/MacOSX$(xcrun --show-sdk-version).sdk"

# Linux
$ sourcekitten module-info --module CGLib -- -I"${PWD}/Sources" $(printf " -Xcc %s" $(pkg-config --cflags gio-unix-2.0))

The output is JSON roughly 800K lines long and contains embedded XML strings. It will take some time for me to get familiar with the data structure.
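
A first pass over such output could be sketched as follows. This is a hedged example, not a description of the real data: the sample document is hand-written, and the key names (key.entities, key.kind, key.name, key.fully_annotated_decl) are assumptions based on SourceKit naming conventions that would need to be checked against the actual module-info output. The helper names walk and plain_declaration are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written stand-in for a tiny fragment of the JSON; key names are assumptions.
SAMPLE = json.dumps({
    "key.entities": [
        {
            "key.kind": "source.lang.swift.decl.function.free",
            "key.name": "g_free(_:)",
            "key.fully_annotated_decl": (
                "<decl.function.free>func <decl.name>g_free</decl.name>("
                "<decl.var.parameter>_ mem: "
                "<decl.var.parameter.type>gpointer!</decl.var.parameter.type>"
                "</decl.var.parameter>)</decl.function.free>"
            ),
        },
        {
            "key.kind": "source.lang.swift.decl.struct",
            "key.name": "GError",
            "key.entities": [
                {"key.kind": "source.lang.swift.decl.var.instance", "key.name": "code"},
            ],
        },
    ]
})


def walk(entities, depth=0):
    """Yield (depth, entity) for every entity, descending into nested entities."""
    for entity in entities:
        yield depth, entity
        yield from walk(entity.get("key.entities", []), depth + 1)


def plain_declaration(entity):
    """Strip the XML markup from the embedded declaration string, if there is one."""
    xml_text = entity.get("key.fully_annotated_decl")
    if xml_text is None:
        return None
    return "".join(ET.fromstring(xml_text).itertext())


document = json.loads(SAMPLE)
symbols = [
    (depth, entity["key.kind"], entity["key.name"], plain_declaration(entity))
    for depth, entity in walk(document.get("key.entities", []))
]
for depth, kind, name, declaration in symbols:
    print("  " * depth + name, kind, declaration)
```

The two-step shape (decode the JSON first, then parse each embedded XML string separately) is the point of the sketch; the set of keys worth extracting would be decided once the real structure is understood.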

SourceKitten does not require the package to be built, but it needs the path to libsourcekitdInProc.so on Linux and to the SDK on macOS.

mikolasstuchlik commented Sep 9, 2021

# Linux
sourcekitten module-info --module Swift -- ""
sourcekitten module-info --module Foundation -- ""
sourcekitten module-info --module Glibc -- ""
sourcekitten module-info --module CGLib -- -I"$PWD"/Sources $(printf " -Xcc %s" $(pkg-config --cflags gio-unix-2.0))

# macOS
sourcekitten module-info --module Swift -- -sdk "$(xcrun --show-sdk-platform-path)/Developer/SDKs/MacOSX$(xcrun --show-sdk-version).sdk"
sourcekitten module-info --module Foundation -- -sdk "$(xcrun --show-sdk-platform-path)/Developer/SDKs/MacOSX$(xcrun --show-sdk-version).sdk"
sourcekitten module-info --module Darwin -- -sdk "$(xcrun --show-sdk-platform-path)/Developer/SDKs/MacOSX$(xcrun --show-sdk-version).sdk"
sourcekitten module-info --module CGLib -- -I"${PWD}/Sources" $(printf " -Xcc %s" $(pkg-config --cflags gio-unix-2.0)) -sdk "$(xcrun --show-sdk-platform-path)/Developer/SDKs/MacOSX$(xcrun --show-sdk-version).sdk"
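
Since the invocations above differ only in the module name and the trailing compiler arguments, a wrapper tool could assemble them programmatically. The sketch below mirrors the argument layout of those commands; the function name module_info_command is hypothetical, and the include directory and C flags are placeholders standing in for "${PWD}/Sources" and the real pkg-config output.

```python
import shlex


def module_info_command(module, include_dir=None, cflags=(), sdk_path=None):
    """Build the argv for `sourcekitten module-info`, following the commands above."""
    compiler_args = []
    if include_dir is not None:
        compiler_args.append("-I" + include_dir)
    for flag in cflags:
        # Each C flag is forwarded to Clang with its own -Xcc prefix.
        compiler_args += ["-Xcc", flag]
    if sdk_path is not None:
        compiler_args += ["-sdk", sdk_path]
    if not compiler_args:
        # The Linux commands for Swift, Foundation and Glibc pass a single empty argument.
        compiler_args = [""]
    return ["sourcekitten", "module-info", "--module", module, "--"] + compiler_args


# Placeholder flags; on a real system they would come from
# `pkg-config --cflags gio-unix-2.0`.
placeholder_cflags = shlex.split("-I/usr/include/glib-2.0 -pthread")

linux_cglib = module_info_command("CGLib", include_dir="Sources", cflags=placeholder_cflags)
linux_swift = module_info_command("Swift")
print(linux_cglib)
print(linux_swift)
```

The resulting list can be passed directly to subprocess.run, which avoids the shell quoting that the printf construct handles in the commands above.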

@mikolasstuchlik

For anyone interested, MR mikolasstuchlik/g2swift#1 has been merged into my PoC repository g2swift. It implements a call to SourceKit and parses its output.

This demonstrates how responses from SourceKit could be parsed. In the future, I'll try to extract useful information about the environment.
