Proposal: Restructure DataFusion site #1821

Closed
matthewmturner opened this issue Feb 13, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@matthewmturner
Contributor

matthewmturner commented Feb 13, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I am proposing to restructure the DataFusion site, drawing some inspiration from the Arrow site (https://arrow.apache.org/docs). Assuming we reach consensus on some variant of this, I think this issue would work best as a parent / tracker issue, since completing it will involve many changes.

Describe the solution you'd like

I think it would be nice if the structure were something like the outline below. It is very rough right now (and I'm not familiar enough with the complete installation to fill in all the details), but I think it illustrates an overall structure that would be nice - IMHO.

  • About
    • Quarterly Roadmap
    • Long Term Roadmap
  • Supported Environments
    • Rust
      • Installation
      • Key Interfaces
        • ExecutionContext
        • DataFrame
        • ObjectStore
        • ListingTable
        • ...
      • Internals # I figure since Rust is the foundation for everything else, it's worth explaining some of the internals???
        • Logical Plan
        • Physical Plan
        • Expressions
        • ...
      • Extensions
      • API Reference
      • ...
    • Python
      • Installation
      • API Reference
      • ...
    • Java???
    • CLI
      • Installation
      • ...
    • Ballista
      • Installation
      • ...
  • Cookbook
    • Rust
    • Python
    • CLI
    • Ballista
  • Development
    • Rust
    • Python
    • CLI
    • Ballista
  • Specification
    • DataFusion's invariants
    • DataFusion's output field name semantics
    • SQL Reference
  • Community
    • Communication
    • Issue tracker
    • Code of conduct
    • DataFusion-contrib
    • FAQ

I personally really like having Cookbooks to learn the idiomatic / performant ways to perform various tasks (see https://arrow.apache.org/cookbook/py/ as an example). That being said, I recognize it would be extra work and likely isn't needed in the short run until the core structure is in place. But in the medium to longer term I think there is real value in it, in particular for new users.

The same can be said for the Internals section in Rust. That being said, I believe @alamb and @andygrove have already produced some presentations / talks that could potentially be leveraged to help fill in those details (on top of whatever is in docs.rs).

Describe alternatives you've considered

Additional context

I have a few DataFusion features I plan to work on in the short to medium term, but after those are completed I would be happy to drive the implementation of this once there is agreement.

@matthewmturner matthewmturner added the enhancement New feature or request label Feb 13, 2022
@matthewmturner
Contributor Author

This is still rough, but I've been reviewing the site and wanted to get some of my ideas down in writing. I don't think any of this would make the 7.0 release, but let me know if anyone thinks otherwise.

FYI - and interested in your thoughts @alamb @houqp @andygrove @jimexist @Dandandan @xudong963

@matthewmturner matthewmturner changed the title Proposal: Restrucutre DataFusion site Proposal: Restructure DataFusion site Feb 14, 2022
@alamb
Contributor

alamb commented Feb 14, 2022

In general, I like this idea / organization a lot. Thank you for writing it down @matthewmturner

My biggest piece of feedback / request for any reorganization is to ensure we keep the following content clearly delineated:

  1. Content for those who want to use DataFusion in their project (examples, cookbook, etc)
  2. Content for those who want to develop DataFusion

In particular, I think it is very important not to mix the developer documentation with the user documentation.

That being said, I also think the SQL reference / feature list (being targeted towards users) might be good to feature in a more prominent section (alongside Rust).

> I personally really like having Cookbooks to learn the idiomatic / performant ways to perform various tasks (see https://arrow.apache.org/cookbook/py/ as an example). That being said, I recognize it would be extra work and likely isn't needed in the short run until the core structure is in place. But in the medium to longer term I think there is real value in it, in particular for new users.

I agree -- one thing I think we could do is make the existing "examples" more discoverable (right now you basically have to check out the code and know where to look): https://github.com/apache/arrow-datafusion/tree/master/datafusion-examples/examples. There are a lot of good examples there.

> The same can be said for the Internals section in Rust. That being said, I believe @alamb and @andygrove have already produced some presentations / talks that could potentially be leveraged to help fill in those details (on top of whatever is in docs.rs).

👍

@xudong963
Member

Thanks @matthewmturner, makes sense to me!

@alamb
Contributor

alamb commented Dec 20, 2024

I think the current site, while not quite what @matthewmturner proposed, is much better than when this ticket was written:

https://datafusion.apache.org/

Thus, let's close this one to clean up the backlog.

@alamb alamb closed this as completed Dec 20, 2024