Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

querys #13

Open
Flipe-TI opened this issue Dec 10, 2024 · 3 comments
Open

querys #13

Flipe-TI opened this issue Dec 10, 2024 · 3 comments

Comments

@Flipe-TI
Copy link

Hi @Hugoberry,
First of all, many thanks for this library! This is very useful for me.
I'm wondering if you have plans to add functionality for specifically querys?

@Hugoberry
Copy link
Owner

Hey, @Flipe-TI, what sort of query are you looking for? There is no way of running DAX against a PBIX file if that's your ask.

@Flipe-TI
Copy link
Author

Flipe-TI commented Dec 12, 2024

Hi @Hugoberry,
What I meant was queries that generate the M code. For example, I have an M code that retrieves data from my DataWarehouse using a query like this example:
code power query:

let
    Source = Sql.Database("ExampleServer", "ExampleDatabase", [Query="SELECT#(lf)      EmployeeID#(lf)      ,FirstName#(lf)      ,LastName#(lf)      ,JobTitle#(lf)      ,HireDate#(lf)      ,Department#(lf)      ,Salary#(lf)      ,CASE WHEN Salary > 50000 THEN 'High' #(lf)#(tab)#(tab) ELSE 'Low' END AS SalaryLevel#(lf)      ,CASE WHEN JobTitle = 'Manager' THEN 'High' #(lf)#(tab)#(tab) ELSE 'Medium' END AS HierarchyLevel#(lf)FROM #(lf)#(tab)[Employees]#(lf)WHERE#(lf)#(tab)[Department] = 'Sales'#(lf)#(tab)AND [Salary] > 40000#(lf)#(tab)AND YEAR(HireDate) >= 2015#(lf)#(tab)AND FirstName LIKE '%John%'#(lf)ORDER BY #(lf)#(tab)Salary DESC"]),
    #"Duplicated Column" = Table.DuplicateColumn(Source, "HireDate", "HireDate - Copy"),
    #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"HireDate - Copy", "OriginalHireDate"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"OriginalHireDate", type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [SalaryLevel] = "High")
in
    #"Filtered Rows"

query:

SELECT
      EmployeeID,
      FirstName,
      LastName,
      JobTitle,
      HireDate,
      Department,
      Salary,
      CASE 
          WHEN Salary > 50000 THEN 'High'
          ELSE 'Low' 
      END AS SalaryLevel,
      CASE 
          WHEN JobTitle = 'Manager' THEN 'High'
          ELSE 'Medium' 
      END AS HierarchyLevel
FROM
      Employees
WHERE
      Department = 'Sales'
      AND Salary > 40000
      AND YEAR(HireDate) >= 2015
      AND FirstName LIKE '%John%'
ORDER BY
      Salary DESC

This query is used at the beginning of the M code.

What I am referring to is a feature that searches for and returns only the queries used in the dataset.

I’m currently using this amazing library to generate documentation for my PBIX files (once again, congratulations on the great work, it’s incredibly useful!). It would be awesome to have a dedicated tab just for the queries or data sources.

@Hugoberry
Copy link
Owner

Interesting idea. I know the library doesn't capture all of the m code via model.power_query. However, when parsing your pbix files, you get the m code, then a bit of regex should get you a long way. I will keep this issue open for revisiting when I touch the power_query code.

import re

def extract_query_from_powerquery(powerquery_text: str) -> str:
    # Find the Query parameter using regex
    query_match = re.search(r'\[Query="(.*?)"\]', powerquery_text, re.DOTALL)
    
    if not query_match:
        return ""
    
    query = query_match.group(1)
    
    # Replace Power Query special characters
    replacements = {
        "#(lf)": "\n",    # linefeed
        "#(tab)": "\t"   # tab
    }
    
    # Perform the replacements
    for old, new in replacements.items():
        query = query.replace(old, new)
    
    return query.strip()

courtesy of LLMs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants