Purr #8
Comments
Restating the problem to check my understanding: Assuming you want to stick to the
|
Thank you, this shows me an approach that I think works like yours. Here is how I have adapted it. I just want to make sure that the results and the ranking are what they should be. This is how I see a tidyverse workflow guided by your example; hopefully the results are valid:
|
It depends on what you are trying to accomplish. Your code will give you the highest lexRanked sentences per doc, but the lexRank itself is calculated with respect to the whole corpus. So the sentences returned are not the most representative sentences of each document, but rather the most representative sentences of the corpus. If that is what you are trying to accomplish, then your solution works. The code I provided will return the highest lexRanked sentences per document. If this is your goal, you will also benefit from a performance boost, since executing lexRank on a full corpus can be computationally expensive. Below is an extension of the gist I posted using
Additionally, in your code you call |
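For readers following the thread without the original gist, the per-document idea can be sketched roughly as below. This is not the gist itself, only a minimal illustration under stated assumptions: the `articles` data frame, its `doc_id` and `text` columns, and the helper names `top_sentences` and `safe_top` are hypothetical, and `sentencesAsDocs = TRUE` is my choice so that each sentence is treated as its own document when a single article is ranked alone.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(lexRankr)

# Hypothetical example data: one article per row.
articles <- tibble::tibble(
  doc_id = c("a", "b"),
  text = c(
    "Cats sleep most of the day. Cats also hunt at night. Dogs bark at cats during the day.",
    "The river floods every spring. Spring floods damage the river banks. Farmers rebuild the banks after the floods."
  )
)

# Rank the sentences of ONE document and keep the top n.
# Assumption: sentencesAsDocs = TRUE, because only a single
# document is passed to lexRank() at a time.
top_sentences <- function(text, n = 1) {
  lexRank(text,
          docId = "create",
          n = n,
          returnTies = FALSE,
          sentencesAsDocs = TRUE,
          Verbose = FALSE)
}

# possibly() returns NULL instead of stopping when lexRank() errors
# on a document, so one failing article does not abort the pipeline.
safe_top <- possibly(top_sentences, otherwise = NULL)

summaries <- articles %>%
  mutate(top = map(text, safe_top)) %>%  # one lexRank() call per row
  select(doc_id, top) %>%
  unnest(top)                            # rows whose result is NULL are dropped
```

Because `lexRank()` is called once per row, each sentence is scored only against the other sentences of its own article, which is the per-document behaviour described above. Documents for which `lexRank()` fails simply do not appear in `summaries`.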
I am trying to get the highest ranked sentences per document. I will go back to your code. I wasn't able to make it work at first and the split() line was confusing me. |
I'll look into finding the best way to add this functionality to the package so the process is simpler. |
I am still trying to adapt the get_top_sentences function as it has not been working for me. Thank you so much for your insights! |
Your full script runs for me if I make the If the fix mentioned doesn't resolve your issue please post your error and we can work through it. |
Ok, I do get an error but I think I am not using your modification correctly.
So, I am assuming the start is after gm_unnest?
From there, is the function get_top_sentences also modified in some way? Then do you run your modification? |
I combined your code and a possible solution using the The script runs into 2 errors during the lexRank process (so we will have 2 documents missing from our results), but the |
This is working nicely! Thank you, this is great! If you do look into simplifying this process, I would love to test it! |
Is there a way to use map() in a pipe with lexrank? Let's say I want to extract a summary sentence from documents collected in a data frame, one article per row.
I guess you would have to unnest_sentences for each row, then create a new table to store the top ranking sentences?