Pretty sure this is user error. My data frame has a column, contentraw, that holds a large block of text (pulled from a SQL database). When I try to get back the top-ranked sentence, I get a mangled mess instead. The desired output is the single top sentence in the document.
What am I doing wrong?
Code:
df <- data.table(dbxSelect(dbxcon, selectarticles))

cleancopy <- function(x, urls = TRUE, hashtags = TRUE)
{
  ## remove obvious crap
  if (urls) {
    x = gsub("\\s?(f|ht)(tp)(s?)(://)([^\\.]*)[\\.|/](\\S*)", "", x)
  }
  if (hashtags) {
    x = gsub("#\\S+", "", x)
  }
  ## split sentences to new lines
  x = gsub("\\. ", "\\. \n", x)
  # return
  x
}

## clean up the column
df$contentraw <- cleancopy(df$contentraw)

## run rank and assign to key
df$keysent <- df[, lexRankr::lexRank(
  contentraw,
  docId = url,
  n = 1,
  continuous = TRUE,
  returnTies = FALSE
),
by = url]
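
One possible explanation, offered as an assumption rather than a confirmed diagnosis: lexRankr::lexRank() returns a data frame (with columns such as docId, sentenceId, sentence and value), not a character vector, so assigning the whole grouped result to df$keysent would not give one sentence per row. A minimal sketch that keeps only the sentence column and assigns it by reference with := might look like the following. The sample df is a hypothetical stand-in for the SQL result, and the sketch assumes one row per url.

```r
library(data.table)
library(lexRankr)

# Hypothetical stand-in for the SQL result: one row per article
df <- data.table(
  url = c("a", "b"),
  contentraw = c(
    "The cat sat on the mat. The cat chased a mouse across the mat. The mouse hid under the mat from the cat.",
    "Rain fell on the city all day. The city streets flooded after the rain. The rain kept people in the city at home."
  )
)

# lexRank() returns a data frame, so keep only its `sentence` column
# and take the first (top-ranked) entry for each url group.
df[, keysent := lexRankr::lexRank(
  contentraw,
  docId = rep(1L, .N),   # treat each group's text as a single document
  n = 1,
  continuous = TRUE,
  returnTies = FALSE
)$sentence[1],
by = url]

# keysent is now a plain character column with one sentence per row
print(df[, .(url, keysent)])
```

Using := inside df[...] avoids building a separate grouped table and then trying to fit it into a single column, which is where the mangled output may be coming from.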