The summarize function depends on the text having at least 10 sentences, as measured by clean_text_by_sentences. If the text is shorter than that, summarization fails in an undocumented way. Moreover, clean_text_by_sentences does not properly handle text with newlines in the middle of a sentence. I suggest adding a preprocessing step to purge those.
I'm currently using this to work around the bug:
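import re
# Replace newlines, carriage returns, and tabs with spaces
text = re.sub(r'\n|\r|\t', ' ', text)
# Collapse any remaining runs of whitespace into a single space
text = re.sub(r'\s+', ' ', text)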
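For anyone hitting the same problem, here is a minimal sketch of how the workaround could be combined with a guard against short texts. The helper names (normalize_whitespace, safe_summarize) and the MIN_SENTENCES constant are my own and not part of the library. The summarizer and the sentence splitter are passed in as arguments, so nothing here assumes their exact signatures beyond "takes a string". The threshold of 10 is simply the number observed above.

```python
import re

# Threshold observed in this issue; an assumption, not a documented constant.
MIN_SENTENCES = 10


def normalize_whitespace(text):
    """Collapse newlines, tabs, and repeated spaces into single spaces."""
    # \s+ already matches \n, \r, and \t, so one substitution is enough.
    return re.sub(r'\s+', ' ', text).strip()


def safe_summarize(text, summarize_fn, split_sentences_fn,
                   min_sentences=MIN_SENTENCES):
    """Hypothetical wrapper: clean the text, then summarize only if it is
    long enough.

    summarize_fn and split_sentences_fn are supplied by the caller, e.g.
    the library's summarize and clean_text_by_sentences.
    Returns None when the cleaned text has too few sentences.
    """
    cleaned = normalize_whitespace(text)
    if len(split_sentences_fn(cleaned)) < min_sentences:
        return None
    return summarize_fn(cleaned)
```

Returning None for short input is only one option. Raising a descriptive exception would work just as well, as long as the behaviour is documented.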
Versions
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]