Given a large text corpus, find the longest common substring that is repeated most often.

The sample corpus I was working on in the R file consists of Swiss Supreme Court rulings, which frequently copy exact paragraphs from past decisions. The goal was to understand whether law is getting more complex over time.

The open data can be found here: https://www.bger.ch/ext/eurospider/live/de/php/clir/http/index_atf.php?lang=de
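
To make the task stated at the top concrete, here is a minimal Python sketch. It is not the code from the R file. It assumes one reading of the problem: treat each ruling as a sequence of whitespace-separated words, find the longest word sequence that appears in at least `min_docs` rulings, and among sequences of that length prefer the one found in the most rulings. The function names and the `min_docs` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the set of distinct word n-grams in one document."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def longest_frequent_passage(docs, min_docs=2):
    """Longest word sequence shared by at least `min_docs` documents.

    Returns (passage, document_count), or None if nothing is shared.
    Assumption: words are split on whitespace, with no normalisation.
    """
    tokenized = [d.split() for d in docs]
    best = None
    n = 1
    while True:
        counts = Counter()
        for tokens in tokenized:
            # ngrams() returns a set, so each document counts once per n-gram.
            counts.update(ngrams(tokens, n))
        frequent = {g: c for g, c in counts.items() if c >= min_docs}
        if not frequent:
            # No n-gram of this length is shared, so longer ones cannot be either.
            return best
        gram, count = max(frequent.items(), key=lambda kv: kv[1])
        best = (" ".join(gram), count)
        n += 1


rulings = [
    "the court holds that the appeal is dismissed today",
    "again the appeal is dismissed without costs",
    "the appeal is dismissed",
]
print(longest_frequent_passage(rulings))
```

The loop relies on the fact that if a sequence of n words is shared, every shorter sequence inside it is shared too, so it can stop at the first length with no shared sequence. It rebuilds all n-grams for every length, which is fine for a small sample but slow on a full corpus; a suffix array or suffix automaton would likely scale better.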
