This repository implements the ngram based algorithms described in "Real-World Trajectory Sharing with Local Differential Privacy"

0hex7/ngram-implementation-old

n-Gram Algorithm Implementation

Overview

This repository focuses on the implementation of the n-gram algorithm, an essential technique in natural language processing (NLP) and computational linguistics. An n-gram model predicts the next word from the previous n − 1 words in a sequence of text, and is used in text-related applications such as language modeling, speech recognition, and spelling correction.
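As a minimal sketch of the idea described above (not code taken from this repository), the Python snippet below extracts n-grams from a token sequence and estimates the probability of the next word from raw counts. The function names `extract_ngrams` and `next_word_probability`, and the toy sentence, are assumptions made purely for illustration.

```python
from collections import Counter


def extract_ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens` as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # An empty list is returned when the sequence is shorter than n.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def next_word_probability(tokens, context, word):
    """Maximum-likelihood estimate of P(word | context) from raw counts."""
    context = tuple(context)
    n = len(context) + 1
    ngrams = extract_ngrams(tokens, n)
    ngram_counts = Counter(ngrams)
    # Count a context only where a following token exists, so the
    # estimates over all possible next words sum to 1.
    context_counts = Counter(gram[:-1] for gram in ngrams)
    if context_counts[context] == 0:
        return 0.0
    return ngram_counts[context + (word,)] / context_counts[context]


# Hypothetical toy input, for illustration only.
tokens = "the cat sat on the mat and the cat slept".split()
bigrams = extract_ngrams(tokens, 2)
p = next_word_probability(tokens, ["the"], "cat")
```

Here "the" is followed by "cat" in two of its three occurrences, so `p` is 2/3. A practical model would normally add smoothing for unseen n-grams, which this sketch omits.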

Source Paper: Real-World Trajectory Sharing with Local Differential Privacy

The primary reference paper for the n-gram algorithm implementation is titled "Real-World Trajectory Sharing with Local Differential Privacy" authored by Teddy Cunningham, Graham Cormode, Hakan Ferhatosmanoglu, and Divesh Srivastava. Published in the Proceedings of the VLDB Endowment (PVLDB) in 2021 (Volume 14, Issue 11), the paper introduces a local differentially private mechanism based on perturbing hierarchically-structured, overlapping n-grams of trajectory data. The DOI for the paper is 10.14778/3476249.3476280.
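To make the phrase "overlapping n-grams of trajectory data" concrete, the sketch below splits a trajectory into overlapping n-grams with a sliding window and counts how many n-grams cover each point. It is an illustration under stated assumptions only: the place labels and the function names `overlapping_ngrams` and `coverage` are hypothetical, and the code performs no perturbation and does not reproduce the paper's mechanism or this repository's implementation.

```python
def overlapping_ngrams(trajectory, n):
    """Split a trajectory into overlapping n-grams (sliding window, step 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(trajectory[i:i + n]) for i in range(len(trajectory) - n + 1)]


def coverage(trajectory, n):
    """For each position in the trajectory, count the n-grams that contain it."""
    counts = [0] * len(trajectory)
    for start in range(len(trajectory) - n + 1):
        for offset in range(n):
            counts[start + offset] += 1
    return counts


# Hypothetical trajectory of visited places, for illustration only.
trajectory = ["home", "cafe", "office", "gym", "home"]
grams = overlapping_ngrams(trajectory, 2)
cover = coverage(trajectory, 2)
```

With n = 2, every interior point appears in two n-grams and each endpoint in one, which is what "overlapping" means here: most points are represented by more than one n-gram.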

Introduction: Purpose of the Repository

This repository serves as a comprehensive guide for implementing the n-gram algorithm, a fundamental approach in text analysis and processing. The n-gram model, known for its simplicity and effectiveness in capturing sequential patterns in text, is implemented here to aid in understanding and applying this technique in various NLP-related tasks.

File Structure: Organized Repository Architecture

The repository structure is organized as follows:

  • code: Contains the primary implementation of the n-gram algorithm.
    • util: Stores utility functions essential for the algorithm implementation.
  • data: Manages datasets for experimentation and evaluation.
    • preprocessed: Stores preprocessed text data in preparation for n-gram analysis.
    • results: Holds the output files generated during the algorithm execution.
    • notebooks: Contains Jupyter notebooks demonstrating the n-gram algorithm in action.

Requirements: Necessary Dependencies for Execution

Ensure all prerequisites listed in the 'requirements.txt' file are installed before running the implementation.

Execution: Running the n-Gram Algorithm

Execute the provided scripts or Jupyter notebooks to run the n-gram algorithm on the given datasets. Follow the instructions provided in the respective files for detailed guidance.

Acknowledgments: Recognition and Support

We extend our gratitude to the authors of the referenced paper. The code implementation in this repository draws inspiration from their groundbreaking work.

Contributors: Collaborative Efforts Behind the Implementation

Feel free to modify or expand any section to suit your specific requirements.
