Git Clone the Repository:
- Navigate to any directory where you would like to keep the project and execute the following command to clone the repository:
git clone https://github.com/rootcodelabs/rootcode-anonymizer.git
Set Up the Base Model:
- Before proceeding with the installation, download NER_basemodel.zip from this link, place it in the root directory of the cloned repository, and unzip it there. The zip file contains the model for Named Entity Recognition (NER). Without this base model, the anonymizer will not work.
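The unzip step can also be scripted. The following is a minimal sketch using only the Python standard library. It assumes the archive has already been downloaded and is named NER_basemodel.zip in the repository root; the function name extract_base_model is hypothetical and not part of the project.

```python
import zipfile
from pathlib import Path


def extract_base_model(repo_root, archive_name="NER_basemodel.zip"):
    """Unzip the base-model archive into the repository root.

    Returns the names of the archive members that were extracted.
    Assumption: the archive sits directly in `repo_root`.
    """
    archive = Path(repo_root) / archive_name
    if not archive.is_file():
        raise FileNotFoundError(
            f"{archive} not found; download it and place it in the repository root first."
        )
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(Path(repo_root))
        return zf.namelist()
```

From the root of the cloned repository, calling extract_base_model(".") extracts the archive in place.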
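Installation Methods:
- There are two ways to install and run the application:
- Using Python: follow the Create Python Environment, Install Requirements, and Launch the Application steps.
- Using Docker: follow the Install Docker, Build Docker Container, Start the Container, and Stopping the Container steps.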
Create Python Environment:
- First, create an environment with Python version 3.9.19 and activate it.
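To confirm that the activated environment uses the expected interpreter, a quick check like the one below can help. This is only a sketch: the helper python_version_ok is hypothetical, and it compares only the major and minor version (3.9), on the assumption that any 3.9.x patch release behaves like 3.9.19.

```python
import sys

# Assumption: only major.minor is compared; the README asks for 3.9.19.
REQUIRED = (3, 9)


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches `required`."""
    return tuple(version_info[:2]) == tuple(required)
```

Running python_version_ok() inside the environment returns True when the interpreter is a Python 3.9 release.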
Install Requirements:
- Install the required packages specified in the requirements.txt file using the following command:
pip install --no-cache-dir -r requirements.txt
Launch the Application:
- Launch the application using the following command:
streamlit run anonymizer_application.py
- In most cases, this opens a browser window with the application. Alternatively, you can visit http://localhost:8501 to access it.
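If the browser window does not open, it can be useful to check from a script whether the application is answering yet. The sketch below polls the URL with the standard library. The function wait_for_app is hypothetical, and it assumes the application answers a plain HTTP GET on http://localhost:8501 with status 200 once it is up.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://localhost:8501", timeout=60.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    Returns True if the application responded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet (connection refused, timeout, HTTP error)
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A True result only means the server is reachable; the page itself may still show a loading screen for a moment.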
Install Docker:
- Ensure Docker is installed on your computer. If it is not, install Docker first.
Build Docker Container:
- Navigate to the cloned repository and execute the following command to build the Docker image:
docker build -t anonymizer-app .
- This process may take a few minutes.
Start the Container:
- Once the build process is completed, start the container using the following command:
docker run -p 8501:8501 anonymizer-app
- You can then visit http://localhost:8501 to access the application.
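Stopping the Container:
- When you need to stop the container, look up its ID or name with docker ps and pass it to docker stop:
docker ps
docker stop <container-id>
- Note that anonymizer-app is the image name given in the build step (docker build -t anonymizer-app .). The command docker stop anonymizer-app works only if the container was started with --name anonymizer-app.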
This pipeline provides an effective solution for anonymizing data. Enjoy anonymizing your data seamlessly!
Please click here to view the Anonymizer demo video.
When you open http://localhost:8501, you might first see a loading screen.
After loading, you will be presented with the main screen, where you'll find three options in the sidebar:
- Anonymizer
- Regex
- Immutable Words
The Anonymizer tab allows you to anonymize text. Click "Browse files" to select a .csv file, or drag and drop a .csv file onto the interface.
Once the file is provided, Anonymizer starts processing it.
Processing can take a long time, depending on the .csv file provided and your system's resources. After processing, you can download the anonymized file.
Anonymizer can detect given regex patterns and replace them with a specified replacement string. On the Regex tab, you can view, update, and delete existing regex patterns and their replacements, as well as add new ones.
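As an illustration of what a pattern and replacement pair does, here is a minimal sketch using Python's re module. The patterns, the replacement strings, and the function apply_patterns are hypothetical examples, not the ones shipped with Anonymizer, and the application's own implementation may differ.

```python
import re

# Hypothetical pattern -> replacement pairs, for illustration only.
PATTERNS = {
    r"\b\d{3}-\d{3}-\d{4}\b": "[PHONE]",
    r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+": "[EMAIL]",
}


def apply_patterns(text, patterns=PATTERNS):
    """Replace every match of each pattern with its replacement string."""
    for pattern, replacement in patterns.items():
        text = re.sub(pattern, replacement, text)
    return text


print(apply_patterns("Call 555-123-4567 or mail jane.doe@example.com."))
# prints: Call [PHONE] or mail [EMAIL].
```

Patterns are applied one after another, so an earlier replacement can affect what a later pattern sees.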
If there are certain words that Anonymizer should not anonymize, you can manage them on the Immutable Words tab. There you can view, update, and delete existing words, as well as add new ones.
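The effect of an immutable word can be sketched as a filter applied before replacement. The example below is a simplified illustration, not Anonymizer's actual logic: the function mask_entities, the placeholder string, and the case-insensitive comparison are all assumptions, and the detected set stands in for whatever the NER model flags.

```python
def mask_entities(tokens, detected, immutable_words, placeholder="[REDACTED]"):
    """Replace detected tokens with a placeholder, except immutable words.

    `detected` stands in for the tokens an NER model flagged as sensitive.
    Assumption: immutable words are matched case-insensitively.
    """
    keep = {word.lower() for word in immutable_words}
    return [
        placeholder if token in detected and token.lower() not in keep else token
        for token in tokens
    ]


tokens = ["Alice", "works", "at", "Rootcode", "in", "Oslo"]
print(mask_entities(tokens, {"Alice", "Rootcode", "Oslo"}, ["rootcode"]))
# prints: ['[REDACTED]', 'works', 'at', 'Rootcode', 'in', '[REDACTED]']
```

Here "Rootcode" is flagged as an entity but survives because it is listed as an immutable word.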
Important Points:
- While anonymization is running on the Anonymizer tab, do not switch to the Regex or Immutable Words tab, as doing so discards the anonymization results.