This project uses a Random Forest model, trained on features extracted from source code, to predict whether JavaScript code is likely to contain vulnerabilities. It is intended as an introductory example of applying machine learning to code security.
- JSVulnerabilityDataSet-1.0.csv: Contains features extracted from JavaScript code and a binary label indicating the presence or absence of vulnerabilities.
- Data Source: The dataset is available at http://www.inf.u-szeged.hu/~ferenc/papers/JSVulnerabilityDataSet/
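Before building a model on a binary label, it can help to check how the two classes are balanced, since an uneven split is the usual reason for stratified sampling. The following is a minimal sketch using only the Python standard library. The column names (`feature_a`, `feature_b`, `label`) and the toy rows are made up for illustration and are not taken from the real dataset; `label_distribution` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def label_distribution(csv_file, label_column):
    """Count how many rows fall into each class of the given label column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Toy stand-in for the real CSV; these columns and values are invented.
sample = io.StringIO(
    "feature_a,feature_b,label\n"
    "12,0.5,0\n"
    "40,1.2,1\n"
    "7,0.1,0\n"
    "19,0.7,0\n"
)

counts = label_distribution(sample, "label")
print(counts)
```

To run this against the real file, open `JSVulnerabilityDataSet-1.0.csv` with `open(path, newline="")` and pass the name of its actual label column, which should be checked in the file header rather than assumed.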
The KNIME workflow performs the following steps:
- Data Input: Reads the dataset.
- Preprocessing: Handles missing values and selects relevant features.
- Rule Engine: Creates a nominal column for stratified sampling.
- Partitioning: Splits the data into training and test sets using stratified sampling.
- Random Forest Learner: Trains a Random Forest model on the training data.
- Random Forest Predictor: Applies the model to the test data for predictions.
- Scorer: Evaluates the model's performance using metrics like accuracy, precision, recall, and F1-score.
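To make the Partitioning and Scorer steps more concrete, here is a minimal sketch in plain Python of what stratified splitting and the listed metrics compute. It is not KNIME's implementation. The function names, the `train_fraction` default, and the example labels are assumptions chosen for illustration, and the model training step is left out.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split row indices into train and test sets, sampling each class
    separately so both sets keep roughly the original class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def binary_scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels: 8 non-vulnerable rows (0) and 2 vulnerable rows (1).
labels = [0] * 8 + [1] * 2
train_idx, test_idx = stratified_split(labels, train_fraction=0.5)

# Illustrative predictions, standing in for a model's output on a test set.
scores = binary_scores([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(len(train_idx), len(test_idx), scores)
```

With an even split of the example labels, each partition receives exactly one of the two positive rows, which a purely random split would not guarantee. Reporting precision, recall, and F1 alongside accuracy matters for the same reason: on imbalanced labels, accuracy alone can look good even when few vulnerable samples are found.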
- Download from KNIME Hub: The workflow can be downloaded directly from the KNIME Community Hub.
- View on KNIME Hub: The workflow can also be viewed online at https://hub.knime.com/-/spaces/-/~wWHTgRJIGutbxkD-/current-state/
- Local Execution:
  - Install KNIME: Download and install KNIME Analytics Platform from the official website at https://www.knime.com/downloads.
  - Import Workflow: Import the downloaded workflow file into your KNIME workspace.
  - Configure Data Input: Connect the "Data Input" node to your "JSVulnerabilityDataSet-1.0.csv" file.
  - Execute Workflow: Run the workflow to train the model and evaluate its performance.
- KNIME Analytics Platform
- KNIME Extension for Random Forest
This project is licensed under the MIT License.