Authors: Niccolò Sacchi, Valentin Nigolian, Antonio Barbera
The data was produced by a physics simulator imitating the results of an experiment conducted at CERN in Geneva. It contains 250'000 samples, each with 30 features. Each sample represents the result of two particles colliding in CERN's Large Hadron Collider, and each feature describes one property of that collision.
This project aims to explore the data and train a model that predicts whether a given event’s signature was the result of a Higgs boson (signal) or some other process/particle (background). No machine learning libraries were used.
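To illustrate what training a signal/background classifier without machine learning libraries can look like, here is a minimal sketch using only the Python standard library. It assumes logistic regression fitted by batch gradient descent, which is one common choice and not necessarily the model used in this project. The function names (`sigmoid`, `train_logistic_regression`, `predict`) and the toy data are hypothetical and are not taken from scripts/.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and a bias by batch gradient descent on the log-loss.

    X: list of feature rows (lists of floats); y: list of labels in {0, 1},
    where 1 stands for signal and 0 for background (assumed encoding).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            # Prediction error for this sample: p(signal | row) - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return 1 (signal) where the predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= 0.5 else 0
        for row in X
    ]


# Toy data standing in for the real features: two Gaussian blobs in 2D.
# This is NOT the CERN dataset, which has 30 features per sample.
rng = random.Random(0)
X = [[rng.gauss(-2, 1), rng.gauss(-2, 1)] for _ in range(100)] + \
    [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

w, b = train_logistic_regression(X, y)
accuracy = sum(p == t for p, t in zip(predict(X, w, b), y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

On the real dataset, pure-Python loops like these would be slow for 250'000 samples, so a vectorized array library would likely be preferable; the sketch only shows the structure of the training loop.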
The model can be trained by downloading the dataset and running the Python script scripts/run.py. Alternatively, run the Jupyter notebook project_solution.ipynb to get the intermediate results and visualizations.
documets/: folder containing the project description (project_description.pdf), the report (report.pdf) and the dataset documentation (higgs_doc.pdf).
scripts/: folder containing the Python scripts used to load, clean, and plot the dataset and to train the model.
project_solution.ipynb: notebook containing the whole process, from the exploratory analysis to the training of the model.