The official documentation with usage is at https://graspologic.readthedocs.io/en/latest/
Please visit the tutorial section on the official website for more in-depth usage.
The graspologic package requires only a standard computer with enough RAM to support the in-memory operations.
graspologic is tested on the following OSes:
- Linux x64
- macOS x64
- Windows 10 x64
And across the following versions of Python:
- 3.6 (x64)
- 3.7 (x64)
- 3.8 (x64)
pip install graspologic
git clone https://github.com/microsoft/graspologic
cd graspologic
python3 -m venv venv
source venv/bin/activate
python3 setup.py install
The algorithms are located in microsoft/graspologic/graspologic/cluster/.
To run the R scripts, you will need to install R and the mclust library (we use version 5.4.2 in the paper). We recommend the RStudio IDE (https://www.rstudio.com/). Users may need to "Set Working Directory" to "Source File Location" for the scripts to find relative paths correctly.
These files reproduce Table 2, Figures 1-3, and Figure 5. They run the clustering algorithms on the complete datasets. Instructions within.
These files reproduce Figure 4. They run the clustering algorithms on the subsets of the data. Instructions within.
These files reproduce Figure 6. Instructions within.
These files reproduce Figure 7. They run and compare various clustering algorithms on the double-cigar dataset. Instructions within.
These files reproduce Figures 8-9. They run the hierarchical clustering algorithm on simulated and real datasets. Instructions within.
brute_cluser_graspyclust.py - implementation of graspyclust
make_gmix.py - script that was used to make data/synthetic.csv
contains the datasets that were used in the paper
We welcome contributions from anyone. Please see our contribution guidelines before making a pull request. Our issues page is full of places we could use help! If you have an idea for an improvement not listed there, please make an issue first so you can discuss with the developers.
This project is covered under the MIT License.
We appreciate detailed bug reports and feature requests (though we appreciate pull requests even more!). Please visit our issues page if you have questions or ideas.
If you find AutoGMM useful in your work, please cite the algorithm via the AutoGMM paper:
Athey, T. L., Liu, T., Pedigo, B. D., & Vogelstein, J. T. (2021). AutoGMM: Automatic and Hierarchical Gaussian Mixture Modeling in Python. arXiv preprint arXiv:1909.02688.