Merge pull request #35 from iomega/refactor_environment
Refactor to work with matchms 0.6.0
florian-huber authored Sep 16, 2020
2 parents 7f2cc8c + bbac633 commit f2a7253
Showing 15 changed files with 141 additions and 210 deletions.
24 changes: 18 additions & 6 deletions .github/workflows/conda_build.yml
@@ -1,6 +1,9 @@
name: Anaconda Build

on: [push, pull_request]
on:
push:
pull_request:
types: [opened, reopened]

jobs:

@@ -11,7 +14,10 @@ jobs:
fail-fast: false
matrix:
os: ['ubuntu-latest', 'macos-latest', 'windows-latest']
python-version: ["3.7"]
python-version: ["3.8"]
include:
- os: 'ubuntu-latest'
python-version: "3.7"
steps:
- uses: actions/checkout@v2
with:
@@ -73,7 +79,7 @@ jobs:
if: matrix.os == 'ubuntu-latest'
run: sed -i "s+$PWD/++g" coverage.xml
- name: SonarCloud Scan
if: matrix.os == 'ubuntu-latest'
if: ${{ matrix.os == 'ubuntu-latest' && matrix.python-version == '3.8' }}
uses: sonarsource/sonarcloud-github-action@master
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -123,7 +129,7 @@ jobs:
[ "$RUNNING_OS" = "Windows" ] && export BUILDDIR=$RUNNER_TEMP\\spec2vec\\_build\\
conda config --set anaconda_upload no
conda build --numpy 1.18.1 --no-include-recipe \
--channel bioconda --channel conda-forge \
--channel nlesc --channel bioconda --channel conda-forge \
--croot ${BUILDDIR} \
./conda
- name: Upload package artifact from build
@@ -139,7 +145,7 @@ jobs:
fail-fast: false
matrix:
os: ['ubuntu-latest', 'macos-latest', 'windows-latest']
python-version: ['3.7']
python-version: ['3.7', '3.8']
runs-on: ${{ matrix.os }}
needs: build
steps:
@@ -194,7 +200,13 @@ jobs:
conda install \
--channel bioconda \
--channel conda-forge \
--channel $BUILDDIR \
--channel nlesc \
matchms
# Install spec2vec without the nlesc channel so the package is not installed from nlesc instead of the wanted $BUILDDIR channel
conda install \
--channel bioconda \
--channel conda-forge \
--channel $BUILDDIR -v \
spec2vec
- name: List conda packages
shell: bash -l {0}
4 changes: 3 additions & 1 deletion .github/workflows/conda_publish.yml
@@ -36,7 +36,9 @@ jobs:
export BUILD_FOLDER=/tmp/spec2vec/_build
mkdir -p $BUILD_FOLDER
conda build --numpy 1.18.1 --no-include-recipe \
--channel bioconda --channel conda-forge \
--channel nlesc \
--channel bioconda \
--channel conda-forge \
--croot $BUILD_FOLDER \
./conda
- name: Push the package to anaconda cloud
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Support for Python 3.8 [#35](https://github.com/iomega/spec2vec/pull/35)

### Changed

- Refactored Spec2Vec class to provide .pair() and .matrix() methods [#35](https://github.com/iomega/spec2vec/pull/35)

### Removed

- Spec2VecParallel (is now included as Spec2Vec.matrix()) [#35](https://github.com/iomega/spec2vec/pull/35)
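
The API shape described by these entries can be sketched roughly as follows. This is a minimal illustration only: `ToySimilarity`, its Jaccard scoring, and the `peak@...` tokens are hypothetical stand-ins, not the spec2vec or matchms implementation. It only shows how a single class can expose `.pair()` for one comparison and `.matrix()` for all-vs-all comparisons.

```python
from typing import List


class ToySimilarity:
    """Hypothetical stand-in for a similarity class; NOT the spec2vec implementation."""

    def pair(self, reference: List[str], query: List[str]) -> float:
        """Score one reference against one query (Jaccard overlap, chosen only for illustration)."""
        reference_words, query_words = set(reference), set(query)
        union = reference_words | query_words
        if not union:
            return 0.0
        return len(reference_words & query_words) / len(union)

    def matrix(self, references: List[List[str]], queries: List[List[str]]) -> List[List[float]]:
        """Score every reference against every query; rows follow references, columns follow queries."""
        return [[self.pair(reference, query) for query in queries] for reference in references]


# Made-up documents, each a list of word tokens.
references = [["peak@100.0", "peak@150.0"], ["peak@200.0"]]
queries = [["peak@100.0"], ["peak@200.0"], ["peak@150.0", "peak@100.0"]]

similarity = ToySimilarity()
single_score = similarity.pair(references[0], queries[0])  # previously a direct call: similarity(reference, query)
all_scores = similarity.matrix(references, queries)        # previously the job of a separate parallel class
```

In this sketch `matrix()` simply loops over `pair()`; a real implementation is free to override it with a faster vectorized computation, as long as the entry at row `i`, column `j` stays consistent with `pair(references[i], queries[j])`.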

## [0.2.0] - 2020-06-18

### Added
4 changes: 2 additions & 2 deletions README.rst
@@ -99,14 +99,14 @@ Installation

Prerequisites:

- Python 3.7
- Python 3.7 or 3.8
- Anaconda

Install spec2vec from Anaconda Cloud with

.. code-block:: console
conda env create --name spec2vec python=3.7
conda env create --name spec2vec python=3.8
conda activate spec2vec
conda install --channel nlesc --channel bioconda --channel conda-forge spec2vec
2 changes: 1 addition & 1 deletion conda/environment-build.yml
@@ -5,4 +5,4 @@ dependencies:
- anaconda-client
- conda-build
- conda-verify
- python >=3.7,<3.8
- python >=3.7,<3.9
4 changes: 2 additions & 2 deletions conda/environment-dev.yml
@@ -6,10 +6,10 @@ channels:
- nlesc
dependencies:
- gensim >=3.8.0
- matchms>=0.5.0
- matchms>=0.6.0
- numpy
- pip
- python >=3.7,<3.8
- python >=3.7,<3.9
- scipy
- pip:
- bump2version
3 changes: 2 additions & 1 deletion conda/environment.yml
@@ -5,6 +5,7 @@ channels:
- defaults
dependencies:
- gensim >=3.8.0
- matchms >=0.6.0
- numpy
- python >=3.7,<3.8
- python >=3.7,<3.9
- scipy
7 changes: 5 additions & 2 deletions conda/meta.yaml
@@ -10,6 +10,7 @@ source:

extra:
channels:
- nlesc
- conda-forge
- bioconda

@@ -26,18 +27,20 @@ requirements:
- conda-verify
- pytest-runner
- python
- matchms >=0.6.0
- numpy {{ numpy }}
- setuptools
host:
- python >=3.7,<3.8
- python >=3.7,<3.9
- pip
- pytest-runner
- setuptools
run:
- gensim >=3.8.0
- matchms >=0.6.0
- numpy
- pip
- python >=3.7,<3.8
- python >=3.7,<3.9
- scipy

test:
86 changes: 0 additions & 86 deletions integration-tests/test_user_workflow_spec2vec_parallel.py

This file was deleted.

4 changes: 2 additions & 2 deletions readthedocs/index.rst
@@ -19,15 +19,15 @@ Installation

Prerequisites:

- Python 3.7
- Python 3.7 or 3.8
- Anaconda

Install spec2vec from Anaconda Cloud with

.. code-block:: console
# install spec2vec in a new virtual environment to avoid dependency clashes
conda create --name spec2vec python=3.7
conda create --name spec2vec python=3.8
conda activate spec2vec
conda install --channel nlesc --channel bioconda --channel conda-forge spec2vec
3 changes: 2 additions & 1 deletion setup.py
@@ -39,7 +39,8 @@
"License :: OSI Approved :: Apache Software License",
"Natural Language :: English",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7"
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8"
],
test_suite="tests",
install_requires=[
42 changes: 40 additions & 2 deletions spec2vec/Spec2Vec.py
@@ -1,11 +1,14 @@
from typing import List
from typing import Union
import numpy
import scipy
from gensim.models.basemodel import BaseTopicModel
from matchms.similarity.BaseSimilarity import BaseSimilarity
from spec2vec.SpectrumDocument import SpectrumDocument
from .calc_vector import calc_vector


class Spec2Vec:
class Spec2Vec(BaseSimilarity):
"""Calculate spec2vec similarity scores between a reference and a query.
Using a trained model, spectrum documents will be converted into spectrum
@@ -64,7 +67,7 @@ def __init__(self, model: BaseTopicModel, intensity_weighting_power: Union[float
self.allowed_missing_percentage = allowed_missing_percentage
self.vector_size = model.wv.vector_size

def __call__(self, reference: SpectrumDocument, query: SpectrumDocument) -> float:
def pair(self, reference: SpectrumDocument, query: SpectrumDocument) -> float:
"""Calculate the spec2vec similarity between a reference and a query.
Parameters
@@ -86,3 +89,38 @@ def __call__(self, reference: SpectrumDocument, query: SpectrumDocument) -> floa
cdist = scipy.spatial.distance.cosine(reference_vector, query_vector)

return 1 - cdist

def matrix(self, references: List[SpectrumDocument], queries: List[SpectrumDocument],
is_symmetric: bool = False) -> numpy.ndarray:
"""Calculate the spec2vec similarities between all references and queries.
Parameters
----------
references:
Reference spectrum documents.
queries:
Query spectrum documents.
Returns
-------
spec2vec_similarity
Array of spec2vec similarity scores.
"""
n_rows = len(references)
reference_vectors = numpy.empty((n_rows, self.vector_size), dtype="float")
for index_reference, reference in enumerate(references):
reference_vectors[index_reference, 0:self.vector_size] = calc_vector(self.model,
reference,
self.intensity_weighting_power,
self.allowed_missing_percentage)
n_cols = len(queries)
query_vectors = numpy.empty((n_cols, self.vector_size), dtype="float")
for index_query, query in enumerate(queries):
query_vectors[index_query, 0:self.vector_size] = calc_vector(self.model,
query,
self.intensity_weighting_power,
self.allowed_missing_percentage)

spec2vec_similarity = 1 - scipy.spatial.distance.cdist(reference_vectors, query_vectors, "cosine")

return spec2vec_similarity
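
The new `matrix()` method computes `1 - cosine distance` for every reference/query vector pair in one `cdist` call, which should give the same values as calling `pair()` entry by entry. The sketch below illustrates that equivalence with plain Python lists instead of the numpy/scipy calls used in the diff. The function names and the example vectors are hypothetical, and zero-length vectors are not handled.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors, i.e. 1 - cosine distance (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_matrix(references: Sequence[Sequence[float]],
                             queries: Sequence[Sequence[float]]) -> List[List[float]]:
    """All-vs-all cosine similarities; rows follow references, columns follow queries."""
    # Each norm is computed once per vector rather than once per pair.
    reference_norms = [math.sqrt(sum(a * a for a in r)) for r in references]
    query_norms = [math.sqrt(sum(b * b for b in q)) for q in queries]
    return [
        [sum(a * b for a, b in zip(r, q)) / (r_norm * q_norm)
         for q, q_norm in zip(queries, query_norms)]
        for r, r_norm in zip(references, reference_norms)
    ]


# Made-up 2-dimensional "spectrum vectors" standing in for calc_vector() output.
reference_vectors = [[1.0, 0.0], [1.0, 1.0]]
query_vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]

scores = cosine_similarity_matrix(reference_vectors, query_vectors)
```

The result has one row per reference and one column per query, matching the `(n_rows, n_cols)` layout produced by the `cdist`-based code above.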