SPADE: hackathon changes #286

Merged Jan 30, 2020 (63 commits). The diff shown below covers changes from 3 of these commits.
Commits
5f17eae
modified multiple testing function
stellalessandra Dec 10, 2019
0f93151
removed fdr and holm bonferroni functions
stellalessandra Dec 10, 2019
7b305d4
added documentation
stellalessandra Dec 11, 2019
0935781
changed type variable tests to list
stellalessandra Dec 11, 2019
47b703a
added todo
stellalessandra Jan 7, 2020
656d596
Merge branch 'master' into feature/spade_multiple_testing
stellalessandra Jan 7, 2020
7fde1a1
added tolerance for non tested pvalues
stellalessandra Jan 7, 2020
2bf8330
Changed to numpy-version of pvalue-spectrum
Jan 13, 2020
66b84a6
From future import division
Jan 13, 2020
260d415
From future import division
Jan 13, 2020
0ac0baf
delete unused benchmarks
Jan 14, 2020
364d926
Merge branch 'feature/spade_multiple_testing' into feature/spade_hack…
stellalessandra Jan 14, 2020
784cbf8
Merge branch 'feature/spade_hackathon' of github.com:INM-6/elephant i…
stellalessandra Jan 14, 2020
be84009
added modifications to statistical testing function by masking pvalue…
stellalessandra Jan 14, 2020
61fa37c
fixed code duplication in _get_pvalue_spec() and _get_max_occ() (#71)
dizcza Jan 14, 2020
d82666a
added statsmodels as requirements and fixed bug
stellalessandra Jan 14, 2020
37e135d
Merge branch 'feature/spade_hackathon' of github.com:INM-6/elephant i…
stellalessandra Jan 14, 2020
0138d68
Some changes in documentations
Jan 14, 2020
865bcfd
changed requirements import
Jan 14, 2020
7b7adaa
stupid string error
Jan 14, 2020
d749f5e
Finished pvalue_spectrum function in numpy version
Jan 14, 2020
677ef00
Indexed in mask.nonzero
Jan 14, 2020
74481af
removed only windows with first spike flag
stellalessandra Jan 14, 2020
99cd8fc
fixed wrong ascii-characters
Jan 14, 2020
9e974d3
Merge branch 'feature/spade_hackathon' of github.com:INM-6/elephant i…
Jan 14, 2020
02d1c67
changed spade in initializer so that extra-req are needed
Jan 14, 2020
b5bd11b
try it with statsmodels in the std. req.
Jan 14, 2020
2e77820
changed import fim
Jan 15, 2020
76c5162
changed name of pv_spec in test spade
Jan 15, 2020
b6f8301
change to numpy array also for Fast FCA implementation
Jan 15, 2020
504908c
Added Joint-ISI dithering
Jan 15, 2020
9de3b47
added unit test for joint-isi dithering
Jan 15, 2020
b9da182
Performance improvements
Kleinjohann Jan 15, 2020
8ac354b
Allow to use all surrogates methods of spike_train_surrogates
Jan 15, 2020
d803ece
Merge branch 'feature/spade_hackathon' of github.com:INM-6/elephant i…
Kleinjohann Jan 15, 2020
3fe9904
added Alessandras paper in reference
Jan 15, 2020
e7537ff
added documentation in spade function and in masking pvalue spectrum …
stellalessandra Jan 15, 2020
d0f095b
Deleted non ASCII-symbols
Jan 15, 2020
f26818e
fixed conflict
stellalessandra Jan 15, 2020
418f5de
updated documentation
stellalessandra Jan 15, 2020
2406a78
Performance improvements
Kleinjohann Jan 16, 2020
448d70b
Merge branch 'feature/spade_hackathon' of github.com:INM-6/elephant i…
Kleinjohann Jan 16, 2020
2b2e2e0
Refactor pattern_set_reduction
Kleinjohann Jan 16, 2020
61315c4
fixed psr and little bugs
stellalessandra Jan 17, 2020
9f5f1ab
Fix subset/superset confusion and code duplication
Kleinjohann Jan 17, 2020
8d9b180
Move approximate_stability parameters into a dict
Kleinjohann Jan 17, 2020
08490eb
Little changes
Jan 27, 2020
b896ee6
Merge branch 'master' into feature/spade_hackathon
Jan 27, 2020
f5d6dbc
Included defaultdict
Jan 27, 2020
486834e
refactored stability calculation
Jan 27, 2020
5f0d5fc
Small bug in comments
Jan 27, 2020
0d8b50d
Replaced FIM of Spade_src
Jan 27, 2020
832ee7e
restructured random subsets for stability
Jan 28, 2020
7171bfd
moved statsmodels to requirements-extra; refactored docstring format
dizcza Jan 28, 2020
ffaa0fb
refactoring leftovers
dizcza Jan 28, 2020
159cb2d
travis: check HAVE_FIM only for requirements-extra
dizcza Jan 28, 2020
d039770
Revert "travis: check HAVE_FIM only for requirements-extra"
dizcza Jan 28, 2020
d3394dd
import statsmodels only in test_signature_significance()
dizcza Jan 28, 2020
eb11bb4
ModuleNotFoundError -> ImportError
dizcza Jan 28, 2020
82191fa
changed stability docs
Jan 29, 2020
f2b229a
finished error function
Jan 29, 2020
464c47b
check all input in main function
Jan 30, 2020
2e19f54
Improvement of docstrings
Jan 30, 2020
227 changes: 164 additions & 63 deletions elephant/spade.py
@@ -93,7 +93,7 @@
def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
max_spikes=None, max_occ=None, min_neu=1, approx_stab_pars=None,
n_surr=0, dither=15 * pq.ms, spectrum='#',
alpha=1, stat_corr='fdr_bh', surr_method='dither_spikes',
alpha=None, stat_corr='fdr_bh', surr_method='dither_spikes',
psr_param=None, output_format='patterns'):
r"""
Perform the SPADE [1,2] analysis for the parallel spike trains given in the
@@ -186,12 +186,12 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
(number of spikes, number of occurrence, difference between last
and first spike of the pattern)
Default: '#'
alpha: float
The significance level of the hypothesis tests performed. If
`alpha == 1`, all the concepts are returned. If `0<alpha<1`, the
concepts are filtered according to their signature in the p-value
spectrum.
Default: 1
alpha: float, optional
The significance level of the hypothesis tests performed.
If `alpha is None`, all the concepts are returned.
If `0.<alpha<1.`, the concepts are filtered according to their
signature in the p-value spectrum.
Default: None
stat_corr: str
Method used for testing and adjustment of pvalues.
Can be either the full name or initial letters.
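
A minimal sketch of how the significance filtering documented above is meant to be invoked; the toy spike trains, the 5 ms bin size, and the choice alpha=0.05 are illustrative assumptions, not values taken from this PR:

    import numpy as np
    import neo
    import quantities as pq
    from elephant.spade import spade

    # two toy spike trains sharing t_start and t_stop (values are made up)
    np.random.seed(0)
    spiketrains = [
        neo.SpikeTrain(np.sort(np.random.uniform(0, 1000, 50)) * pq.ms,
                       t_start=0 * pq.ms, t_stop=1000 * pq.ms)
        for _ in range(2)]

    result = spade(spiketrains, binsize=5 * pq.ms, winlen=10,
                   n_surr=100,          # surrogates must be generated before alpha can act
                   alpha=0.05,          # keep signatures significant at the 5 % level
                   stat_corr='fdr_bh')  # Benjamini-Hochberg FDR correction
    print(result.keys())                # expected keys per the docstring above
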
@@ -257,9 +257,15 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
If n_surr==0 the p-values are set to 0.0.

If `output_format` is 'concepts', then `output['patterns']` is a
tuple of patterns which in turn are tuples of (spikes in the
pattern, occurrences of the pattern). For details see
:func:`concepts_mining`.
tuple of patterns which in turn are tuples of
1. element: spikes in the pattern
2. element: occurrences of the pattern
For details see :func:`concepts_mining`.
If stability is calculated:
3. element: intensional stability
4. element: extensional stability
For details see :func:`approximate_stability`.
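
For the 'concepts' output described by the list above, a hedged sketch of how the returned tuples might be unpacked, reusing the toy spiketrains from the earlier sketch; approx_stab_pars={'n_subsets': 60} is an arbitrary illustration, and the stability entries exist only because it is passed:

    result = spade(spiketrains, binsize=5 * pq.ms, winlen=10,
                   approx_stab_pars={'n_subsets': 60},
                   output_format='concepts')
    for concept in result['patterns']:
        pattern_spikes = concept[0]                  # 1. element: spikes in the pattern
        occurrences = concept[1]                     # 2. element: occurrences of the pattern
        int_stab, ext_stab = concept[2], concept[3]  # 3./4. element: stabilities
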


'pvalue_spectrum' (only if `n_surr > 0` and `n_subsets == 0`):
A list of signatures in tuples format
@@ -269,28 +275,6 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
'non_sgnf_sgnt': list
Non significant signatures of 'pvalue_spectrum'.

if n_subsets > 0:
(spikes in the pattern, occurrences of the patterns,
(intensional stability, extensional stability))
corresponding pvalue
The patterns are filtered depending on the parameters in input:
If stability_thresh==None and alpha==1:
output['patterns'] contains all the candidates patterns
(all concepts mined with the fca algorithm)
If stability_thresh!=None and alpha==1:
output contains only patterns candidates with:
intensional stability>stability_thresh[0] or
extensional stability>stability_thresh[1]
If stability_thresh==None and alpha!=1:
output contains only pattern candidates with a signature
significant in respect the significance level alpha corrected
If stability_thresh!=None and alpha!=1:
output['patterns'] contains only pattern candidates with a
signature significant in respect the significance level alpha
corrected and such that:
intensional stability>stability_thresh[0] or
extensional stability>stability_thresh[1]

Notes
-----
If detected, this function will use MPI to parallelize the analysis.
@@ -326,20 +310,14 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
else:
rank = 0

if output_format not in ['concepts', 'patterns']:
raise ValueError("The output_format value has to be"
"'patterns' or 'concepts'")
if surr_method not in surr.SURR_METHODS:
raise ValueError(
'specified surr_method (=%s) not valid' % surr_method)
compute_stability = False
if isinstance(approx_stab_pars, dict):
try:
compute_stability = approx_stab_pars['n_subsets'] > 0
except KeyError:
raise ValueError(
'for approximate stability computation you need to '
'pass n_subsets')
compute_stability = _check_input(
spiketrains=spiketrains, binsize=binsize, winlen=winlen,
min_spikes=min_spikes, min_occ=min_occ,
max_spikes=max_spikes, max_occ=max_occ, min_neu=min_neu,
approx_stab_pars=approx_stab_pars,
n_surr=n_surr, dither=dither, spectrum=spectrum,
alpha=alpha, stat_corr=stat_corr, surr_method=surr_method,
psr_param=psr_param, output_format=output_format)

time_mining = time.time()
if rank == 0 or compute_stability:
@@ -384,9 +362,6 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
time_pvalue_spectrum))
# Storing pvalue spectrum
output['pvalue_spectrum'] = pv_spec
elif 0 < alpha < 1:
warnings.warn('0<alpha<1 but p-value spectrum has not been '
'computed (n_surr==0)')

# rank!=0 returning None
if rank != 0:
@@ -397,7 +372,7 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
ns_signatures = []
# Decide whether filter concepts with psf
if n_surr > 0:
if len(pv_spec) > 0:
if len(pv_spec) > 0 and alpha is not None:
# Computing non-significant entries of the spectrum applying
# the statistical correction
ns_signatures = test_signature_significance(
@@ -430,6 +405,124 @@ def spade(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
return output


def _check_input(
spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
max_spikes=None, max_occ=None, min_neu=1, approx_stab_pars=None,
n_surr=0, dither=15 * pq.ms, spectrum='#',
alpha=None, stat_corr='fdr_bh', surr_method='dither_spikes',
psr_param=None, output_format='patterns'):
"""
Checks all input given to SPADE
Parameters
----------
see :func:`spade`

Returns
-------
compute_stability: bool
True if the stability computation will be performed
"""

# Check spiketrains
if not all([isinstance(elem, neo.SpikeTrain) for elem in spiketrains]):
raise TypeError(
'spiketrains must be a list of SpikeTrains')
# Check that all spiketrains have same t_start and same t_stop
if not all([spiketrain.t_start == spiketrains[0].t_start
for spiketrain in spiketrains]) or \
not all([spiketrain.t_stop == spiketrains[0].t_stop
for spiketrain in spiketrains]):
raise ValueError(
'All spiketrains must have the same t_start and t_stop')

# Check binsize
if not isinstance(binsize, pq.Quantity):
raise ValueError('binsize must be a pq.Quantity')

# Check winlen
if not isinstance(winlen, int):
raise ValueError('winlen must be an integer')

# Check min_spikes
if not isinstance(min_spikes, int):
raise ValueError('min_spikes must be an integer')

# Check min_occ
if not isinstance(min_occ, int):
raise ValueError('min_occ must be an integer')

# Check max_spikes
if not (isinstance(max_spikes, int) or max_spikes is None):
raise ValueError('max_spikes must be an integer or None')

# Check max_occ
if not (isinstance(max_occ, int) or max_occ is None):
raise ValueError('max_occ must be an integer or None')

# Check min_neu
if not isinstance(min_neu, int):
raise ValueError('min_neu must be an integer')

# Check approx_stab_pars
compute_stability = False
if isinstance(approx_stab_pars, dict):
if 'n_subsets' in approx_stab_pars.keys() or\
('epsilon' in approx_stab_pars.keys() and
'delta' in approx_stab_pars.keys()):
compute_stability = True
else:
raise ValueError(
'for approximate stability computation you need to '
'pass n_subsets or epsilon and delta.')

# Check n_surr
if not isinstance(n_surr, int):
raise ValueError('n_surr must be an integer')

# Check dither
if not isinstance(dither, pq.Quantity):
raise ValueError('dither must be a pq.Quantity')

# Check spectrum
if spectrum not in ('#', '3d#'):
raise ValueError("spectrum must be '#' or '3d#'")

# Check alpha
if isinstance(alpha, (int, float)):
# Check redundant use of alpha
if 0. < alpha < 1. and n_surr == 0:
warnings.warn('0.<alpha<1. but p-value spectrum has not been '
'computed (n_surr==0)')
elif alpha is not None:
raise ValueError('alpha must be an integer, a float or None')

# Check stat_corr:
if stat_corr not in \
['bonferroni', 'sidak', 'holm-sidak', 'holm',
'simes-hochberg', 'hommel', 'fdr_bh', 'fdr_by',
'fdr_tsbh', 'fdr_tsbky', '', 'no']:
raise ValueError("Parameter stat_corr not recognized")

# Check surr_method
if surr_method not in surr.SURR_METHODS:
raise ValueError(
'specified surr_method (=%s) not valid' % surr_method)

# Check psr_param
if psr_param is not None:
if not isinstance(psr_param, list):
raise ValueError('psr_param must be None or a list of integers')
if not all(isinstance(param, int) for param in psr_param):
raise ValueError('elements of psr_param must be integers')

# Check output_format
if output_format not in ('concepts', 'patterns'):
raise ValueError("The output_format value has to be"
"'patterns' or 'concepts'")

return compute_stability
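
The validation above runs before any mining starts; a small sketch of the kind of error it raises, reusing the toy spiketrains from the earlier example and passing a deliberately wrong binsize (the approx_stab_pars values are illustrative):

    try:
        spade(spiketrains, binsize=5, winlen=10)   # plain int instead of a pq.Quantity
    except ValueError as err:
        print(err)                                 # -> 'binsize must be a pq.Quantity'

    # either dict form below is accepted for the stability computation
    spade(spiketrains, binsize=5 * pq.ms, winlen=10,
          approx_stab_pars={'n_subsets': 60})
    spade(spiketrains, binsize=5 * pq.ms, winlen=10,
          approx_stab_pars={'epsilon': 0.05, 'delta': 0.05})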


def concepts_mining(spiketrains, binsize, winlen, min_spikes=2, min_occ=2,
max_spikes=None, max_occ=None, min_neu=1, report='a'):
"""
@@ -1332,10 +1425,12 @@ def _get_max_occ(surr_concepts, min_spikes, max_spikes, winlen, spectrum):
return max_occ


def _stability_filter(concept, stab_thr):
def _stability_filter(concept, stability_thresh):
"""Criteria by which to filter concepts from the lattice"""
# stabilities larger then min_st
keep_concept = concept[2] > stab_thr[0] or concept[3] > stab_thr[1]
# stabilities larger than stability_thresh
keep_concept = \
concept[2] > stability_thresh[0]\
or concept[3] > stability_thresh[1]
return keep_concept
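
A short sketch of how this filter is typically applied, assuming `concepts` is a list of concept tuples that already carry the two stability values; the threshold pair [0.7, 0.8] is purely illustrative:

    stability_thresh = [0.7, 0.8]
    stable_concepts = [concept for concept in concepts
                       if _stability_filter(concept, stability_thresh)]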


@@ -1539,7 +1634,8 @@ def _pattern_spectrum_filter(concept, ns_signatures, spectrum, winlen):
return keep_concept


def approximate_stability(concepts, rel_matrix, n_subsets, delta=0, epsilon=0):
def approximate_stability(concepts, rel_matrix, n_subsets=0,
delta=0., epsilon=0.):
r"""
Approximate the stability of concepts. Uses the algorithm described
in Babin, Kuznetsov (2012): Approximating Concept Stability
@@ -1561,7 +1657,7 @@ def approximate_stability(concepts, rel_matrix, n_subsets, delta=0, epsilon=0):
no spikes or 1 if one or more spikes occurred in that bin for that
particular neuron. For example, the entry [0,0] of this matrix
corresponds to the first bin of the first window position for the first
neuron, the entry `[0,winlen]` to the first bin of the first window
neuron, the entry `[0, winlen]` to the first bin of the first window
position for the second neuron.
n_subsets: int
Number of subsets of a concept used to approximate its stability. If
@@ -1572,13 +1668,13 @@ def approximate_stability(concepts, rel_matrix, n_subsets, delta=0, epsilon=0):

.. math::
n_{\text{subsets}} = \frac{1}{2\epsilon^2} \ln\left(\frac{2}{\delta}\right) + 1

delta: float, optrional
delta: probability with at least :math:`1-\delta`
Default: 0
delta: float, optional
delta: probability with at least :math:`1-\delta`
Default: 0.
epsilon: float, optional
epsilon: absolute error
Default: 0
Default: 0.

Returns
-------
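
A worked instance of the bound above, mirroring the computation the function performs when n_subsets == 0; delta = epsilon = 0.05 are arbitrary illustrative values:

    import numpy as np
    delta, epsilon = 0.05, 0.05
    n_subsets = int(round(np.log(2. / delta) / (2 * epsilon ** 2) + 1))
    print(n_subsets)   # 739
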
@@ -1609,8 +1705,13 @@ def approximate_stability(concepts, rel_matrix, n_subsets, delta=0, epsilon=0):
else:
rank = 0
size = 1
if n_subsets <= 0 and delta + epsilon <= 0:
raise ValueError('n_subsets has to be >=0 or delta + epsilon > 0')
if not (isinstance(n_subsets, int) and n_subsets >= 0):
raise ValueError('n_subsets must be an integer >=0')
if n_subsets == 0 and not (isinstance(delta, float) and delta > 0. and
isinstance(epsilon, float) and epsilon > 0.):
raise ValueError('delta and epsilon must be floats > 0., '
'given that n_subsets = 0')

if len(concepts) == 0:
return []
if len(concepts) <= size:
@@ -1620,8 +1721,8 @@ def approximate_stability(concepts, rel_matrix, n_subsets, delta=0, epsilon=0):
range(0, len(concepts) - len(concepts) % size + 1,
len(concepts) // size)) + [len(concepts)]
# Calculate optimal n
if delta + epsilon > 0 and n_subsets == 0:
n_subsets = np.log(2. / delta) / (2 * epsilon ** 2) + 1
if n_subsets == 0:
n_subsets = int(round(np.log(2. / delta) / (2 * epsilon ** 2) + 1))

if rank == 0:
concepts_on_partition = concepts[rank_idx[rank]:rank_idx[rank + 1]] + \