Change points reported different from MOA - Method _detect_change #1614

Closed
denisesato opened this issue Sep 11, 2024 · 0 comments

Versions

river version: 0.21.2
Python version: 3.12
Operating system: Windows 11

Describe the bug

Steps/code to reproduce

# Sample code to reproduce the problem

from river import drift
import pandas as pd

def read_attribute(complete_filename, attribute):
    print(f'Reading file {complete_filename}...')
    df = pd.read_csv(complete_filename)
    durations = df[attribute].tolist()
    return durations

def analyze_changes(attribute_series, delta):
    detector = drift.ADWIN(delta=delta)
    num_instances = 0
    changes = []
    # normalized_durations = normalize([durations], norm="max")[0].tolist()
    # for d in normalized_durations:
    for value in attribute_series:
        num_instances += 1
        detector.update(value)
        print(f'ADWIN ({num_instances}): variance={detector.variance} - estimation={detector.estimation} - width={detector.width}')
        if detector.drift_detected:
            changes.append(num_instances)
            print(
                f'****** Change detected in data: {value} - at instance: {num_instances}')
    print(f'---- Changes detected at: {changes} -----')
    return changes


if __name__ == '__main__':
    complete_filename = 'data/input/ARFFLogArtificial01P300C10A.csv'
    series = read_attribute(complete_filename, 'Duration')
    drifts = analyze_changes(attribute_series=series, delta=0.05)

// Sample code to reproduce in MOA
import com.github.javacliparser.FloatOption;
import moa.classifiers.core.driftdetection.ADWINChangeDetector;
import moa.core.InstanceExample;
import moa.streams.ArffFileStream;

import java.util.ArrayList;
import java.util.List;

public class TestAdwin {

    public static void main(String[] args) {
        String filename = "data/input/ARFFLogArtificial01P300C10A.arff";
        ADWINChangeDetector detector = new ADWINChangeDetector();
        detector.deltaAdwinOption = new FloatOption("deltaAdwin", 'a',
                "Delta of Adwin change detection", 0.05, 0.0, 1.0);

        ArffFileStream reader = new ArffFileStream();
        reader.arffFileOption.setValue(filename);

        // stores the detected changes for each detector
        List<Integer> changes = new ArrayList<Integer>();
        System.out.println("--- Reading: " + filename);
        reader.prepareForUse();
        reader.restart();
        int qtdInstancias = 0;

        detector.prepareForUse();
        detector.resetLearning();

        while (reader.hasMoreInstances()) {
            qtdInstancias++;
            InstanceExample instanceExample = reader.nextInstance();
            double val = instanceExample.getData().value(0);
            detector.input(val);

            System.out.println("ADWIN estimation (" + qtdInstancias + "): " + detector.getEstimation());

            // check whether a change was detected
            if (detector.getChange()) {
                System.out.println("--- Change detected at: " + qtdInstancias);
                changes.add(qtdInstancias);
                detector.resetLearning();
            }
        }
        System.out.println("--- Changes at: " + changes);
    }

}

ARFFLogArtificial01P300C10A.csv

denisesato added a commit to denisesato/river that referenced this issue Sep 11, 2024
smastelini pushed a commit that referenced this issue Sep 11, 2024
* Bugfix in the _detect_change method, based on the MOA source code. The index k should run from 0 to bucket.current_idx - 1. The previous code, "for k in range(bucket.current_idx - 1):", only incremented k up to bucket.current_idx - 2, because range excludes its stop value.

* Tests updated due to bugfix on the ADWIN change detector - reported on issue #1614
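
The off-by-one described in the commit message can be illustrated in isolation. This is only a sketch: the two helper functions are hypothetical and do not exist in river, and it assumes the fix amounts to iterating over range(bucket.current_idx), which is what "k from 0 to bucket.current_idx - 1" implies.

```python
def visited_before_fix(current_idx):
    """Indices k visited by the old loop: range(current_idx - 1)."""
    # range excludes its stop value, so the last k is current_idx - 2.
    return list(range(current_idx - 1))


def visited_after_fix(current_idx):
    """Indices k visited when k runs from 0 to current_idx - 1 (assumed fix)."""
    return list(range(current_idx))


if __name__ == '__main__':
    current_idx = 4  # hypothetical bucket.current_idx, for illustration only
    print(f'before fix: {visited_before_fix(current_idx)}')
    print(f'after fix:  {visited_after_fix(current_idx)}')
```

With the old loop the last index of each bucket is never visited, so the corresponding cut point is never tested; that is one plausible reason the reported change points differed from MOA.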