Commit

Update model based on a few false positives
ktos committed Dec 9, 2019
1 parent eefc445 commit 01c91fa
Showing 6 changed files with 152 additions and 38 deletions.
8 changes: 7 additions & 1 deletion README.md
@@ -135,12 +135,18 @@ export ELEIA_Logging__LogLevel__Default=Error
It uses the ML.NET framework to perform machine learning-based binary classification,
and AutoML (`mlnet auto-train`) was used to generate the model.

Exact run:

```batch
mlnet auto-train -T binary-classification -d trainingdata2.tsv -o ML --label-column-name code -x 180
```

[trainingdata2.tsv](https://github.com/ktos/Eleia/blob/master/trainingdata2.tsv)
is the file the training was performed on.

[trainingdata2.log](https://github.com/ktos/Eleia/blob/master/trainingdata2.log)
is the auto-training log; SdcaLogisticRegressionBinary was selected, with an
-accuracy of 0.9585.
+accuracy of 0.9572.

I also ran training sessions longer than 180 seconds, but the same algorithm
was still selected as the best; the extra time had no visible impact.
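
For orientation, the accuracy figure reported by auto-training is the fraction of evaluation examples the model labels correctly. The C# sketch below shows only that calculation; the class name `AccuracySketch` and the counts in the usage note are hypothetical and are not taken from trainingdata2.log.

```csharp
using System;

// Hypothetical helper for illustration; not part of Eleia or ML.NET.
public static class AccuracySketch
{
    // Accuracy = correctly classified examples / all evaluated examples.
    public static double Accuracy(int truePositives, int trueNegatives,
                                  int falsePositives, int falseNegatives)
    {
        int total = truePositives + trueNegatives + falsePositives + falseNegatives;
        if (total == 0)
            throw new ArgumentException("At least one example is required.");

        return (double)(truePositives + trueNegatives) / total;
    }
}
```

With made-up counts, `AccuracySketch.Accuracy(450, 500, 20, 30)` evaluates to 0.95. Accuracy alone does not show how the errors split between false positives and false negatives, so two models with nearly equal accuracy can still differ in how often they raise false alarms.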
8 changes: 7 additions & 1 deletion README.pl.md
@@ -138,11 +138,17 @@ export ELEIA_Logging__LogLevel__Default=Error
ML.NET is used to run the binary classification model, while
AutoML (i.e. `mlnet auto-train`) was used to generate the model.

Specifically:

```batch
mlnet auto-train -T binary-classification -d trainingdata2.tsv -o ML --label-column-name code -x 180
```

[trainingdata2.tsv](https://github.com/ktos/Eleia/blob/master/trainingdata2.tsv)
is the file the model was trained on.

[trainingdata2.log](https://github.com/ktos/Eleia/blob/master/trainingdata2.log)
-is the auto-training log; SdcaLogisticRegressionBinary, with an accuracy of 0.9585, was
+is the auto-training log; SdcaLogisticRegressionBinary, with an accuracy of 0.9572, was
ultimately used.

I also ran training sessions longer than 180 seconds, but this algorithm still
Binary file modified src/MLModel.zip
Binary file not shown.
4 changes: 2 additions & 2 deletions test/PostAnalyzerTests.cs
@@ -96,11 +96,11 @@ public void Analyze_PostWithLinksThreshold08_ProblemsFound()
}

[Fact]
-public void Analyze_PostWithUnformattedCodeThreshold098_UnformattedCodeFound()
+public void Analyze_PostWithUnformattedCodeThreshold097_UnformattedCodeFound()
{
var myConfiguration = new Dictionary<string, string>
{
-{"threshold", "0.98"}
+{"threshold", "0.97"}
};

var configuration = new ConfigurationBuilder()
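
The test above passes a `threshold` setting of 0.97 to the analyzer. How Eleia applies that value is not shown in this diff; one plausible reading is that a post is flagged only when the model's predicted probability reaches the threshold. The sketch below illustrates that reading, with a plain dictionary standing in for the configuration object. `ThresholdSketch`, `IsFlagged`, and the 0.975 probability in the usage note are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical illustration; not Eleia's actual implementation.
public static class ThresholdSketch
{
    // Returns true when the predicted probability reaches the configured threshold.
    public static bool IsFlagged(float probability, IDictionary<string, string> settings)
    {
        // Parse with the invariant culture so "0.97" is read the same way
        // regardless of the machine's locale (some locales use a decimal comma).
        float threshold = float.Parse(settings["threshold"], CultureInfo.InvariantCulture);
        return probability >= threshold;
    }
}
```

Under this assumption, a post scored at 0.975 would be flagged with `"threshold"` set to `"0.97"` but not with `"0.98"`, so a retrained model that scores the same post slightly lower may need a lower threshold in its test.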
