readdlm slow #29036

Closed
mschauer opened this issue Sep 4, 2018 · 0 comments · Fixed by #29075
Labels: compiler:inference (Type inference), performance (Must go faster), regression (Regression in behavior compared to a previous version)

mschauer commented Sep 4, 2018

The deprecation warning for readcsv reads:

julia> @time dat = readcsv("./data/EURUSD-2015-03.csv")
┌ Warning: `readcsv(io; opts...)` is deprecated, use `readdlm(io, ','; opts...)` instead.

with additional info given by

ERROR: WARNING: Base.readdlm is deprecated: it has been moved to the standard library package `DelimitedFiles`.
Add `using DelimitedFiles` to your imports.
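
Taken together, the two messages suggest the following migration. This is a minimal sketch on a tiny temporary file; the file contents are made up for illustration, and only `readdlm(io, ',')` and `using DelimitedFiles` come from the messages above.

```julia
using DelimitedFiles   # readdlm now lives in this standard library package

# Write a small throwaway CSV file (contents are illustrative only).
path, io = mktemp()
write(io, "a,1,2.5\nb,2,3.5\n")
close(io)

# Previously: readcsv(path)
small = readdlm(path, ',')   # mixed columns give a Matrix{Any}

rm(path)
```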

On 0.6, reading the file into a 5177037×4 Matrix{Any} takes about 26 seconds:

julia> @time dat = readcsv("./data/EURUSD-2015-03.csv")
 25.890791 seconds (83.21 M allocations: 2.709 GiB, 13.91% gc time)
5177037×4 Array{Any,2}
 "EUR/USD"  "20150302 00:00:00.237"  1.11603  1.11614
...

On 1.0, the same read with readdlm takes about 319 seconds, roughly 12 times slower:

julia> using DelimitedFiles
julia> @time dat = readdlm("./data/EURUSD-2015-03.csv", ',')
318.996088 seconds (1.03 G allocations: 23.877 GiB, 6.93% gc time)
5177037×4 Array{Any,2}:
 "EUR/USD"  "20150302 00:00:00.237"  1.11603  1.11614
...
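
The data file itself is not attached, so a rough way to try the measurement is with a synthetic file of the same shape. The sketch below assumes every row repeats the first row shown above, and the row count (`nrows`) is an arbitrary small number, not the size of the real file.

```julia
using DelimitedFiles

# Synthetic stand-in for the real file: two string columns, two float columns.
# nrows is an arbitrary choice for illustration.
nrows = 10_000
path, io = mktemp()
for _ in 1:nrows
    println(io, "EUR/USD,20150302 00:00:00.237,1.11603,1.11614")
end
close(io)

dat = readdlm(path, ',')                 # first call also pays compilation cost
elapsed = @elapsed readdlm(path, ',')    # time a second, already compiled call
println("read ", size(dat, 1), " rows in ", elapsed, " seconds")

rm(path)
```

Timings from a file this small will not match the figures above; raising `nrows` gives a closer comparison between Julia versions.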

JeffBezanson added the performance (Must go faster) and regression (Regression in behavior compared to a previous version) labels Sep 4, 2018
JeffBezanson self-assigned this Sep 6, 2018
JeffBezanson added the compiler:inference (Type inference) label Sep 6, 2018
JeffBezanson added a commit that referenced this issue Sep 7, 2018
In this case, the result of `iterate` has not been checked for
`nothing`, so we try to call `indexed_iterate` (for destructuring
assignment) on a Union of Nothing and the tuple returned by
`iterate`. That has two method matches, and so was excluded from
constant propagation. This commit fixes that by generalizing the
constant prop heuristic from requiring one method match to
requiring one non-Bottom method match.

This issue caused a large slowdown in DelimitedFiles, where
the inner loop consists of

```
        while idx <= slen
            val,idx = iterate(dbuff, idx)
```
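
The loop shape from the commit message can be sketched on a plain `String` as follows. This is an illustration under stated assumptions, not the DelimitedFiles source: the function names are hypothetical, and `dbuff` here is an ordinary string rather than the package's buffer. The first function destructures the result of `iterate` directly, as in the snippet above; the second checks for `nothing` first.

```julia
# Hypothetical helper: destructure the result of iterate without a nothing check.
# The loop condition keeps idx in range, so iterate always returns a tuple here,
# but its declared result is still "nothing or a tuple".
function count_commas_unchecked(dbuff::String)
    n = 0
    idx = 1
    slen = sizeof(dbuff)
    while idx <= slen
        val, idx = iterate(dbuff, idx)
        val == ',' && (n += 1)
    end
    return n
end

# Hypothetical helper: same count, but the result is checked before destructuring.
function count_commas_checked(dbuff::String)
    n = 0
    next = iterate(dbuff)
    while next !== nothing
        val, idx = next
        val == ',' && (n += 1)
        next = iterate(dbuff, idx)
    end
    return n
end
```

Both functions return the same count. Running `@code_warntype` on each is one way to see how a given Julia version infers the destructuring in the first form.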
Keno pushed a commit that referenced this issue Sep 8, 2018
KristofferC pushed a commit that referenced this issue Sep 10, 2018

(cherry picked from commit 957b9c0)
KristofferC pushed a commit that referenced this issue Feb 11, 2019

(cherry picked from commit 957b9c0)