Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

striding(by:) sometimes slower than expected #63

Closed
timvermeulen opened this issue Jan 16, 2021 · 4 comments · Fixed by #88
Closed

striding(by:) sometimes slower than expected #63

timvermeulen opened this issue Jan 16, 2021 · 4 comments · Fixed by #88

Comments

@timvermeulen
Copy link
Contributor

This code should be fast, but isn't:

for x in (0..<10_000_000).striding(by: 1_000_000) {
  // ...
}

Stride's iterator wraps the base collection's iterator, and can't skip over multiple elements at a time. By forcing the iterator to be an IndexingIterator, the code is fast again:

for x in (0..<10_000_000).striding(by: 1_000_000)[...] {
  // ...
}

I don't think we can vary the Stride.Iterator associated type based on whether Base conforms to Collection, so the only way to fix this that I can think of is to split up Stride into two types (with two overloads), one for sequences and one for collections.

@ollieatkinson
Copy link
Contributor

Thanks very much for raising this @timvermeulen - this is certainly not the performance one would expect when using this sequence. Nice catch 🎣 !

As you mention - we can implement the improved algorithm by abstracting away the iterator into it's own type to allow for the optimisation to apply in the case where Base is Collection.

extension Sequence {
    public func striding(by step: Int) -> StrideSequence<Self> {
        StrideSequence(base: self, stride: step)
    }
}

extension Collection {
    public func striding(by step: Int) -> Stride<Self> {
        Stride(base: self, stride: step)
    }
}

public struct StrideSequence<Base: Sequence>: Sequence {
    
    let base: Base
    let stride: Int
    
    init(base: Base, stride: Int) {
        precondition(stride > 0, "striding must be greater than zero")
        self.base = base
        self.stride = stride
    }
}

extension StrideSequence {
    
    public struct Iterator: IteratorProtocol {
        
        var iterator: Base.Iterator
        let stride: Int
        var striding: Bool = false
        
        public mutating func next() -> Base.Element? {
            guard striding else {
                striding = true
                return iterator.next()
            }
            for _ in 0..<stride - 1 {
                guard iterator.next() != nil else { break }
            }
            return iterator.next()
        }
    }
    
    public func makeIterator() -> StrideSequence<Base>.Iterator {
        Iterator(iterator: base.makeIterator(), stride: stride)
    }
}

@ollieatkinson
Copy link
Contributor

I had a go at implementing this change (ollieatkinson@70fead5) but I'm unsure how best to validate the improvement.

When running similar examples to the one you gave with measure I get the following results:

  func testMeasureSequence() {
    measure {
      for _ in (0..<100_000_000).striding(by: 1_000) { }
    }
  }
  
  func testMeasureCollection() {
    measure {
      for _ in (0..<100_000_000).striding(by: 1_000)[...] { }
    }
  }

image
image

↑ This is the opposite to what I would expect

@natecook1000
Copy link
Member

Both of those look like they're using the collection version. You can opt into the sequence like this: let s: StrideSequence = (0..<100_000_000).striding(by: 1_000)

@ollieatkinson
Copy link
Contributor

Of course! Thanks, I've raised the P/R - I didn't commit the performance tests as I was unsure since there was no other examples of this being done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants