Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make getting style violations thread-safe and parallelize linting #1103

Merged
merged 11 commits into from
Jan 2, 2017

Conversation

jpsim
Copy link
Collaborator

@jpsim jpsim commented Dec 31, 2016

Addresses #1077. Thread sanitizer seems happy with this when running unit tests and linting a few projects I could find, including the Swift standard library.

Generally speeds up linting by ~30% to ~140% on my 2013 MacBook Pro, though the gains should be higher on more recent CPUs.

  • apple/swift: 218s to 166s (31% faster)
  • realm/realm-cocoa: 9.6s to 5.6s (71% faster)
  • Carthage/Carthage: 96s to 53s (80% faster)
  • jpsim/SourceKitten: 4.2s to 2.1s (100% faster)
  • realm/SwiftLint: 6.8s to 3.3s (136% faster)

I had initially added a --parallel flag to the lint command, but have since removed that option since I haven't run into any issues with it so far, and couldn't come up with a compelling reason to hide it behind an opt-in flag.

@SwiftLintBot
Copy link

SwiftLintBot commented Dec 31, 2016

1 Warning
⚠️ This PR may need tests.

Generated by 🚫 danger

let uuid = NSUUID().uuidString
let jobCount = Int(ceil(Double(count) / Double(maxConcurrentJobs)))

let queueLabelPrefix = "io.realm.SwiftLintFramework.map.\(uuid)"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious whether there is a reason you didn't use DispatchQueue.concurrentPerform here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use it because I wasn't aware of it! Thanks for sharing 😊

The docs are non-existent, unfortunately: https://developer.apple.com/reference/dispatch/dispatchqueue/2016088-concurrentperform

But doing a bit of digging, the Dispatch Swift overlay seems to implement that function with dispatch_apply, which is well documented here: https://developer.apple.com/reference/dispatch/1452846-dispatch_apply_f

It seems to be exactly what I've reimplemented here, so I think I'll refactor to use that instead.

private var _allDeclarationsByType = [String: [String]]()
private var queueForRebuild = [Structure]()
private struct RebuildQueue {
private let lock = NSLock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure here, but wouldn't using a queue be faster?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes you think that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the impression that NSLocks are really expensive, but can't find any good references right now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, locking in general has considerable overhead, but that goes for whether you're using NSLock or GCD queues...

Actually, some of the patterns here could avoid some locking by using lock-free data structures, since they're append-only and we end up sorting them afterwards, but I couldn't find good resources on lock-free data structures in Swift, so I've punted on that for now.

@@ -16,6 +17,8 @@ struct LintCommand: CommandProtocol {
let verb = "lint"
let function = "Print lint warnings and errors (default command)"

private static let violationsAccumulatorQueue = DispatchQueue(label: "io.realm.swiftlint.violationsAccumulator")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know probably no one will use multiple linters at once, but shouldn't this queue be per instance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 253e1f4.

@marcelofabri
Copy link
Collaborator

marcelofabri commented Dec 31, 2016

I got an exception in one of my runs:

Linting 'SchemaTests.swift' (29/41)
[2]    95692 illegal hardware instruction  
parallel_swiftlint/swiftlint lint

@marcelofabri
Copy link
Collaborator

And another one:

swiftlint(95787,0x7000043ed000) malloc: *** error for object 0x7f8e41915d18: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
[2]    95787 abort      parallel_swiftlint/swiftlint lint 

@marcelofabri
Copy link
Collaborator

If it helps, I'm running on a 6-Core 2013 Mac Pro.

@marcelofabri
Copy link
Collaborator

Actually this just happens with the --benchmark option.

also remove parallelForEach from SwiftLintFramework's public interface
since calling the GCD API is almost as concise and avoids polluting the
public API.

Finally, also remove the "fast" path for parallelMap since GCD should
do the reasonable thing for us here.
@jpsim
Copy link
Collaborator Author

jpsim commented Jan 1, 2017

Yeah, I had neglected to make recording benchmarks thread-safe. Done in 253e1f4.

@codecov-io
Copy link

codecov-io commented Jan 1, 2017

Current coverage is 82.41% (diff: 48.95%)

Merging #1103 into master will decrease coverage by 0.10%

@@             master      #1103   diff @@
==========================================
  Files           151        151          
  Lines          7209       7279    +70   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           5949       5999    +50   
- Misses         1260       1280    +20   
  Partials          0          0          

Powered by Codecov. Last update 6b2338c...3608cf4

Copy link
Collaborator

@marcelofabri marcelofabri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 👏 is 👏 awesome! 👏

@norio-nomura
Copy link
Collaborator

I have gotten crashes twice on linting:
https://gist.github.com/norio-nomura/e8dc253954ec8299e1e647aad3d99398

}
autoreleasepool {
Copy link
Collaborator

@norio-nomura norio-nomura Jan 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is autoreleasepool removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Nice catch! Re-added in 3608cf4.

@jpsim
Copy link
Collaborator Author

jpsim commented Jan 1, 2017

Are those crashes reproducible?

@jpsim
Copy link
Collaborator Author

jpsim commented Jan 2, 2017

@norio-nomura what project(s) were you linting when you experienced these crashes?

@norio-nomura
Copy link
Collaborator

I experienced on Carthage. It happens rarely. But it happened first and second try with this PR.

@jpsim
Copy link
Collaborator Author

jpsim commented Jan 2, 2017

Hmmm, I ran this with Thread Sanitizer enabled against Carthage with all submodules recursively cloned 5 times in a row without hitting any issues. Xcode 8.2 with just a bare swiftlint command, no extra configuration files or CLI arguments.

The only change I've made was to enable all opt-in rules:

--- a/Source/SwiftLintFramework/Models/Configuration.swift
+++ b/Source/SwiftLintFramework/Models/Configuration.swift
@@ -91,11 +91,7 @@ public struct Configuration: Equatable {
                 return whitelistRules.contains(type(of: rule).description.identifier)
             }
         } else {
-            rules = configuredRules.filter { rule in
-                let id = type(of: rule).description.identifier
-                if validDisabledRules.contains(id) { return false }
-                return optInRules.contains(id) || !(rule is OptInRule)
-            }
+            rules = configuredRules
         }

result.append(jobIndexAndResults)
}
}
return result.sorted { $0.0 < $1.0 }.flatMap { $0.1 }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the flatMap correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, should've been changed to map in 8774a92. Fixed in 53d0cf6. Thanks!

}
lock.unlock()
let value = factory(file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factory() invocation might be happen in parallel for same key file because it unlocked() while running factory().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That shouldn't be a problem when factory is already thread-safe. But anyway, I've refactored to perform the factory while locked in 8906da6, to make sure this keeps working if factory is ever not thread-safe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for change.

I thought that it should be locked for functionality of Cache even if factory would be already thread-safe. Because cached contents possibly be reference type, and previous implementation could return difference instance for same key.

jpsim added 2 commits January 2, 2017 09:52
* master:
  use first(where:) instead of for loop in RulesCommand.swift
  remove unnecessary imports from AutoCorrectCommand.swift
  Typo in changelog
  Remove unnecessary import
  Avoid duplicating default value
  Update number_separator configuration in SwiftLint
  number_separator can be configured with a min length
  Enable more opt-in rules in SwiftLint
  Fix out of range exception in AttributesRule
  Fix trailing_comma crash
@jpsim
Copy link
Collaborator Author

jpsim commented Jan 2, 2017

I appreciate everyone's reviews on this! Thanks @norio-nomura, @marcelofabri and special guest @krzysztofzablocki 🙏.

I think I'll be merging this despite @norio-nomura's crashes, since neither of us appear to be able to reproduce them now.

@jpsim jpsim merged commit 2ff0348 into master Jan 2, 2017
@jpsim jpsim deleted the jp-parallelize branch January 2, 2017 18:12
@marcelofabri marcelofabri mentioned this pull request Jan 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants