Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that bundle identifiers are valid URL host components #1069

Merged
merged 21 commits into from
Nov 21, 2024

Conversation

d-ronnqvist
Copy link
Contributor

@d-ronnqvist d-ronnqvist commented Oct 22, 2024

Bug/issue #, if applicable: rdar://135335645 #290

Summary

This ensures that all bundle identifiers strings are valid URL host components by replacing connective sequences of unsupported characters with "-". This happens in a new DocumentationBundle.Identifier type.

That change is rather small. The rest of this PR updates all the calling code and API to use this type safe API instead of string identifiers to ensure that every place that creates a bundle identifier creates a safe one.

Note: because some public API already defined var identifier: String or var bundleIdentifier: String I couldn't use the name "identifier" to refer to the new DocumentationBundle.Identifier variable without source breaking changes. Instead, I opted to use var id: DocumentationBundle.Identifier or var bundleID: DocumentationBundle.Identifier for the new variables.

Dependencies

None

Testing

  • Build any documentation and pass --fallback-bundle-identifier "Some value with spaces".
  • DocC should successfully build the documentation.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • [ ] Updated documentation if necessary

@d-ronnqvist d-ronnqvist force-pushed the validate-bundle-identifier branch 2 times, most recently from 50e9c91 to f86a125 Compare October 24, 2024 11:56
@d-ronnqvist d-ronnqvist force-pushed the validate-bundle-identifier branch from f86a125 to 4805a05 Compare October 25, 2024 08:42
@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@sofiaromorales
Copy link
Contributor

Thanks for the excellent commit history David! This makes it so easy to review the PR.

@sofiaromorales
Copy link
Contributor

I’ve reviewed up to this commit and will go through the rest tomorrow.

Overall, it looks good. However, there are instances where the renaming seems unnecessary and adds confusion by using id as a generic attribute name without clarifying that it refers to the bundle identifier.

Copy link
Contributor

@sofiaromorales sofiaromorales left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like part of the comment I did earlier got addressed in a follow up commit (4805a05). I still recommend that we go over the other uses of id and rename them to bundleId or similar.

Besides that this looks good to me!

/// The bundle identifier for the reference resolver in the other process.
public let bundleIdentifier: String
public let id: DocumentationBundle.Identifier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename this to `bundleIdentifier.

Naming this id can be confusing when invoking the property like outOfProcessReferenceResolver.id since it might convey that this is calling the identifier of the reference resolver process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't name it bundleIdentifier because there's already a property with that name but I can name it bundleID.

Like I said in this comment I'm planning on deprecating the term "bundle" because it's causing confusion and I was trying to avoid needing to introduce and deprecate a public API in the same release but I don't think there's would be too many places like this, so perhaps that's not too bad.

Comment on lines 210 to 216
public init(bundleIdentifier: String, path: String, fragment: String? = nil, sourceLanguage: SourceLanguage) {
self.init(bundleIdentifier: bundleIdentifier, path: path, fragment: fragment, sourceLanguages: [sourceLanguage])
public init(id: DocumentationBundle.Identifier, path: String, fragment: String? = nil, sourceLanguage: SourceLanguage) {
self.init(id: id, path: path, fragment: fragment, sourceLanguages: [sourceLanguage])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not possible to keep the function signature the same and only change the argument type? For this initialiser I find useful to explicitly convey that the identifier is the id of the bundle, not of the model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's possible to keep the parameter names the same (the "function signature" is the parameter and return types).

I do have (separate) plans to deprecate the "documentation bundle" term because it's confusing. It's often used to mean "documentation catalog" but they are not the same, neither in structure or purpose.

I remove the phase "bundle" in this new public API so that I wouldn't need to introduce and deprecate it in the same release but perhaps I'm getting ahead of myself and it would be better to make that change when we deprecate the term "bundle".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in f97e6d3

@@ -169,10 +169,10 @@ private final class FallbackResolverBasedLinkResolver {

private func resolve(_ unresolvedReference: UnresolvedTopicReference, in parent: ResolvedTopicReference, fromSymbolLink isCurrentlyResolvingSymbolLink: Bool, context: DocumentationContext) -> TopicReferenceResolutionResult? {
// Check if a fallback reference resolver should resolve this
let referenceBundleIdentifier = unresolvedReference.bundleIdentifier ?? parent.bundleIdentifier
let referenceID = unresolvedReference.id ?? parent.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes here. I find it clearer to keep it as referenceBundleIdentifier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 9478d47

Comment on lines +21 to +33
get {
var result = [BundleIdentifier: ExternalDocumentationSource]()
for (key, value) in configuration.externalDocumentationConfiguration.sources {
result[key.rawValue] = value
}
return result
}
set {
configuration.externalDocumentationConfiguration.sources.removeAll()
for (key, value) in newValue {
configuration.externalDocumentationConfiguration.sources[.init(rawValue: key)] = value
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code change needed as part of this PR? this seems to be handling separate logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the storage that this was accessing to changed from [String: ExternalDocumentationSource] to [DocumentationBundle.Identifier: ExternalDocumentationSource] it needs to map all the keys of the storage to and from keys any read or write access to this deprecated property.

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist d-ronnqvist merged commit 75344ae into swiftlang:main Nov 21, 2024
2 checks passed
@d-ronnqvist d-ronnqvist deleted the validate-bundle-identifier branch November 21, 2024 13:13
d-ronnqvist added a commit to d-ronnqvist/swift-docc that referenced this pull request Dec 6, 2024
…ng#1069)

* Add bundle identifier type that ensures a valid URL host component

rdar://135335645

* Use new Identifier type in `DocumentationBundle/Info`

* Update code to not use deprecated `Info/identifier` property

* Use new Identifier type in `DocumentationBundle`

* Update code to not use deprecated `identifier` property

* Use new Identifier type in `[Unresolved|Resolved]TopicReference`

* Update code to not use deprecated topic reference `bundleIdentifier` property

* Deprecate `BundleIdentifier` in favor of `DocumentationBundle/Identifier`

* Use new Identifier type in `BuildMetadata`

* Use new Identifier type in `AssetReference`

* Use new Identifier type in `ResourceReference`

* Use new Identifier type in `ConvertServiceFallbackResolver`

* Use new Identifier type in `SerializableLinkResolutionInformation`

* Use new Identifier type in `ConvertAction/Indexer`

* Prefer `bundleID` over `id` for types that scoped inside bundle

* Use `bundleID` instead of `id` for property names outside the bundle

* Use "bundle id" in local variable in fallback resolver
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants