Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Expose StringBuilder.GetChunks() #29877

Merged
merged 6 commits into from
May 25, 2018

Conversation

vancem
Copy link
Contributor

@vancem vancem commented May 23, 2018

This allows efficient 'foreach' of the chunks in a StringBuilder.
Also added tests for the foreach.

This is the CoreFX part of of the CoreCLR PR: dotnet/coreclr#17530 which actually implemented the GetChunks() method. This change exposes it publicly and provides a test.

This addresses issue https://github.com/dotnet/corefx/issues/29770 where the API design was approved.

@danmosemsft @stephentoub

@@ -7621,6 +7621,16 @@ public sealed partial class StringBuilder : System.Runtime.Serialization.ISerial
void System.Runtime.Serialization.ISerializable.GetObjectData(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context) { }
public override string ToString() { throw null; }
public string ToString(int startIndex, int length) { throw null; }

// foreach over ReadOnlyMemory<char> support (IEnumerable-like interface)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can/should be deleted from the ref, as we're trying to get to a state where these files are easily machine re-generatable.

// Copy the string out (not using StringBuilder).
string outStr = "";
foreach (ReadOnlyMemory<char> chunk in inBuilder.EnumerateChunks())
outStr += new string(chunk.Span);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want any validation for the behavior when the StringBuilder is modified while its chunks are being enumerated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior is undefined when a StringBuilder is modified when enumeration is in progress. I will put a warning about this in EnumerateChunks().

I had considered actively throwing if this is detected, but this DOES slow the main Append() path down (not pay for play), and the value here is very low because this foreach API is a pretty rare use case for use by sophisticated users.

@@ -7621,6 +7621,13 @@ public sealed partial class StringBuilder : System.Runtime.Serialization.ISerial
void System.Runtime.Serialization.ISerializable.GetObjectData(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context) { }
public override string ToString() { throw null; }
public string ToString(int startIndex, int length) { throw null; }
public ChunkEnumerator EnumerateChunks() { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the API review I thought this was supposed to be called GetChunks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right.

@davidfowl
Copy link
Member

davidfowl commented May 24, 2018

I missed the review here but is there any benefit to exposing this as a ReadOnlySequence<char>?

@vancem
Copy link
Contributor Author

vancem commented May 24, 2018

I missed the review here but is there any benefit to exposing this as a ReadOnlySequece?

There is a logistic layering problem because StringBuilder is in system.private.corelib and ReadOnlySequence is not there.

It also seems reasonable that if anyone wants ReadOnlySequence, that they build it using GetChunks().

@vancem vancem changed the title Expose StringBuilder.EnumerateChunks() Expose StringBuilder.GetChunks() May 24, 2018
foreach (ReadOnlyMemory<char> chunk in inBuilder.EnumerateChunks())
outStr += new string(chunk.Span);

// The strings formed by concatinating the chunks should be the same as the value in the StringBuilder.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: concatenating

{
public ChunkEnumerator GetEnumerator() { throw null; }
public bool MoveNext() { throw null; }
public ReadOnlyMemory<char> Current { get { throw null; } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: reorder this to be in alphabetical order.

@weshaggard, we should also regenerate this ref to clean up some stuff.

For instance, the type isn't fully qualified here: https://github.com/dotnet/corefx/blob/master/src/System.Runtime/ref/System.Runtime.cs#L3743
And we are using abbreviated default syntax:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime/ref/System.Runtime.cs#L5088
Lastly, overridden methods don't tend to have method bodies:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime/ref/System.Runtime.cs#L5086

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vance you normalize this by running our GenerateReferenceSource target on the ref (https://github.com/dotnet/corefx/blob/master/dir.targets#L120). Just run msbuild /t:GenerateReferenceSource in the ref folder (note you will need to undo some changes to the file for the changes that are in the System.Runtime.manual.cs).

Copy link
Member

@ahsonkhan ahsonkhan May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will also remove the enumerator APIs that are getting added in this PR. I think we need to account for System.Private.CoreLib.dll as well (maybe we need to build corefx using the local build of coreclr by passing the path to msbuild /p:CoreCLROverridePath)?
https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/developer-guide.md#testing-with-private-coreclr-bits

Can you please share any additional steps that might be needed to normalize the ref?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new APIs are added manually and you just round-trip the system.runtime reference assembly they will not be lost and it will normalize the ref.

@vancem vancem force-pushed the StringBuilderEnumerateChunks branch 3 times, most recently from 98b5ad5 to 35f289c Compare May 25, 2018 16:30
@vancem vancem force-pushed the StringBuilderEnumerateChunks branch from 35f289c to 760f630 Compare May 25, 2018 16:36
Can be removed in a week after UWP is updated.
@vancem
Copy link
Contributor Author

vancem commented May 25, 2018

@dotnet-bot test Windows x86 Release Build

@vancem
Copy link
Contributor Author

vancem commented May 25, 2018

The Arm builds don't have machines to run them. This change has no architecture specific change and previous runs which are VERY similar have passed, so I think it is OK to check in. .

@vancem vancem merged commit a9ddb9b into dotnet:master May 25, 2018
@karelz karelz added this to the 3.0 milestone Jun 2, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Rename EumerateChunks->GetChunks
Add API baseline.

* Fix whitespace

* Fix spelling

* undo basline

* Put back lost rename

* Add baseline for new StringBuilder API GetChunks()

Can be removed in a week after UWP is updated.


Commit migrated from dotnet/corefx@a9ddb9b
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants