From f14919edc781a6cfd7638dc882223a598fb76044 Mon Sep 17 00:00:00 2001
From: KOTP
Date: Fri, 15 Mar 2024 23:46:40 -0400
Subject: [PATCH] Clarify relationship of Strings and Runes

As discussed in [the forums], this patch makes the changes desired to clarify how Strings are related to Runes and hopefully clears up some confusing and potentially misleading statements.

Ref: #2768

[the forums]: https://forum.exercism.org/t/potential-misleading-information-on-the-golang-runes-chapter/10082/1
---
 concepts/runes/about.md        | 9 +++++----
 concepts/runes/introduction.md | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/concepts/runes/about.md b/concepts/runes/about.md
index 07f5844cb..c0d845d5c 100644
--- a/concepts/runes/about.md
+++ b/concepts/runes/about.md
@@ -83,12 +83,13 @@ fmt.Printf("myRune Unicode character: %c\n", myRune)
 ## Runes and Strings
 
 Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
-Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
-However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
-Due to this, strings are really just a sequence of bytes.
-In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
+Characters in strings are stored and encoded as 1, 2, 3, or 4 bytes depending on the Unicode character they represent.
+
+In Go, slices are used to represent sequences and these slices can be iterated over using range.
+When we iterate over a string, Go converts the string into a series of Runes, each of which is 4 bytes (remember, the rune type is an alias for an `int32`!)
 
 Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.
+
 In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:
 
 ```go
diff --git a/concepts/runes/introduction.md b/concepts/runes/introduction.md
index 57dd0781e..67e2c9aa1 100644
--- a/concepts/runes/introduction.md
+++ b/concepts/runes/introduction.md
@@ -74,12 +74,13 @@ fmt.Printf("myRune Unicode code point: %U\n", myRune)
 ## Runes and Strings
 
 Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
-Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
-However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
-Due to this, strings are really just a sequence of bytes.
-In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
+Characters in strings are stored and encoded as 1, 2, 3, or 4 bytes depending on the Unicode character they represent.
+
+In Go, slices are used to represent sequences and these slices can be iterated over using range.
+When we iterate over a string, Go converts the string into a series of Runes, each of which is 4 bytes (remember, the rune type is an alias for an `int32`!)
 
 Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.
+
 In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:
 
 ```go