Generic CSV views #76

Merged
merged 26 commits into from
Jul 1, 2022
Commits
eb0aab0
make CSV generic over View
DivineDominion Apr 26, 2019
e2836c7
clean up public interface, adding documentation
DivineDominion Oct 25, 2016
8a40f76
Merge origin/master into generic-views
DivineDominion Jun 25, 2019
9758451
Merge master into generic-views
DivineDominion Aug 21, 2019
c5738a9
Merge master into generic-views
DivineDominion Aug 21, 2019
61d9e59
Merge v0.5.6 into generic-views
DivineDominion Jun 4, 2020
05b1e16
add more newline tests
DivineDominion Jun 4, 2020
8c65a81
fix typo
DivineDominion Jun 4, 2020
a8af063
fix missing cells at end of enumerated rows
DivineDominion Jun 4, 2020
95a52c2
Merge v0.6.1 into generic-views
DivineDominion Nov 22, 2021
ff8a4bd
drop "View" from NamedView/EnumeratedView
DivineDominion Nov 22, 2021
1650273
add tearDown
DivineDominion Nov 22, 2021
f52ef2c
remove deprecated initializer
DivineDominion Nov 22, 2021
d1343d4
rename parameter to "named" to fit NSImage.init
DivineDominion Nov 22, 2021
8c9d66e
respect row limit during parsing
DivineDominion Nov 22, 2021
d782490
add Podspec and Package.swift to Xcode project
DivineDominion Nov 22, 2021
0f8cdf1
rename test file
DivineDominion Nov 22, 2021
4dc2e56
rename test files to match implementation
DivineDominion Jul 1, 2022
7a83fb2
Merge v0.7.0 into generic-views
DivineDominion Jul 1, 2022
2d73697
Make tests 'throwing' so they don't cause an exception
DivineDominion Jul 1, 2022
b641951
fix test
DivineDominion Jul 1, 2022
ef13878
rename "Delimiter" to "CSVDelimiter" (cannot nest in generic CSV<T>)
DivineDominion Jul 1, 2022
79a6225
update README for merged code
DivineDominion Jul 1, 2022
3d56b8a
make column loading optional to reflect the saving from README
DivineDominion Jul 1, 2022
a84ef0d
delete unused file artifacts after merging
DivineDominion Jul 1, 2022
435e558
add semicolon delimiter guessing case reported as error
DivineDominion Jul 1, 2022
69 changes: 42 additions & 27 deletions README.md
@@ -21,21 +21,21 @@ import SwiftCSV

do {
// As a string, guessing the delimiter
let csv: CSV = try CSV(string: "id,name,age\n1,Alice,18")
let csv: CSV = try CSV<Named>(string: "id,name,age\n1,Alice,18")

// Specifying a custom delimiter
let tsv: CSV = try CSV(string: "id\tname\tage\n1\tAlice\t18", delimiter: "\t")
let tsv: CSV = try CSV<Enumerated>(string: "id\tname\tage\n1\tAlice\t18", delimiter: .tab)

// From a file (propagating error during file loading)
let csvFile: CSV = try CSV(url: URL(fileURLWithPath: "path/to/users.csv"))
let csvFile: CSV = try CSV<Named>(url: URL(fileURLWithPath: "path/to/users.csv"))

// From a file inside the app bundle, with a custom delimiter, errors, and custom encoding.
// Note the result is an optional.
let resource: CSV? = try CSV(
let resource: CSV? = try CSV<Named>(
name: "users",
extension: "tsv",
bundle: .main,
delimiter: "\t",
delimiter: .character("🐠"), // Any character works!
encoding: .utf8)
} catch let parseError as CSVParseError {
// Catch errors from parsing invalid CSV
@@ -52,22 +52,24 @@ The `CSV` class comes with initializers that are suited for loading files from U
extension CSV {
/// Load a CSV file from `url`.
///
/// - parameter url: URL of the file (will be passed to `String(contentsOfURL:encoding:)` to load)
/// - parameter delimiter: Character used to separate cells from one another in rows.
/// - parameter encoding: Character encoding to read file (default is `.utf8`)
/// - parameter loadColumns: Whether to populate the columns dictionary (default is `true`)
/// - throws: `CSVParseError` when parsing the contents of `url` fails, or file loading errors.
/// - Parameters:
/// - url: URL of the file (will be passed to `String(contentsOfURL:encoding:)` to load)
/// - delimiter: Character used to separate cells from one another in rows.
/// - encoding: Character encoding to read file (default is `.utf8`)
/// - loadColumns: Whether to populate the columns dictionary (default is `true`)
/// - Throws: `CSVParseError` when parsing the contents of `url` fails, or file loading errors.
public convenience init(url: URL,
delimiter: Delimiter,
delimiter: CSVDelimiter,
encoding: String.Encoding = .utf8,
loadColumns: Bool = true) throws

/// Load a CSV file from `url` and guess its delimiter from `CSV.recognizedDelimiters`, falling back to `.comma`.
///
/// - parameter url: URL of the file (will be passed to `String(contentsOfURL:encoding:)` to load)
/// - parameter encoding: Character encoding to read file (default is `.utf8`)
/// - parameter loadColumns: Whether to populate the columns dictionary (default is `true`)
/// - throws: `CSVParseError` when parsing the contents of `url` fails, or file loading errors.
/// - Parameters:
/// - url: URL of the file (will be passed to `String(contentsOfURL:encoding:)` to load)
/// - encoding: Character encoding to read file (default is `.utf8`)
/// - loadColumns: Whether to populate the columns dictionary (default is `true`)
/// - Throws: `CSVParseError` when parsing the contents of `url` fails, or file loading errors.
public convenience init(url: URL,
encoding: String.Encoding = .utf8,
loadColumns: Bool = true) throws
@@ -76,7 +78,7 @@ extension CSV {

### Delimiters

Delimiters are strongly typed. The recognized `CSV.Delimiter` cases are: `.comma`, `.semicolon`, and `.tab`.
Delimiters are strongly typed. The recognized `CSVDelimiter` cases are: `.comma`, `.semicolon`, and `.tab`.

You can use convenience initializers that guess the delimiter from the recognized list for you. These initializers are available for loading CSV from URLs and strings.
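
As a rough illustration of what "guessing from the recognized list" could look like, here is a minimal sketch. It is not SwiftCSV's actual implementation: `SketchDelimiter` and `guessDelimiter(in:)` are hypothetical names, and the heuristic (count each recognized delimiter in the first line, fall back to `.comma`) is an assumption made for this example.

```swift
// Hypothetical stand-in for the library's delimiter type; not SwiftCSV's actual code.
enum SketchDelimiter: Equatable {
    case comma, semicolon, tab
    case character(Character)

    var rawValue: Character {
        switch self {
        case .comma: return ","
        case .semicolon: return ";"
        case .tab: return "\t"
        case .character(let c): return c
        }
    }

    // Only these cases take part in guessing.
    static let recognized: [SketchDelimiter] = [.comma, .semicolon, .tab]
}

/// Guess the delimiter by counting candidates in the first line only.
/// Assumed heuristic: the most frequent recognized delimiter wins; `.comma` is the fallback.
func guessDelimiter(in text: String) -> SketchDelimiter {
    let firstLine = text.split(whereSeparator: { $0.isNewline }).first ?? ""
    var best: SketchDelimiter = .comma
    var bestCount = 0
    for candidate in SketchDelimiter.recognized {
        let count = firstLine.filter { $0 == candidate.rawValue }.count
        if count > bestCount {
            best = candidate
            bestCount = count
        }
    }
    return best
}
```

A custom `.character(...)` delimiter is never guessed in this sketch; it would always have to be passed explicitly.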

@@ -86,16 +88,16 @@ You can also use any other single-character delimiter when loading CSV data. A c

```swift
// Recognized the comma delimiter automatically:
let csv = CSV(string: "id,name,age\n1,Alice,18\n2,Bob,19")
let csv = try CSV<Named>(string: "id,name,age\n1,Alice,18\n2,Bob,19")
csv.header //=> ["id", "name", "age"]
csv.namedRows //=> [["id": "1", "name": "Alice", "age": "18"], ["id": "2", "name": "Bob", "age": "19"]]
csv.namedColumns //=> ["id": ["1", "2"], "name": ["Alice", "Bob"], "age": ["18", "19"]]
csv.rows //=> [["id": "1", "name": "Alice", "age": "18"], ["id": "2", "name": "Bob", "age": "19"]]
csv.columns //=> ["id": ["1", "2"], "name": ["Alice", "Bob"], "age": ["18", "19"]]
```

The rows can also be parsed and passed to a block on the fly, reducing the memory needed to store the whole lot in an array:

```swift
// Access each row as an array (array not guaranteed to be equal length to the header)
// Access each row as an array (inner array not guaranteed to always be equal length to the header)
csv.enumerateAsArray { array in
print(array.first)
}
@@ -107,17 +109,30 @@ csv.enumerateAsDict { dict in

### Skip Named Column Access for Large Data Sets

By default, the variants of `CSV.init` will populate its `namedColumns` and `enumeratedColumns` to provide access to the CSV data on a column-by-column basis. Think of this like a cross section:
Use `CSV<Named>` aka `NamedCSV` to access the CSV data on a column-by-column basis with named columns. Think of this like a cross section:

```swift
let csv = CSV(string: "id,name,age\n1,Alice,18\n2,Bob,19")
csv.namedRows[0]["name"] //=> "Alice"
csv.namedColumns["name"] //=> ["Alice", "Bob"]
let csv = try NamedCSV(string: "id,name,age\n1,Alice,18\n2,Bob,19")
csv.rows[0]["name"] //=> "Alice"
csv.columns["name"] //=> ["Alice", "Bob"]
```

If you only want to access your data row-by-row, and not by-column, then you can set the `loadColumns` argument in any initializer to `false`. This will prevent the columnar data from being populated.
If you only want to access your data row-by-row, and not by-column, then you can use `CSV<Enumerated>` or `EnumeratedCSV`:

Skipping this step can increase performance for lots of data.
```swift
let csv = try EnumeratedCSV(string: "id,name,age\n1,Alice,18\n2,Bob,19")
csv.rows[0][1] //=> "Alice"
csv.columns?[1].header //=> "name"
csv.columns?[1].rows //=> ["Alice", "Bob"]
```

To speed things up, skip populating by-column access completely by passing `loadColumns: false`. For large data sets, this saves a lot of iterations (at quadratic runtime).

```swift
let csv = try EnumeratedCSV(string: "id,name,age\n1,Alice,18\n2,Bob,19", loadColumns: false)
csv.rows[0][1] //=> "Alice"
csv.columns //=> nil
```
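
To make the cost of columnar access more concrete, here is a minimal sketch of turning row-major data into columns. It is an illustration only, not the library's code: `SketchColumn` and `makeColumns(header:rows:loadColumns:)` are hypothetical names, and padding short rows with an empty string is an assumption of this example.

```swift
// Hypothetical column type for illustration; the field names mirror the README example above.
struct SketchColumn: Equatable {
    let header: String
    let rows: [String]
}

/// Transpose row-major data into columns, or skip the work entirely when `loadColumns` is false.
func makeColumns(header: [String], rows: [[String]], loadColumns: Bool) -> [SketchColumn]? {
    // Skipping the transposition is the whole point of `loadColumns: false`.
    guard loadColumns else { return nil }
    return header.enumerated().map { index, name in
        // Every row is visited once per column; short rows are padded with "" (an assumption).
        SketchColumn(header: name,
                     rows: rows.map { index < $0.count ? $0[index] : "" })
    }
}
```

In this sketch the work grows with the number of rows times the number of columns, which is why skipping it can pay off when only row-by-row access is needed.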


## Installation
@@ -137,5 +152,5 @@ github "swiftcsv/SwiftCSV"
### SwiftPM

```
.package(url: "https://github.com/swiftcsv/SwiftCSV.git", from: "0.6.1")
.package(url: "https://github.com/swiftcsv/SwiftCSV.git", from: "0.8.0")
```