Scrape imslp api #24

Open: wants to merge 61 commits into base: main
61 commits
642758d
exploratory scraping
mbrandt00 Nov 15, 2024
1316b84
Add pytest, static test file
mbrandt00 Nov 16, 2024
2fce02f
Add parse_key_signature with test
mbrandt00 Nov 16, 2024
7233bf2
update ruff config
mbrandt00 Nov 16, 2024
ab954e7
Implement parse key signature into movements_parsing with test
mbrandt00 Nov 16, 2024
467a0a9
rename static piece
mbrandt00 Nov 16, 2024
ff66901
regex to split up sub piece number and title
mbrandt00 Nov 16, 2024
d6c5ab7
Add URL to movements list for highest download
mbrandt00 Nov 16, 2024
956ce2d
Add movement test, clean up parse_movement function
mbrandt00 Nov 17, 2024
cc84004
Convert utils to module with private helper methods
mbrandt00 Nov 17, 2024
ed1908f
WIP collection iteration with selenium
mbrandt00 Nov 17, 2024
d7401c7
lint and cleanup
mbrandt00 Nov 17, 2024
586ebd3
Add pagination on collections with test
mbrandt00 Nov 17, 2024
b937950
Simplify imports to use namespaced module
mbrandt00 Nov 17, 2024
30dbc7c
config changes
mbrandt00 Nov 17, 2024
963b2b3
Move common logic to helpers
mbrandt00 Nov 18, 2024
fa4b73f
WIP parse_metadata fn for imslp parsing
mbrandt00 Nov 18, 2024
240e42f
Parse opus type number
mbrandt00 Nov 19, 2024
3ee2984
create typed dictionary, parse more attributes
mbrandt00 Nov 19, 2024
75ca6c9
Add instrumentation, composition_year (int) composition_year (str), test
mbrandt00 Nov 19, 2024
ab58fe0
fix typo
mbrandt00 Nov 20, 2024
6002c1c
Add nickname and instrumentation parsing
mbrandt00 Nov 20, 2024
f495624
add instrumentation and piece style
mbrandt00 Nov 20, 2024
135101e
wip
mbrandt00 Nov 23, 2024
36a6a4d
Add catalogue_number_secondary
mbrandt00 Nov 23, 2024
19654bf
use data class in movements in initialization
mbrandt00 Nov 24, 2024
cd6bbb9
Fix lint warnings
mbrandt00 Nov 24, 2024
f7725b0
handle edge case where movement string has two sets of parens
mbrandt00 Nov 24, 2024
481dcde
parse piece movement count and type from top header
mbrandt00 Nov 24, 2024
aeba9e5
lint
mbrandt00 Nov 24, 2024
1917415
Add visualization tools, handle sub piece type parsing bug
mbrandt00 Nov 24, 2024
16df9c4
parse nicknames in movements, key signatures without parens
mbrandt00 Nov 25, 2024
72acb4d
Make parse_movements accept url instead of bs4 tag
mbrandt00 Nov 27, 2024
9c03e78
Remove html testing static files in favor of url
mbrandt00 Nov 27, 2024
5fb7cad
handle more edge cases in parse movement
mbrandt00 Nov 27, 2024
7040ae4
Handle more edge cases
mbrandt00 Nov 27, 2024
2c1a034
more edge cases
mbrandt00 Nov 30, 2024
823b9bb
WIP write to db
mbrandt00 Nov 30, 2024
6ac793b
More safeguards
mbrandt00 Nov 30, 2024
228b9bf
Make parsing catalogue type more robust, match sql enums
mbrandt00 Nov 30, 2024
adb7705
More normalization
mbrandt00 Nov 30, 2024
899598a
fix sub piece type parsing edge case
mbrandt00 Nov 30, 2024
b28f2b0
add unnaccent extension
mbrandt00 Nov 30, 2024
26638cd
more edge cases
mbrandt00 Nov 30, 2024
07912b9
Index nicknames when there are '/' correctly
mbrandt00 Nov 30, 2024
473ccf4
Add imslp schema, search on imslp pieces in graphql
mbrandt00 Dec 1, 2024
39c07d0
Add key signature and opus number/type to search
mbrandt00 Dec 1, 2024
00b22ca
Remove apple music dependency! Remove parsing functions, add imslp se…
mbrandt00 Dec 2, 2024
a1f5431
wip converting piece edit to use gql
mbrandt00 Dec 8, 2024
5f7d414
Add selection set initializers to codegen config for previews
mbrandt00 Dec 8, 2024
f4f8337
Add more fields to PieceDetails fragments, display in PieceEdit view,…
mbrandt00 Dec 8, 2024
f16aa1f
Handle more edge cases
mbrandt00 Dec 8, 2024
3ca5d65
Add constraint on imslp.pieces for uniqueness on imslp_url
mbrandt00 Dec 8, 2024
c3a36dc
remove tempo markings from movements name
mbrandt00 Dec 10, 2024
39c34bb
fix on conflict handler
mbrandt00 Dec 10, 2024
42072fa
Handle edge case
mbrandt00 Dec 12, 2024
72bad46
Move method for fetching from imslp to util script
mbrandt00 Dec 12, 2024
22198eb
Format files
mbrandt00 Dec 12, 2024
c894eff
Add functions to fetch and write to pg
mbrandt00 Dec 12, 2024
4b33954
Update catalogue type enums
mbrandt00 Dec 16, 2024
0e373b3
Update piece edit view
mbrandt00 Dec 16, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
.env
.idea
*.xcuserdatad
*.xcbkptlist
3 changes: 3 additions & 0 deletions ApolloGQL/Mocks/Movements+Mock.graphql.swift
@@ -12,6 +12,7 @@ public class Movements: MockObject {
public struct MockFields {
@Field<ApolloGQL.BigInt>("id") public var id
@Field<String>("name") public var name
@Field<String>("nickname") public var nickname
@Field<Int>("number") public var number
}
}
@@ -20,11 +21,13 @@ public extension Mock where O == Movements {
convenience init(
id: ApolloGQL.BigInt? = nil,
name: String? = nil,
nickname: String? = nil,
number: Int? = nil
) {
self.init()
_setScalar(id, for: \.id)
_setScalar(name, for: \.name)
_setScalar(nickname, for: \.nickname)
_setScalar(number, for: \.number)
}
}
24 changes: 24 additions & 0 deletions ApolloGQL/Mocks/Pieces+Mock.graphql.swift
@@ -13,9 +13,17 @@ public class Pieces: MockObject {
@Field<Int>("catalogueNumber") public var catalogueNumber
@Field<GraphQLEnum<ApolloGQL.CatalogueType>>("catalogueType") public var catalogueType
@Field<Composers>("composer") public var composer
@Field<ApolloGQL.BigInt>("composerId") public var composerId
@Field<Int>("compositionYear") public var compositionYear
@Field<GraphQLEnum<ApolloGQL.PieceFormat>>("format") public var format
@Field<ApolloGQL.BigInt>("id") public var id
@Field<String>("imslpUrl") public var imslpUrl
@Field<[String?]>("instrumentation") public var instrumentation
@Field<GraphQLEnum<ApolloGQL.KeySignatureType>>("keySignature") public var keySignature
@Field<MovementsConnection>("movements") public var movements
@Field<MovementsConnection>("movementsCollection") public var movementsCollection
@Field<String>("nickname") public var nickname
@Field<String>("wikipediaUrl") public var wikipediaUrl
@Field<String>("workName") public var workName
}
}
@@ -25,18 +33,34 @@ public extension Mock where O == Pieces {
catalogueNumber: Int? = nil,
catalogueType: GraphQLEnum<ApolloGQL.CatalogueType>? = nil,
composer: Mock<Composers>? = nil,
composerId: ApolloGQL.BigInt? = nil,
compositionYear: Int? = nil,
format: GraphQLEnum<ApolloGQL.PieceFormat>? = nil,
id: ApolloGQL.BigInt? = nil,
imslpUrl: String? = nil,
instrumentation: [String]? = nil,
keySignature: GraphQLEnum<ApolloGQL.KeySignatureType>? = nil,
movements: Mock<MovementsConnection>? = nil,
movementsCollection: Mock<MovementsConnection>? = nil,
nickname: String? = nil,
wikipediaUrl: String? = nil,
workName: String? = nil
) {
self.init()
_setScalar(catalogueNumber, for: \.catalogueNumber)
_setScalar(catalogueType, for: \.catalogueType)
_setEntity(composer, for: \.composer)
_setScalar(composerId, for: \.composerId)
_setScalar(compositionYear, for: \.compositionYear)
_setScalar(format, for: \.format)
_setScalar(id, for: \.id)
_setScalar(imslpUrl, for: \.imslpUrl)
_setScalarList(instrumentation, for: \.instrumentation)
_setScalar(keySignature, for: \.keySignature)
_setEntity(movements, for: \.movements)
_setEntity(movementsCollection, for: \.movementsCollection)
_setScalar(nickname, for: \.nickname)
_setScalar(wikipediaUrl, for: \.wikipediaUrl)
_setScalar(workName, for: \.workName)
}
}
3 changes: 3 additions & 0 deletions ApolloGQL/Mocks/Query+Mock.graphql.swift
@@ -12,6 +12,7 @@ public class Query: MockObject {
public struct MockFields {
@Field<PiecesConnection>("piecesCollection") public var piecesCollection
@Field<PracticeSessionsConnection>("practiceSessionsCollection") public var practiceSessionsCollection
@Field<PiecesConnection>("searchImslpPieces") public var searchImslpPieces
@Field<PiecesConnection>("searchPieceWithAssociations") public var searchPieceWithAssociations
}
}
@@ -20,11 +21,13 @@ public extension Mock where O == Query {
convenience init(
piecesCollection: Mock<PiecesConnection>? = nil,
practiceSessionsCollection: Mock<PracticeSessionsConnection>? = nil,
searchImslpPieces: Mock<PiecesConnection>? = nil,
searchPieceWithAssociations: Mock<PiecesConnection>? = nil
) {
self.init()
_setEntity(piecesCollection, for: \.piecesCollection)
_setEntity(practiceSessionsCollection, for: \.practiceSessionsCollection)
_setEntity(searchImslpPieces, for: \.searchImslpPieces)
_setEntity(searchPieceWithAssociations, for: \.searchPieceWithAssociations)
}
}
2 changes: 1 addition & 1 deletion ApolloGQL/Package.swift
@@ -1,4 +1,4 @@
-// swift-tools-version:5.7
+// swift-tools-version:5.9

import PackageDescription

114 changes: 112 additions & 2 deletions ApolloGQL/Sources/Fragments/PieceDetails.graphql.swift
@@ -5,7 +5,7 @@

public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {
public static var fragmentDefinition: StaticString {
-#"fragment PieceDetails on Pieces { __typename id workName catalogueType catalogueNumber nickname composer { __typename name } movements: movementsCollection(orderBy: [{ number: DescNullsLast }]) { __typename edges { __typename node { __typename id name number } } } }"#
+#"fragment PieceDetails on Pieces { __typename id workName catalogueType keySignature format instrumentation wikipediaUrl imslpUrl compositionYear catalogueNumber nickname composer { __typename name } movements: movementsCollection(orderBy: [{ number: AscNullsLast }]) { __typename edges { __typename node { __typename id name number } } } }"#
}

public let __data: DataDict
@@ -17,20 +17,70 @@ public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {
.field("id", ApolloGQL.BigInt.self),
.field("workName", String.self),
.field("catalogueType", GraphQLEnum<ApolloGQL.CatalogueType>?.self),
.field("keySignature", GraphQLEnum<ApolloGQL.KeySignatureType>?.self),
.field("format", GraphQLEnum<ApolloGQL.PieceFormat>?.self),
.field("instrumentation", [String?]?.self),
.field("wikipediaUrl", String?.self),
.field("imslpUrl", String?.self),
.field("compositionYear", Int?.self),
.field("catalogueNumber", Int?.self),
.field("nickname", String?.self),
.field("composer", Composer?.self),
-.field("movementsCollection", alias: "movements", Movements?.self, arguments: ["orderBy": [["number": "DescNullsLast"]]]),
+.field("movementsCollection", alias: "movements", Movements?.self, arguments: ["orderBy": [["number": "AscNullsLast"]]]),
] }

public var id: ApolloGQL.BigInt { __data["id"] }
public var workName: String { __data["workName"] }
public var catalogueType: GraphQLEnum<ApolloGQL.CatalogueType>? { __data["catalogueType"] }
public var keySignature: GraphQLEnum<ApolloGQL.KeySignatureType>? { __data["keySignature"] }
public var format: GraphQLEnum<ApolloGQL.PieceFormat>? { __data["format"] }
public var instrumentation: [String?]? { __data["instrumentation"] }
public var wikipediaUrl: String? { __data["wikipediaUrl"] }
public var imslpUrl: String? { __data["imslpUrl"] }
public var compositionYear: Int? { __data["compositionYear"] }
public var catalogueNumber: Int? { __data["catalogueNumber"] }
public var nickname: String? { __data["nickname"] }
public var composer: Composer? { __data["composer"] }
public var movements: Movements? { __data["movements"] }

public init(
id: ApolloGQL.BigInt,
workName: String,
catalogueType: GraphQLEnum<ApolloGQL.CatalogueType>? = nil,
keySignature: GraphQLEnum<ApolloGQL.KeySignatureType>? = nil,
format: GraphQLEnum<ApolloGQL.PieceFormat>? = nil,
instrumentation: [String?]? = nil,
wikipediaUrl: String? = nil,
imslpUrl: String? = nil,
compositionYear: Int? = nil,
catalogueNumber: Int? = nil,
nickname: String? = nil,
composer: Composer? = nil,
movements: Movements? = nil
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.Pieces.typename,
"id": id,
"workName": workName,
"catalogueType": catalogueType,
"keySignature": keySignature,
"format": format,
"instrumentation": instrumentation,
"wikipediaUrl": wikipediaUrl,
"imslpUrl": imslpUrl,
"compositionYear": compositionYear,
"catalogueNumber": catalogueNumber,
"nickname": nickname,
"composer": composer._fieldData,
"movements": movements._fieldData,
],
fulfilledFragments: [
ObjectIdentifier(PieceDetails.self)
]
))
}

/// Composer
///
/// Parent Type: `Composers`
@@ -45,6 +95,20 @@ public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {
] }

public var name: String { __data["name"] }

public init(
name: String
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.Composers.typename,
"name": name,
],
fulfilledFragments: [
ObjectIdentifier(PieceDetails.Composer.self)
]
))
}
}

/// Movements
@@ -62,6 +126,20 @@ public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {

public var edges: [Edge] { __data["edges"] }

public init(
edges: [Edge]
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.MovementsConnection.typename,
"edges": edges._fieldData,
],
fulfilledFragments: [
ObjectIdentifier(PieceDetails.Movements.self)
]
))
}

/// Movements.Edge
///
/// Parent Type: `MovementsEdge`
@@ -77,6 +155,20 @@ public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {

public var node: Node { __data["node"] }

public init(
node: Node
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.MovementsEdge.typename,
"node": node._fieldData,
],
fulfilledFragments: [
ObjectIdentifier(PieceDetails.Movements.Edge.self)
]
))
}

/// Movements.Edge.Node
///
/// Parent Type: `Movements`
@@ -95,6 +187,24 @@ public struct PieceDetails: ApolloGQL.SelectionSet, Fragment {
public var id: ApolloGQL.BigInt { __data["id"] }
public var name: String? { __data["name"] }
public var number: Int? { __data["number"] }

public init(
id: ApolloGQL.BigInt,
name: String? = nil,
number: Int? = nil
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.Movements.typename,
"id": id,
"name": name,
"number": number,
],
fulfilledFragments: [
ObjectIdentifier(PieceDetails.Movements.Edge.Node.self)
]
))
}
}
}
}
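
Note on the ordering change: the `movements` alias in the `PieceDetails` fragment now orders by `number` with `AscNullsLast` instead of `DescNullsLast`, so movements should come back in playing order with unnumbered rows at the end. The sorting itself happens on the server; the sketch below only illustrates the expected ordering client-side. `MovementRow` and `sortedAscNullsLast` are hypothetical names used for illustration and are not part of the generated Apollo code.

```swift
// Hypothetical stand-in for a movement row; NOT the generated Apollo type.
struct MovementRow {
    let name: String
    let number: Int?
}

/// Returns the rows sorted ascending by `number`, with nil numbers last.
/// This mirrors what `AscNullsLast` is expected to produce on the server.
func sortedAscNullsLast(_ rows: [MovementRow]) -> [MovementRow] {
    return rows.sorted { lhs, rhs in
        switch (lhs.number, rhs.number) {
        case let (l?, r?):
            return l < r      // both numbered: plain ascending order
        case (_?, nil):
            return true       // a numbered row sorts before an unnumbered one
        default:
            return false      // nil vs. numbered, or nil vs. nil
        }
    }
}

let rows = [
    MovementRow(name: "Finale", number: 3),
    MovementRow(name: "Untitled", number: nil),
    MovementRow(name: "Allegro", number: 1),
    MovementRow(name: "Adagio", number: 2),
]

let ordered = sortedAscNullsLast(rows).map { $0.name }
print(ordered)
```

With the previous `DescNullsLast` ordering, the numbered rows in this sample would appear as 3, 2, 1, which is the reverse of how a piece is usually listed.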
85 changes: 85 additions & 0 deletions ApolloGQL/Sources/Fragments/PracticeSessionDetails.graphql.swift
@@ -27,6 +27,28 @@ public struct PracticeSessionDetails: ApolloGQL.SelectionSet, Fragment {
public var movement: Movement? { __data["movement"] }
public var piece: Piece { __data["piece"] }

public init(
id: ApolloGQL.BigInt,
startTime: ApolloGQL.Datetime,
endTime: ApolloGQL.Datetime? = nil,
movement: Movement? = nil,
piece: Piece
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.PracticeSessions.typename,
"id": id,
"startTime": startTime,
"endTime": endTime,
"movement": movement._fieldData,
"piece": piece._fieldData,
],
fulfilledFragments: [
ObjectIdentifier(PracticeSessionDetails.self)
]
))
}

/// Movement
///
/// Parent Type: `Movements`
@@ -45,6 +67,24 @@ public struct PracticeSessionDetails: ApolloGQL.SelectionSet, Fragment {
public var id: ApolloGQL.BigInt { __data["id"] }
public var name: String? { __data["name"] }
public var number: Int? { __data["number"] }

public init(
id: ApolloGQL.BigInt,
name: String? = nil,
number: Int? = nil
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.Movements.typename,
"id": id,
"name": name,
"number": number,
],
fulfilledFragments: [
ObjectIdentifier(PracticeSessionDetails.Movement.self)
]
))
}
}

/// Piece
@@ -63,6 +103,12 @@ public struct PracticeSessionDetails: ApolloGQL.SelectionSet, Fragment {
public var id: ApolloGQL.BigInt { __data["id"] }
public var workName: String { __data["workName"] }
public var catalogueType: GraphQLEnum<ApolloGQL.CatalogueType>? { __data["catalogueType"] }
public var keySignature: GraphQLEnum<ApolloGQL.KeySignatureType>? { __data["keySignature"] }
public var format: GraphQLEnum<ApolloGQL.PieceFormat>? { __data["format"] }
public var instrumentation: [String?]? { __data["instrumentation"] }
public var wikipediaUrl: String? { __data["wikipediaUrl"] }
public var imslpUrl: String? { __data["imslpUrl"] }
public var compositionYear: Int? { __data["compositionYear"] }
public var catalogueNumber: Int? { __data["catalogueNumber"] }
public var nickname: String? { __data["nickname"] }
public var composer: Composer? { __data["composer"] }
@@ -75,6 +121,45 @@ public struct PracticeSessionDetails: ApolloGQL.SelectionSet, Fragment {
public var pieceDetails: PieceDetails { _toFragment() }
}

public init(
id: ApolloGQL.BigInt,
workName: String,
catalogueType: GraphQLEnum<ApolloGQL.CatalogueType>? = nil,
keySignature: GraphQLEnum<ApolloGQL.KeySignatureType>? = nil,
format: GraphQLEnum<ApolloGQL.PieceFormat>? = nil,
instrumentation: [String?]? = nil,
wikipediaUrl: String? = nil,
imslpUrl: String? = nil,
compositionYear: Int? = nil,
catalogueNumber: Int? = nil,
nickname: String? = nil,
composer: Composer? = nil,
movements: Movements? = nil
) {
self.init(_dataDict: DataDict(
data: [
"__typename": ApolloGQL.Objects.Pieces.typename,
"id": id,
"workName": workName,
"catalogueType": catalogueType,
"keySignature": keySignature,
"format": format,
"instrumentation": instrumentation,
"wikipediaUrl": wikipediaUrl,
"imslpUrl": imslpUrl,
"compositionYear": compositionYear,
"catalogueNumber": catalogueNumber,
"nickname": nickname,
"composer": composer._fieldData,
"movements": movements._fieldData,
],
fulfilledFragments: [
ObjectIdentifier(PracticeSessionDetails.Piece.self),
ObjectIdentifier(PieceDetails.self)
]
))
}

public typealias Composer = PieceDetails.Composer

public typealias Movements = PieceDetails.Movements