[BEAM-13633] [Playground] Implement method to get a default example for each SDKs #16484
596a9b7 to 509b7ff
    "SDK_JAVA": "SDK_JAVA/MinimalWordCount",
    "SDK_GO": "SDK_GO/MinimalWordCount",
    "SDK_PYTHON": "SDK_PYTHON/WordCountWithMetrics"
}
Add an empty string.
Done
@@ -0,0 +1,5 @@
{
I am not sure that this file should be in the configs folder. @KhaninArtur @ilya-kozyrev what do you think?
I think it's a good place for this file. What place do you propose?
Right now the configs folder contains files that are used to configure each SDK (commands for the compile and run steps). This file, however, contains constants for each SDK, so maybe it would be better to define these values as constants in the code rather than create one more config file?
Hm, on the other hand, we can configure default examples from one place instead of looking for them in the code. What do you think about adding the default_example field to the corresponding .json config? E.g. add "default_example": "SDK_GO/MinimalWordCount", to the SDK_GO.json file.
Sounds good.
@pavel-avilov could you please change the configs based on the discussion above?
Done.
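The outcome of this thread, a default_example field in each SDK's JSON config, can be sketched as follows. This is a minimal illustration in Python rather than the backend's Go code: the helper name load_default_example and the throwaway config directory are hypothetical, and only the default_example key, the SDK_GO.json file name, and the SDK_GO/MinimalWordCount value come from the discussion above.

```python
import json
import os
import tempfile


def load_default_example(config_dir: str, sdk_name: str) -> str:
    """Return the default example path stored in <config_dir>/<sdk_name>.json.

    Hypothetical helper: the real backend is written in Go and may differ.
    """
    config_path = os.path.join(config_dir, f"{sdk_name}.json")
    with open(config_path, encoding="utf-8") as config_file:
        config = json.load(config_file)
    # A missing key is reported explicitly instead of returning an empty string.
    if "default_example" not in config:
        raise KeyError(f"default_example is not set in {config_path}")
    return config["default_example"]


# Build a throwaway config directory so the sketch runs on its own.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "SDK_GO.json"), "w", encoding="utf-8") as f:
    json.dump({"default_example": "SDK_GO/MinimalWordCount"}, f)

print(load_default_example(demo_dir, "SDK_GO"))
```

Keeping the value next to the other per-SDK settings means a default can be changed by editing one file, which is the argument made in the thread.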
@@ -173,6 +175,28 @@ func (cd *CloudStorage) GetPrecompiledObjects(ctx context.Context, targetSdk pb.
    return &precompiledObjects, nil
}

// GetDefaultPrecompileObject returns the default precompiled object for the sdk
func (cd *CloudStorage) GetDefaultPrecompileObject(ctx context.Context, targetSdk pb.Sdk, workingDir string) (*ObjectInfo, error) {
    defaultExampleToSdk, err := getDefaultExamplesFromJson(workingDir)
Add error handling.
Done.
response := pb.GetDefaultPrecompiledObjectResponse{PrecompiledObject: &pb.PrecompiledObject{
    CloudPath:       precompiledObject.CloudPath,
    Name:            precompiledObject.Name,
    Description:     precompiledObject.Description,
    Type:            precompiledObject.Type,
    PipelineOptions: precompiledObject.PipelineOptions,
}}
Shouldn't we return the code of the example as well? Or will the frontend request it after receiving this response?
Yes, this information will be enough for the frontend.
Could you please also add the link field according to this one
@@ -173,6 +175,30 @@ func (cd *CloudStorage) GetPrecompiledObjects(ctx context.Context, targetSdk pb.
    return &precompiledObjects, nil
}

// GetDefaultPrecompileObject returns the default precompiled object for the sdk
func (cd *CloudStorage) GetDefaultPrecompileObject(ctx context.Context, targetSdk pb.Sdk, workingDir string) (*ObjectInfo, error) {
-func (cd *CloudStorage) GetDefaultPrecompileObject(ctx context.Context, targetSdk pb.Sdk, workingDir string) (*ObjectInfo, error) {
+func (cd *CloudStorage) GetDefaultPrecompiledObject(ctx context.Context, targetSdk pb.Sdk, workingDir string) (*ObjectInfo, error) {
Done.
@@ -250,6 +250,6 @@ func Benchmark_GetPrecompiledObjectOutput(b *testing.B) {

func Benchmark_GetPrecompiledObject(b *testing.B) {
-func Benchmark_GetPrecompiledObject(b *testing.B) {
+func Benchmark_GetPrecompiledObjectCode(b *testing.B) {
Done
    wantErr bool
}{
    {
        name: "get object from json",
Please add a comment for this test case
Done
        wantErr: false,
    },
    {
        name: "error if wrong json path",
ditto
Done
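The two cases named in the test table above ("get object from json" and "error if wrong json path") could be exercised along these lines. This is a Python sketch, not the PR's Go test: read_default_example is a hypothetical stand-in for the function under test, and the file names are made up for the demo.

```python
import json
import os
import tempfile


def read_default_example(config_path: str) -> str:
    """Hypothetical function under test: read default_example from a JSON file."""
    with open(config_path, encoding="utf-8") as config_file:
        return json.load(config_file)["default_example"]


def run_cases() -> list:
    tmp_dir = tempfile.mkdtemp()
    good_path = os.path.join(tmp_dir, "SDK_JAVA.json")
    with open(good_path, "w", encoding="utf-8") as f:
        json.dump({"default_example": "SDK_JAVA/MinimalWordCount"}, f)

    cases = [
        # The file exists and holds the key: the read must succeed.
        {"name": "get object from json", "path": good_path, "want_err": False},
        # The path points at no file: the read must fail.
        {"name": "error if wrong json path",
         "path": os.path.join(tmp_dir, "missing.json"), "want_err": True},
    ]

    results = []
    for case in cases:
        try:
            read_default_example(case["path"])
            got_err = False
        except OSError:
            got_err = True
        results.append((case["name"], got_err == case["want_err"]))
    return results


for name, passed in run_cases():
    print(name, "ok" if passed else "FAILED")
```

A short comment on each case, as requested in the review, states what behaviour the case pins down.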
LGTM
func getDefaultExamplesPathFromJson(configPath string) (string, error) {
    file, err := ioutil.ReadFile(configPath)
    if err != nil {
        return "", err
    }
    defaultExamplePath := gjson.Get(string(file), defaultExampleKey).String()
    return defaultExamplePath, nil
}
I believe we already read such files with Unmarshal. Can we do the same here to avoid adding a new dependency, gjson?
Done.
# Conflicts:
#   playground/backend/internal/api/v1/api.pb.go
#   playground/frontend/lib/api/v1/api.pbjson.dart
Resolve conflicts;
LGTM
LGTM
…into BEAM-13633_default_ex_for_sdks
// - If SDK and category are unspecified in the request, gets the whole catalog from the cache
//   - If there is no catalog in the cache, gets the catalog from the Storage and saves it to the cache
// - If SDK or category is specified in the request, gets the specific catalog from the Storage
Why do we read from the cache only when the SDK and category are unspecified? Maybe it would be better to check the cache for every request and, if the cache is empty or doesn't contain the required catalog/SDK objects, get them from the bucket and save them to the cache as well.
What do you think?
@AydarZaynutdinov this comes from my PR. I'm working on it right now and will let Pavel know when it's done.
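The approach proposed in this thread (check the cache for every request, fall back to storage on a miss, then store the result) might look like the sketch below. It is written in Python for illustration only; CatalogCache, get_catalog and fake_storage are hypothetical names, and the thread does not show how the backend finally implemented this.

```python
from typing import Callable, Dict, List, Optional, Tuple

CatalogKey = Tuple[str, str]  # (sdk, category); empty strings mean "unspecified"


class CatalogCache:
    """In-memory stand-in for the cache service (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[CatalogKey, List[str]] = {}

    def get(self, key: CatalogKey) -> Optional[List[str]]:
        return self._entries.get(key)

    def set(self, key: CatalogKey, value: List[str]) -> None:
        self._entries[key] = value


def get_catalog(cache: CatalogCache,
                fetch_from_storage: Callable[[str, str], List[str]],
                sdk: str = "", category: str = "") -> List[str]:
    """Try the cache for every request; on a miss, read storage and cache the result."""
    key = (sdk, category)
    cached = cache.get(key)
    if cached is not None:
        return cached
    catalog = fetch_from_storage(sdk, category)
    cache.set(key, catalog)
    return catalog


storage_calls = []


def fake_storage(sdk: str, category: str) -> List[str]:
    # Records each call so the demo can show how often storage is hit.
    storage_calls.append((sdk, category))
    return [f"{sdk or 'ALL'}/MinimalWordCount"]


cache = CatalogCache()
first = get_catalog(cache, fake_storage, sdk="SDK_GO")
second = get_catalog(cache, fake_storage, sdk="SDK_GO")
print(first, second, len(storage_calls))
```

In this sketch the second request for the same SDK is served from the cache, so storage is read only once per key.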
// GetDefaultPrecompiledObject returns the default precompile object for sdk.
func (controller *playgroundController) GetDefaultPrecompiledObject(ctx context.Context, info *pb.GetDefaultPrecompiledObjectRequest) (*pb.GetDefaultPrecompiledObjectResponse, error) {
    switch info.Sdk {
    case pb.Sdk_SDK_UNSPECIFIED, pb.Sdk_SDK_SCIO:
I guess we can include SCIO as a supported SDK.
Done.
        return nil, errors.InvalidArgumentError("Error during preparing", "Sdk is not implemented yet: %s", info.Sdk.String())
    }

    defaultPrecompiledObjects, err := controller.cacheService.GetValue(ctx, uuid.Nil, cache.DefaultPrecompiledObjects)
Is it correct that we now keep precompiled objects in the cache as follows?
nil uuid:
    EXAMPLES_CATALOG: some precompiled objects grouped by SDK.
    DEFAULT_PRECOMPILED_OBJECTS: some precompiled objects grouped by SDK that also have default: true in the tag.
In that case, the cache contains some objects (WordCount, for example) under both EXAMPLES_CATALOG and DEFAULT_PRECOMPILED_OBJECTS.
Maybe we can keep all precompiled objects under the single subkey EXAMPLES_CATALOG and have the GetDefaultPrecompiledObject() method read all objects from the cache and filter them before sending them to the client.
Done.
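The suggestion above, one catalog in the cache with the default object selected by filtering, can be illustrated with a small sketch. It is Python rather than the backend's Go, and the PrecompiledObject class, find_default_object helper and sample entries (including SDK_GO/WordCount) are hypothetical; the commit list later in this PR also mentions further changes to how defaults are cached, so treat this only as an illustration of the reviewer's idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PrecompiledObject:
    cloud_path: str
    name: str
    default_example: bool = False


def find_default_object(catalog: Dict[str, List[PrecompiledObject]],
                        sdk: str) -> Optional[PrecompiledObject]:
    """Return the first object marked as default for the SDK, or None."""
    for obj in catalog.get(sdk, []):
        if obj.default_example:
            return obj
    return None


# One catalog entry in the cache, grouped by SDK; defaults are flagged in place.
examples_catalog = {
    "SDK_GO": [
        PrecompiledObject("SDK_GO/MinimalWordCount", "MinimalWordCount", True),
        PrecompiledObject("SDK_GO/WordCount", "WordCount"),
    ],
    "SDK_JAVA": [PrecompiledObject("SDK_JAVA/WordCount", "WordCount")],
}

print(find_default_object(examples_catalog, "SDK_GO"))
print(find_default_object(examples_catalog, "SDK_JAVA"))
```

Storing each object once avoids the duplication the reviewer points out, at the cost of a linear scan per lookup.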
        default:
            return nil, errors.New("incorrect value of sdk in the environment")
        }
    }
    if sdk == pb.Sdk_SDK_UNSPECIFIED {
        return nil, errors.New("env BEAM_SDK must be specified in the environment variables")
    return NewBeamEnvs(sdk, nil, preparedModDir, numOfParallelJobs), nil
It seems that if the SDK is unspecified, we already return nil, errors.New("incorrect value of SDK in the environment") on line 182, so the code on line 185 is not needed.
Maybe we need to remove the default case?
Done.
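The redundancy discussed here can be shown in isolation: once unknown values are rejected by a catch-all branch, a later "unspecified" check can never fire. The Python sketch below is only an analogy to the Go switch; the SUPPORTED_SDKS mapping and its keys are assumptions, not the values the backend actually accepts.

```python
# Hypothetical mapping from the BEAM_SDK environment value to an SDK name.
SUPPORTED_SDKS = {
    "java": "SDK_JAVA",
    "go": "SDK_GO",
    "python": "SDK_PYTHON",
    "scio": "SDK_SCIO",
}


def parse_sdk(env_value: str) -> str:
    """Map the environment value to an SDK name or raise ValueError."""
    sdk = SUPPORTED_SDKS.get(env_value.lower())
    if sdk is None:
        # Plays the role of the catch-all branch: empty and unknown
        # values both end here.
        raise ValueError("incorrect value of sdk in the environment")
    # A second "is the SDK unspecified?" check at this point would be dead code.
    return sdk


print(parse_sdk("go"))
```

Keeping only one of the two checks leaves a single error path for a missing or invalid value.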
        })
    }
    sdkCategory.Categories = append(sdkCategory.Categories, &category)
}

// GetPrecompiledObjectsCatalogFromCache returns the precompiled objects catalog from the cache
func GetPrecompiledObjectsCatalogFromCache(ctx context.Context, cacheService cache.Cache) ([]*pb.Categories, error) {
Let's rename it to GetCatalogFromCache
Done
    value, err := cacheService.GetValue(ctx, ExamplesDataPipelineId, cache.ExamplesCatalog)
    if err != nil {
        logger.Errorf("%s: cache.GetValue: %s\n", ExamplesDataPipelineId, err.Error())
        return nil, err
    }
    catalog, converted := value.([]*pb.Categories)
Did you test this part with Redis?
Yes of course
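The question presumably concerns the type assertion on the cached value: an in-process cache can hand back the stored object unchanged, whereas a remote cache such as Redis stores byte strings, so the value may need decoding before it can be used as a catalog. The thread does not show how the backend handles this; the Python sketch below only illustrates the general concern, and Category, restore_catalog and the "Quickstart" name are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Category:
    name: str


def restore_catalog(value) -> List[Category]:
    """Accept either live objects or their serialized JSON form."""
    if isinstance(value, (bytes, str)):
        # A remote cache returns serialized data, so rebuild the objects.
        return [Category(**item) for item in json.loads(value)]
    # An in-process cache hands back the objects that were stored.
    return value


catalog = [Category("Quickstart")]
from_local_cache = catalog
from_remote_cache = json.dumps([asdict(c) for c in catalog]).encode("utf-8")

print(restore_catalog(from_local_cache) == restore_catalog(from_remote_cache))
```

Testing against both cache kinds, as the reviewer asks, is what catches a conversion that only works for one of them.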
}

// GetPrecompiledObjectsCatalogFromStorage returns the precompiled objects catalog from the cloud storage
func GetPrecompiledObjectsCatalogFromStorage(ctx context.Context, sdk pb.Sdk, category string) ([]*pb.Categories, error) {
Let's rename it to GetCatalogFromStorage
This comes from [BEAM-13632]
@pavel-avilov please resolve conflicts and address comment
# Conflicts:
#   playground/api/v1/api.proto
#   playground/backend/cmd/server/controller.go
#   playground/backend/internal/api/v1/api.pb.go
#   playground/backend/internal/cloud_bucket/precompiled_objects.go
#   playground/backend/internal/utils/precompiled_objects_utils.go
#   playground/backend/internal/utils/precompiled_objects_utils_test.go
#   playground/frontend/lib/api/v1/api.pb.dart
#   playground/frontend/lib/api/v1/api.pbjson.dart
Change saving default precompiled objects to the cache
Change logic of saving and receiving info about default precompiled objects
Separate for each sdk
# Conflicts:
#   playground/api/v1/api.proto
#   playground/backend/internal/api/v1/api.pb.go
#   playground/backend/internal/cloud_bucket/precompiled_objects.go
#   playground/backend/internal/utils/precompiled_objects_utils_test.go
#   playground/frontend/lib/api/v1/api.pb.dart
#   playground/frontend/lib/api/v1/api.pbjson.dart
#   playground/infrastructure/api/v1/api_pb2.py
#   playground/infrastructure/helper.py
regenerate proto files
        source_file=os.path.join(Config.TEMP_FOLDER, cloud_path),
        destination_blob_name=cloud_path)

  def _write_default_examples_paths_to_local_fs(self, paths: {}) -> str:
Do you want to annotate the argument paths as a set? If so, it is better to do so explicitly like paths: set
Do we have unit tests for this code?
Agree, but paths: set is the wrong annotation for our Python version; Set from the typing package is the right way.
Done
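One detail worth spelling out for this thread: in Python, {} is the empty dict literal, not a set, so paths: {} annotates the parameter with a dict instance rather than a type. The code quoted elsewhere in this review indexes default_examples_paths by SDK name and passes paths to json.dumps, which suggests a mapping, so Dict[str, str] from typing may be the annotation that matches the usage. That reading is an inference from the snippets, not something the thread confirms, and serialize_default_paths below is a hypothetical helper.

```python
import json
from typing import Dict


def serialize_default_paths(paths: Dict[str, str]) -> str:
    """Hypothetical helper: encode an SDK-name -> example-path mapping as JSON."""
    return json.dumps(paths)


# {} creates an empty dict; an empty set has to be written set().
print(type({}).__name__, type(set()).__name__)
print(serialize_default_paths({"SDK_GO": "SDK_GO/MinimalWordCount"}))
```

A set would also fail at the json.dumps step, since sets are not JSON serializable.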

    local_path = os.path.join(path_to_file, Config.DEFAULT_PRECOMPILED_OBJECTS)
    content = json.dumps(paths)
    with open(local_path, "x", encoding="utf-8") as file:
Why "x"? Maybe "w"?
Done
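The difference between the two modes is easy to demonstrate: "x" creates the file and raises FileExistsError if it is already there, while "w" creates it or truncates an existing one. The sketch below uses a temporary directory and a made-up file name; write_exclusive and write_overwrite are hypothetical helpers.

```python
import os
import tempfile

target = os.path.join(tempfile.mkdtemp(), "default_examples.json")


def write_exclusive(path: str, content: str) -> bool:
    """Mode "x": create the file, report False if it already exists."""
    try:
        with open(path, "x", encoding="utf-8") as file:
            file.write(content)
        return True
    except FileExistsError:
        return False


def write_overwrite(path: str, content: str) -> None:
    """Mode "w": create the file or truncate an existing one."""
    with open(path, "w", encoding="utf-8") as file:
        file.write(content)


first_try = write_exclusive(target, "{}")
second_try = write_exclusive(target, '{"SDK_GO": "a"}')
write_overwrite(target, '{"SDK_GO": "b"}')
with open(target, encoding="utf-8") as file:
    final_content = file.read()
print(first_try, second_try, final_content)
```

With "w", rerunning the step against a directory that still holds an earlier file does not fail, which is presumably why the reviewer suggested it.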

      if example.tag.default_example:
        default_examples_paths[Sdk.Name(example.sdk)] = Path(
            [*file_names].pop()).parent.__str__()
Please do not use __str__ for casting; cast through str(object to cast).
Done
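Both spellings give the same string; the built-in is simply the idiomatic way to ask for it. A minimal sketch, using PurePosixPath so the separator is the same on every platform and a hypothetical example path:

```python
from pathlib import PurePosixPath

# Hypothetical path to an example file inside an SDK folder.
example_file = PurePosixPath("SDK_GO/MinimalWordCount/main.go")

# Works, but calls the special method directly.
via_dunder = example_file.parent.__str__()
# Idiomatic: the built-in str() dispatches to __str__.
via_builtin = str(example_file.parent)

print(via_builtin)
```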
Add test;
LGTM, just a small suggestion
LGTM
@pabloem everything is ready for review
@pabloem - What is the next step on this PR?
sorry about the delay. Taking a quick look...
indeed this LGTM! sorry, can you rebase please?
# Conflicts:
#   playground/backend/cmd/server/controller.go
#   playground/backend/internal/cloud_bucket/precompiled_objects.go
#   playground/backend/internal/cloud_bucket/precompiled_objects_test.go
@pabloem Done
…ent method to get a default example for each SDKs
* Implement method to get a default example for each SDKs
* Add error handling
* Added saving of precompiled objects catalog to cache at the server startup
* Added caching of the catalog only in case of unspecified SDK
* Update regarding comments
* Update regarding comments
* Simplified logging regarding comment
* Get defaultExamplePath from the corresponding config
* Refactoring code
* Add the `link` field to response
* Remove gjson; Resolve conflicts;
* Refactoring code
* Getting default precompiled object from cache
* Refactoring code
* Added saving of precompiled objects catalog to cache at the server startup
* Added caching of the catalog only in case of unspecified SDK
* Update regarding comments
* Update regarding comments
* Simplified logging regarding comment
* Updates regarding comments
* Update for environment_service_test.go
* Get default example from catalog
* GetCatalogFromCacheOrStorage method
* Update licenses
* Update licenses; Resolve conflicts;
* [BEAM-13633][Playground] Change saving default precompiled objects to the cache
* [BEAM-13633][Playground] Change logic of saving and receiving info about default precompiled objects
* [BEAM-13633][Playground] Separate for each sdk
* [BEAM-13633][Playground] regenerate proto files
* Add code of the default example to response
* Revert "Add code of the default example to response" This reverts commit da6baa0.
* Refactoring code
* Refactoring code; Add test;
* Edit commentaries
* Refactoring code
* Add bucket name to methods
Co-authored-by: Artur Khanin
Co-authored-by: AydarZaynutdinov
Co-authored-by: Pavel Avilov <pavel.avilov>
[BEAM-13633]