Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Change Feed #11692

Conversation

seanmcc-msft
Copy link
Member

No description provided.

@seanmcc-msft
Copy link
Member Author

TODO - Add more unit tests.

/// <summary>
/// Constructor for testing. Do not use.
/// </summary>
internal Shard(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test-only constructors weird me out, but because these object are stateful, it's challenging to get Shard, Chunk, Segment, and Change Feed into the correct starting state for the smaller unit tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather mock BlobContainerClient. Over time this might get quite out of sync from "prod" code leading to situation where we're not testing real scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

/// <summary>
/// BlobChangeFeedEvent.
/// </summary>
public class BlobChangeFeedEvent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call it BlobChangeEvent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should consider leaving this as BlobChangeFeedEvent because we've got the same schema for changes routed through EventGrid and we don't have a Track 2 API or name for server-side logging event info yet. BlobChangeFeedEvent is a little verbose, but I don't want to take a good name that we might want for the main Blobs package someday.

@seanmcc-msft
Copy link
Member Author

Here's the API view (slightly out of date) - https://apiview.dev/Assemblies/Review/85046278a8764a02a11d65ab219b91ae

@tg-msft has recommendations for the public model names.

/// the shards indexes: 0 1 2 0 1.
/// </summary>
[Test]
public async Task GetPage()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gapra-msft, here's my Segment round robin test.

@seanmcc-msft
Copy link
Member Author

@kasobol-msft, @tg-msft, @gapra-msft, could you take a look?

Copy link
Contributor

@kasobol-msft kasobol-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good.

  • few comments left
  • quite a lot of TODOs seem to be unresolved
  • Sample tests / readmes are empty.
  • quite a lot of ignored tests.


private async Task Initalize(bool async)
{
// Check if Change Feed has been abled for this account.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

if (!changeFeedContainerExists)
{
//TODO improve this error message
throw new ArgumentException("Change Feed hasn't been enabled on this account, or is current being enabled.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current->currently

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

jsonMetaSegment = JsonDocument.Parse(blobDownloadInfo.Content);
}

//TODO what happens when _lastConsumable advances an hour?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just default end time to "now" when change feed is requested ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just default to the last hour present in the Change Feed.

Comment on lines 201 to 212
//TODO what should we return here? Also do we really need to check this on every page?
if (_currentSegment.DateTime > _endTime)
{
return new BlobChangeFeedEventPage();
}

//TODO what should we return here? Also do we really need to check this on every page?
if (_currentSegment.DateTime > _lastConsumable)
{
return new BlobChangeFeedEventPage();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be end of iterable?

ContinuationToken = continuationToken;
}

// public BlobChangeFeedEventPage(Response raw, List<GenericRecord> data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented out code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

//TODO figure out if this is right.
public bool HasNext()
{
if (!_isInitalized)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this trigger initialize instead? or throw? Good test for this assumption might be - fetch change log for account with zero activity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we that have factories, _isInitalized isn't a thing anymore in ChangeFeed, Segment, or Shard.

/// <summary>
/// Constructor for testing. Do not use.
/// </summary>
internal Shard(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather mock BlobContainerClient. Over time this might get quite out of sync from "prod" code leading to situation where we're not testing real scenarios.

@seanmcc-msft
Copy link
Member Author

seanmcc-msft commented May 5, 2020

I'd rather mock BlobContainerClient. Over time this might get quite out of sync from "prod" code leading to situation where we're not testing real scenarios.

I am mocking the BlobContainerClient, I have that constructor to avoid super complicate test setups to mock the functionality of Shard.Initalize().

@kasobol-msft
Copy link
Contributor

I'd rather mock BlobContainerClient. Over time this might get quite out of sync from "prod" code leading to situation where we're not testing real scenarios.

I am mocking the BlobContainerClient, I have that constructor to avoid super complicate test setups to mock the functionality of Shard.Initalize().

Maybe Initialize should be converted into static factory method that returns fully initialized Shard then ?

https://www.reddit.com/r/learnprogramming/comments/a8j4cg/requiring_a_second_initialization_step_is_an/

@seanmcc-msft
Copy link
Member Author

I'd rather mock BlobContainerClient. Over time this might get quite out of sync from "prod" code leading to situation where we're not testing real scenarios.

I am mocking the BlobContainerClient, I have that constructor to avoid super complicate test setups to mock the functionality of Shard.Initalize().

Maybe Initialize should be converted into static factory method that returns fully initialized Shard then ?

https://www.reddit.com/r/learnprogramming/comments/a8j4cg/requiring_a_second_initialization_step_is_an/

Would there be any concurrency or perf issues if we switched to static factory methods? Several Shards could be being initialized at the same time.

Just for background, the Initialize() methods are separate from the constructors because I don't think we can make async calls in the constructors.

@seanmcc-msft
Copy link
Member Author

@ljian3377, please let me know if you find anything else.

@seanmcc-msft
Copy link
Member Author

I'm going to merge this, we can continue iterating in the 73 base branch.

@seanmcc-msft seanmcc-msft merged commit 029bf31 into Azure:feature/storage/stg73base Jun 3, 2020
@seanmcc-msft seanmcc-msft deleted the feature/storage/changeFeed branch June 3, 2020 23:24
cursor = JsonSerializer.Deserialize<ChangeFeedCursor>(continuation);
ValidateCursor(_containerClient, cursor);
startTime = cursor.CurrentSegmentCursor.SegmentTime;
endTime = cursor.EndTime;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better if endTime = minDate(cursor.EndTime, endTime); to allow user specify a new endTime?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I don't provide an overload for a user to provide both a cursor and end date. I think it's conceptually simpler to have separate overloads for cursor and start/end time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants