-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(era_manager): add era support + pre-fetching #1375
feat(era_manager): add era support + pre-fetching #1375
Conversation
06190da
to
6c543f2
Compare
e2store/src/utils.rs
Outdated
const ERA_DIR_URL: &str = "https://mainnet.era.nimbus.team/"; | ||
|
||
/// Fetches era file download links | ||
pub async fn get_era_file_download_links(http_client: &Client) -> anyhow::Result<Vec<String>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Are we sure there is no api that we can use that returns better structured content (e.g. json), instead of parsing HTML ourself?
nit: Maybe return HashMap<u64, String>
, where key is the epoch number (keep the logic of processing this file in one place)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Are we sure there is no api that we can use that returns better structured content (e.g. json), instead of parsing HTML ourself?
Not that I am aware of
nit: Maybe return HashMap<u64, String>, where key is the epoch number (keep the logic of processing this file in one place)?
sure
e2store/src/era.rs
Outdated
@@ -80,6 +80,25 @@ impl Era { | |||
Ok(era_state.state) | |||
} | |||
|
|||
/// Iterate over beacon blocks. | |||
pub fn iter_blocks(raw_era1: Vec<u8>) -> impl Iterator<Item = CompressedSignedBeaconBlock> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should return iterator over Result<CompressedSignedBeaconBlock>
.
I know it doesn't look that good, but it is a standard way of doing things (as far as I can tell), for example: std::io::Lines implements Iterator<Item=Result<String, std::io::Error>>
e2store/src/era.rs
Outdated
@@ -80,6 +80,25 @@ impl Era { | |||
Ok(era_state.state) | |||
} | |||
|
|||
/// Iterate over beacon blocks. | |||
pub fn iter_blocks(raw_era1: Vec<u8>) -> impl Iterator<Item = CompressedSignedBeaconBlock> { | |||
let file = E2StoreMemory::deserialize(&raw_era1).expect("invalid era1 file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused here. Are these era or era1 files? Why are we processing era1
files in era.rs
?
e2store/src/era.rs
Outdated
.expect("missing block index entry") | ||
.slot_index; | ||
|
||
let mut next_slot = block_index.starting_slot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that this logic is already implemented at line 52. Maybe extract it into separate function?
Also question about this logic, because I'm not sure I have all the knowledge and I would like to make sure I understand it. We can't always increment next_slot
by one because slot can be skipped. And we need to know slot index in order to figure out the fork (in order to parse block correctly). What happens if the slot that is right before the new fork is skipped? Wouldn't this logic fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic can only break during the transition to the next fork. This is a best effort approach. Currently it should work for the 3 forks we are supporting, but essentially because we don't know what the next epoch is we have to guess, every time a new fork happens we just have to configure it to work, as era files shouldn't change I would think at least.
What happens if the slot that is right before the new fork is skipped?
This shouldn't matter as long as we fix it within a 6 months window of the fork activating. The reason being is we can get the last 6 months of blocks from the CL itself.
So even if we get it wrong when a new fork is launched, we just have to fix it within 6 months so not the biggest deal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I understand the structure of the era file better, there is proper way of doing this, by using information from the SlotIndexBlockEntry
. The indices
field can tell us which slots are present and which ones are not (value 0 means it's absent).
I think something like this would work (but it would be good to be tested first):
block_index.indices
.iter()
.enumerate()
.filter_map(|(i, index)| if index != 0 { Some(block_index.starting_slot + i) } else { None })
.zip_eq(file.entries) // zip_eq from itertools crate
.map(|(slot, entry)| {
let fork = get_beacon_fork(slot);
CompressedSignedBeaconBlock::try_from(&entry, fork)
})
.collect::<Result<Vec<_>, _>>()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation I did is a little different then the suggestion, but it follows the same idea
@@ -174,6 +174,17 @@ impl SignedBeaconBlock { | |||
SignedBeaconBlock::Deneb(block) => block.message.slot, | |||
} | |||
} | |||
|
|||
/// Returns execution block number. | |||
pub fn execution_block_number(&self) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it feels a bit weird to be this specific and return the execution block number directly from SignedBeaconBlock
. Maybe make this one returns ExecutionPayload
and caller can get block number directly from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would make us do a match on ExecutionPayload
every time we want to get the block number, because each fork has its own ExecutionPayload
type.
match executionpayload {
bellatrix => number
capella => number
deb => number
}
I think the current implementation is the cleanest, but if this is a major concern we can discuss it more
trin-execution/src/era/manager.rs
Outdated
async fn fetch_era_file(&mut self, block_number: u64) -> anyhow::Result<ProcessedEra> { | ||
info!("Fetching the next era file"); | ||
|
||
// If the current is already downloaded, return it, else download the current one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I find this comment a bit confusing in regard to the logic. Shouldn't it be something like:
If the next_era is already downloading, then wait and return it. Otherwise, download it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly, I tried to rewrite the comment to make it more clear
.expect("to be able to download compressed beacon block"); | ||
let entry = Entry::deserialize(&compressed_beacon_block)?; | ||
|
||
// download slot_index_state to get the starting slot so we get the correct fork |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked more into spec, and I believe that everything after this point is unnecessary and actually wrong.
Here are stuff that I found:
- We don't need to fetch
Content-Length
header first, because according to Range documentation we can useRange: bytes=-32
to fetch last 32 bytes (I tried it and it works) slot_index_state.slot_index.starting_slot
(line 133) will give us slot number of theCompressedBeaconState entry
which corresponds to the slot after executing all blocks for this era file, so we would use wrong fork for decoding first block of this era file (when fork happens in the middle of era file)- (and most importantly) I believe that we can calculate slot number that belongs to the first block of era file directly! If I'm not mistaken, each era file has exactly 8192 slots (but it could be less blocks). The
mainnet-00000-...
has no slots/blocks and is the exception. So if we downloading first block from the era filemainnet-xxxxx-...
the slot number of the first block should be(x-1)*8192
, which we can use to figure out the fork and decode the first block.- there is small caveat to this. If enough blocks are skipped at the start of era (and blocks don't actually exist), we might now be in the following fork and this would fail (but we don't have to optimize for this case)
So all in all, I believe that we only need to pass the era epoch number, download first block (as you did above), calculate the slot number and fork based on era epoch number, and decode the block.
Err(anyhow::anyhow!("Block not found in any era file")) | ||
} | ||
|
||
fn does_era_contain_block(era: &Era, block_number: u64) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just check if block_number is in the range of execution_block_number
of first and last block? Also, should this maybe be in era.rs
file?
.expect("to be able to parse epoch index"); | ||
|
||
while start_epoch_index <= end_epoch_index { | ||
let mid = end_epoch_index + (start_epoch_index - end_epoch_index) / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's probably cleaner if this would be:
let mid = (start_epoch_index + end_epoch_index) / 2;
I have an idea on how this can be optimized (meaning we can guess mid point more accurately), but we can leave that for a followup pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't let mid = (start_epoch_index + end_epoch_index) / 2;
bad practice https://research.google/blog/extra-extra-read-all-about-it-nearly-all-binary-searches-and-mergesorts-are-broken/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That matters only if start_epoch_index + end_epoch_index
can go overflow, which shouldn't happen in our case.
I'm fine with your approach as well, no big deal. But, I think you also didn't do it correctly. I think variables are u64
so your formula should panic, right? Shouldn't it be start + (end - start) / 2
?
let mid_block_number = mid_block.block.execution_block_number(); | ||
|
||
// minimize the worse case by fetching the first block number of the after mid .era | ||
let mid_plus_one_block = EraBinarySearch::download_first_beacon_block_from_era( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work if epoch we are looking for is the last epoch, or later (epoch not yet created).
But I think that we can avoid looking for mid_plus_one
every loop iteration. For example, we can simply check where block_number
is in relationship to mid_block_number
and mid_block_number+8192
, something like this:
(start_epoch_index, end_epoch_index) = match block_number {
..=(mid_block_number - 1) => (start_epoch_index, mid - 1),
(mid_block_number + 8192).. => (mid + 1, end_epoch_index),
_ => {
// mid is a likely candidate, but we are not sure
// we should peek at the following epochs until we are sure
let mut target = mid;
while target < <last_epoch> {
let next_block = download_first_beacon_block_from_era(target + 1);
if next_block.block.execution_block_number() < block_number {
// target is our epoch
todo!("download and return target")
} else {
target += 1;
}
}
// we are at the last epoch. we should download it and try it
// note: this can probably be optimized by downloading the SlotIndexBlock and verifying,
// but we can ignore that optimization
todo!("download and try")
},
}
This is just an example. I'm sure it can be made a bit more readable and maybe some functionality can be extracted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that I have to work around the issue when mid_block_number == 0
(first epoch with execution block). I think this can be as simple as
if mid_block_number == 0 {
mid_block_number = MERGE_BLOCK;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work if epoch we are looking for is the last epoch, or later (epoch not yet created).
For blocks from the last 6 months we should just be able to get them from the CL client. So if we are at the last epoch we can just throw an error to say continue from CL client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning error is fine. But it will panic with the current code when mid is equal to the last epoch (which is valid case). It will panic here:
let mid_plus_one_block = EraBinarySearch::download_first_beacon_block_from_era(
era_links[mid as usize + 1].clone(),
http_client.clone(),
)
because mid as usize + 1
is non-existing index of era_links
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a if case to through an era in this case
d5d8849
to
2ce7d59
Compare
@morph-dev ready for another look |
e2store/src/era.rs
Outdated
@@ -49,12 +49,28 @@ impl Era { | |||
let slot_index_state = SlotIndexStateEntry::try_from(&file.entries[entries_length - 1])?; | |||
|
|||
// Iterate over the block entries. Skip the first and last 3 entries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this comment is no longer relevant. if you want, put it down close to ensure!
to explain why there is - 4
, but rephrase it in that case
e2store/src/era.rs
Outdated
let block_index = | ||
SlotIndexBlockEntry::try_from(&file.entries[entries_length - 2])?.slot_index; | ||
|
||
let slot_indexes = block_index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since this is repeated functionality, maybe it can be extracted into separate function?
e2store/src/utils.rs
Outdated
const ERA_DIR_URL: &str = "https://mainnet.era.nimbus.team/"; | ||
|
||
/// Fetches era file download links | ||
pub async fn get_era_file_download_links( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that there is much difference between this and the function above, other than:
- they are using different url
- era1 version is checking that length matches and is shuffling them at the end
I would make separate function that accepts url as parameter, fetches them and returns Result<HashMap<u64, String>>
.
Then I would:
- create 2 functions:
get_era_files
andget_era1_files
that call it and return the same type (era1 would also check the length) get_shuffled_era1_files
would callget_era1_files
(epoch_index, format!("{ERA_DIR_URL}{href}")) | ||
}) | ||
.collect(); | ||
Ok(era_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: what do you think about checking that keys are starting from 0 and consecutive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you mean do you want me to loop from 0 to max era count and see if all exist?
Because since we are using a hashmap, the the values are stored in the positions of the hash of the keys which wouldn't be in sorted order, if we were to iterate over the buckets in order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think loop and checking is simple enough. Something like:
ensure!(
(0..(era_files.len()).all(|epoch| era_files.contains(&epoch)),
"Epoch indices are not starting from zero or not consecutive",
);
You can also calculate min and max and check if they are as desired (this still requires looping, so no saving performance wise).
}) | ||
} | ||
|
||
pub async fn download_raw_era(era_path: String, http_client: Client) -> anyhow::Result<Vec<u8>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to not be era specific, it just downloads files with retry logic. Maybe rename function and move it to more general place (e.g. trin-utils
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to limit the amount of workspace crates trin-execution
pulls from in the case it becomes its own project.
I am not aware why trin would want to download a file from http.
Also ideally this retry logic is geared for downloading era files
.expect("to be able to parse epoch index"); | ||
|
||
while start_epoch_index <= end_epoch_index { | ||
let mid = end_epoch_index + (start_epoch_index - end_epoch_index) / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That matters only if start_epoch_index + end_epoch_index
can go overflow, which shouldn't happen in our case.
I'm fine with your approach as well, no big deal. But, I think you also didn't do it correctly. I think variables are u64
so your formula should panic, right? Shouldn't it be start + (end - start) / 2
?
.expect("to be able to download compressed beacon block"); | ||
let entry = Entry::deserialize(&compressed_beacon_block)?; | ||
|
||
let slot_index = (era_index - 1) * 8192; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you have this constant in era1.rs
.
You can even extract this as mini function there (e.g. pub fn start_slot_index(...)
)
block_number: u64, | ||
) -> anyhow::Result<ProcessedEra> { | ||
let era_links = get_era_file_download_links(&http_client).await?; | ||
let last_epoch_index = era_links.len() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe move last_epoch_index few lines below: let last_epoch_index = end_epoch_index;
http_client: Client, | ||
block_number: u64, | ||
) -> anyhow::Result<ProcessedEra> { | ||
let era_links = get_era_file_download_links(&http_client).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we know the first execution block that is in era file, right?
I would do a check that block_number
is not less than that.
let mid_block_number = mid_block.block.execution_block_number(); | ||
|
||
if mid + 1 > last_epoch_index as u64 { | ||
return Err(anyhow::anyhow!("Block not found in any era file")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't true. It can be in the mid_block
, and we could return it.
@morph-dev ready for another look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there. Few, minor comments, mostly in EraManager.
e2store/src/era.rs
Outdated
}) | ||
.collect::<Vec<u64>>(); | ||
|
||
// an era file should has 4 entries which are not blocks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// an era file should has 4 entries which are not blocks | |
// an era file has 4 entries which are not blocks |
|
||
pub async fn get_era1_files(http_client: &Client) -> anyhow::Result<HashMap<u64, String>> { | ||
let era1_files = download_era_links(http_client, ERA1_DIR_URL).await?; | ||
ensure!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: similar to era files, we can also check that all keys are in the range of (0..ERA1_FILE_COUNT)
trin-execution/src/era/manager.rs
Outdated
current_era: Option<ProcessedEra>, | ||
next_era: Option<JoinHandle<ProcessedEra>>, | ||
http_client: Client, | ||
era1_files: Vec<String>, | ||
} | ||
|
||
impl EraManager { | ||
pub async fn new() -> anyhow::Result<Self> { | ||
pub async fn new(starting_block: u64) -> anyhow::Result<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, I no longer like the idea of the get_current_block()
. It seems to be used only in the case of the starting block. Also, I don't think constructor should be async and blocking. Instead, I think we can:
- Constructor would leave
self.current_era
asNone
, but would start fetching required era (based onstarting_block
) and store its handle inself.next_era
. - Remove
get_current_block
- The first call to
get_next_block
would return block with indexstarting_block
(that is passed in the constructor) - Rename
self.current_block_number
toself.next_block_number
(also rename getter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, I no longer like the idea of the get_current_block(). It seems to be used only in the case of the starting block.
Well none of this logic is used in this PR, I will remove it for now and if we need it in the future we can add it back
trin-execution/src/era/manager.rs
Outdated
current_era: Option<ProcessedEra>, | ||
next_era: Option<JoinHandle<ProcessedEra>>, | ||
http_client: Client, | ||
era1_files: Vec<String>, | ||
} | ||
|
||
impl EraManager { | ||
pub async fn new() -> anyhow::Result<Self> { | ||
pub async fn new(starting_block: u64) -> anyhow::Result<Self> { | ||
let http_client: Client = Config::new() | ||
.add_header("Content-Type", "application/xml") | ||
.expect("to be able to add header") | ||
.try_into()?; | ||
let era1_files = get_shuffled_era1_files(&http_client).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now we should probably use get_era1_files
and keep it as hashmap. This would remove looking up the right url from below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now we should probably use
get_era1_files
and keep it as hashmap. This would remove looking up the right url from below.
my concern is what if a new epoch is added, but I guess it doesn't matter since we will continue syncing from the running CL client anyways, so it doesn't matter if we have a few new era files or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking of storing era_files
, just era1_files
like before. But store them as HashMap.
With that being said, I'm fine if we also store era_files
as well, since they are not frequently updated and as you said we can rely on CL clients for the most recent blocks anyway.
Additionally, we can store era_files
and update them only when we need something more recent than the one we have, but we can leave this for future PR (maybe create an issue to track this).
trin-execution/src/era/manager.rs
Outdated
Ok(processed_era.get_block(self.current_block_number)) | ||
} | ||
|
||
async fn fetch_era_file_link( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of passing block_number
, we could pass EraType
instead. That would make the meaning of epoch_index
a bit more explicit.
It can be simplified if self.era1_file
is hash map (which it can be).
Also, this function can panic, and I don't think this is desired. Especially if we are trying to fetch era
that doesn't exist yet (e.g. we are processing latest Era file, we want to start pre-processing next one, which doesn't exist, so we panic). So I would make it return Result<Option<String>>
and handle it outside.
trin-execution/src/era/manager.rs
Outdated
}; | ||
|
||
// Download the next era file | ||
let mut next_epoch_index = current_era.epoch_index + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this whole block (lines 108:117) a bit hard to read and understand. I think we can make it very simple:
let mut next_era_type = current_era.era_type;
let mut next_epoch_index = current_era.epoch_index + 1;
// Handle transition from era1 to era
if next_era_type == EraType::Era1 && next_epoch_index == ERA1_FILE_COUNT {
next_era_type = EraType::Era;
next_epoch_index = FIRST_ERA_EPOCH_WITH_EXECUTION_PAYLOAD;
}
let next_era_path = self.fetch_era_file_link(next_era_type, next_epoch_index).await?;
@morph-dev ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look great! Few more smaller comments. Thank you for the patience
trin-execution/src/era/manager.rs
Outdated
current_era: Option<ProcessedEra>, | ||
next_era: Option<JoinHandle<ProcessedEra>>, | ||
http_client: Client, | ||
era1_files: Vec<String>, | ||
} | ||
|
||
impl EraManager { | ||
pub async fn new() -> anyhow::Result<Self> { | ||
pub async fn new(starting_block: u64) -> anyhow::Result<Self> { | ||
let http_client: Client = Config::new() | ||
.add_header("Content-Type", "application/xml") | ||
.expect("to be able to add header") | ||
.try_into()?; | ||
let era1_files = get_shuffled_era1_files(&http_client).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking of storing era_files
, just era1_files
like before. But store them as HashMap.
With that being said, I'm fine if we also store era_files
as well, since they are not frequently updated and as you said we can rely on CL clients for the most recent blocks anyway.
Additionally, we can store era_files
and update them only when we need something more recent than the one we have, but we can leave this for future PR (maybe create an issue to track this).
self.current_era.as_ref().expect("current_era to be some") | ||
} | ||
_ => { | ||
let current_era = self.fetch_era_file().await?; | ||
self.current_era.insert(current_era) | ||
} | ||
}; | ||
Ok(processed_era.get_block(self.current_block_number)) | ||
} | ||
let block = processed_era.get_block(self.next_block_number); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It should, but I think we can't be sure that processed_era
actually contains the next_block_number
(because of the _ => {
case above).
It's better to check than panic (in my opinion), so I would add ensure!
just before this line.
trin-execution/src/era/manager.rs
Outdated
let era_path = self.fetch_era_file_link(block_number, epoch_index).await?; | ||
let raw_era1 = download_raw_era(era_path, self.http_client.clone()).await?; | ||
let Some(era1_path) = era1_files.get(&epoch_index).cloned() else { | ||
return Err(anyhow::anyhow!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you can replace return Err(anyhow::anyhow!(
with anyhow::bail!(
@@ -370,6 +370,8 @@ mod tests { | |||
let raw_era1 = fs::read("../test_assets/era1/mainnet-00000-5ec1ffb8.era1").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be using both raw file and EraManager (it should be one or the other).
I would say we shouldn't use EraManager in tests if we can avoid it (because EraManager would download from public sources).
I think we can obtain ProcessedEra directly from raw_era1
, and remove usage of EraManager in this test.
era::utils::process_era1_file(raw_era1, /* epoch_index= */ 0)
trin-execution/src/era/manager.rs
Outdated
EraType::Era1 => self.era1_files.get(&next_epoch_index).cloned(), | ||
EraType::Era => self.era_files.get(&next_epoch_index).cloned(), | ||
}) else { | ||
return Err(anyhow::anyhow!("Unable to find next era file's path: index {next_epoch_index} type {next_era_type:?}")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should return error here. This is valid thing to happen if we are processing most recent era file.
I would say, we have two approaches:
- pass
next_era_path
asOption<String>
to the spawn task below, which would return error- or maybe even better, in case of an
EraType::Era
, it can try to download most recentera_files
- or maybe even better, in case of an
- if
next_era_path
isNone
, skip next few lines and leaveself.next_era
as empty (and update message at line 89)
I think I like option 1 more, but up to you.
trin-execution/src/era/manager.rs
Outdated
Ok(era_files[&(epoch_index)].clone()) | ||
} | ||
} | ||
Ok(block) | ||
} | ||
|
||
async fn fetch_era_file(&mut self) -> anyhow::Result<ProcessedEra> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this function name doesn't make much sense to me anymore. I would rename it to something like get_next_processed_era
and add comment describing what it actually does (getting processed era, and start processing next one).
trin-execution/src/era/manager.rs
Outdated
@@ -101,30 +86,29 @@ impl EraManager { | |||
let current_era = self.next_era.take(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I find this naming a bit confusing. How about:
let processed_era = match self.next_era.take() {
Some(era_handle) => era_handle.await??,
None => bail!("The next_era handle can't be None"),
};
Or at least new_current_era
if you don't like processed_era
?
Also, comment above this line is no longer valid.
trin-execution/src/era/manager.rs
Outdated
let era1_files = era_manager.era1_files.clone(); | ||
let http_client = era_manager.http_client.clone(); | ||
let join_handle = tokio::spawn(async move { | ||
Self::init_current_era(era1_files, http_client, next_block_number).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It is a bit weird to call function init_current_era
and then assign it to the self.next_era
(while we have self.current_era
). How about we rename function to something like fetch_and_process_era_by_block_number()
?
smallest nit: up to you, I don't think join_handle
adds much to readability, so I would just have
era_manager.next_era = Some(tokio::spawn(async move {
or at least rename join_handle
to next_era
or next_era_handle
.
What was wrong?
How was it fixed?