Update synchronization logic for ckb-indexer/light client #90

Closed
Keith-CY opened this issue Jan 3, 2023 · 32 comments
Assignees
Labels
enhancement New feature or request

Comments

@Keith-CY
Member

Keith-CY commented Jan 3, 2023

Feature has been discussed in #52

@Keith-CY
Member Author

Keith-CY commented Jan 4, 2023

@Keith-CY add PRD about switching between ckb-indexer (full node) and light client

@Keith-CY
Member Author

Keith-CY commented Jan 7, 2023

  1. Request the light client to add an API for light client info, so that Neuron can detect which service an endpoint runs (ckb node/ckb light client): Adding an API to get light client info nervosnetwork/ckb-light-client#118
  2. Add a built-in light client in network list
    image
  3. Make the network tags more detailed: Mainnet/Testnet/Devnet Node and Mainnet/Testnet/Devnet Light Client. The tag is shown in preference/settings -> network and in the sync progress at the bottom-left corner.
  4. If the built-in light client is connected
    1. preference/settings -> data acts on built-in light client
      1. For the path set: CKB Node Config & Storage -> CKB Light Client Config & Storage, the tooltip should be updated too.
      2. For the cache:
        1. the cache-cleared date of the built-in light client is tracked separately from that of the built-in full node, i.e. there are two cache-cleared dates, and the one displayed depends on the network type;
        2. the fully rebuild index option in the clear cache dialog is hidden
    2. menu -> tools -> clear all synchronized data works on built-in light client, light client should re-sync.
  5. Add the version of the built-in light client to About Neuron (macOS: menu -> neuron -> about neuron, Windows: menu -> help -> about neuron)
    image

@yanguoyu

yanguoyu commented Jan 8, 2023

How do users switch between the light client and the full node?

@Keith-CY
Member Author

Keith-CY commented Jan 8, 2023

How do users switch between the light client and the full node?

Switch between light client and full node by selecting different networks, i.e. Neuron will become network-agnostic.

If the feature request of adding a light client info API mentioned above could be supported, Neuron can detect whether the remote service is a full node or a light client by calling local_node_info and light_client_info:

  1. local_node_info responds while light_client_info fails => full node;
  2. local_node_info fails while light_client_info responds => light client;
  3. local_node_info fails while light_client_info fails => unknown service.
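The three cases above can be sketched as a small decision table. This is a minimal TypeScript sketch, not Neuron's implementation: `light_client_info` is the API proposed in this comment, and `RpcCall`, `classifyService`, and `detectService` are hypothetical names. The case where both calls respond is not covered above and is treated as unknown here.

```typescript
// Hypothetical sketch: classify an endpoint from the outcome of two RPC probes.
type ServiceType = "full-node" | "light-client" | "unknown";

// Assumed RPC caller shape: resolves when the method responds, rejects when it fails.
type RpcCall = (method: string) => Promise<unknown>;

// Pure decision table for the cases listed in the thread.
function classifyService(nodeInfoOk: boolean, lightInfoOk: boolean): ServiceType {
  if (nodeInfoOk && !lightInfoOk) return "full-node";
  if (!nodeInfoOk && lightInfoOk) return "light-client";
  // Both failing is "unknown". Both responding is not specified in the thread,
  // so it is also mapped to "unknown" (assumption).
  return "unknown";
}

// Probe both methods in parallel and feed the outcomes into the decision table.
async function detectService(call: RpcCall): Promise<ServiceType> {
  const succeeded = (method: string): Promise<boolean> =>
    call(method).then(() => true, () => false);
  const [nodeInfoOk, lightInfoOk] = await Promise.all([
    succeeded("local_node_info"),
    succeeded("light_client_info"),
  ]);
  return classifyService(nodeInfoOk, lightInfoOk);
}
```

Keeping the decision table separate from the probing makes the rule easy to check without a running node.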

@yanguoyu

yanguoyu commented Jan 8, 2023

As far as I know, Neuron can already start with testnet, but that is hard for general users because the CKB network can only be set at initialization.
So what I mean is: if users want to use Neuron with a light client, do they also need to start a light client themselves, or will we let them choose between starting a light client and a full node?

@Keith-CY
Member Author

Keith-CY commented Jan 8, 2023

As far as I know, Neuron can already start with testnet, but that is hard for general users because the CKB network can only be set at initialization. So what I mean is: if users want to use Neuron with a light client, do they also need to start a light client themselves, or will we let them choose between starting a light client and a full node?

A built-in light client connected to mainnet will be provided by Neuron, as mentioned in point 2 in #90 (comment)

Once the built-in light client is selected, Neuron boots the light client bundled inside it.

So the network list offers 2 options for connecting to the internal full node and the internal light client:

  1. default node => built-in ckb full node
  2. default light client => built-in ckb light client

@Keith-CY Keith-CY removed this from the 2023/01/11 - 2023/01/18 milestone Jan 9, 2023
@Keith-CY
Member Author

Keith-CY commented Jan 10, 2023

Recommendations from the core team:

It's not recommended to allow users to set the endpoint of a light client freely because the light client and the full node are based on different security assumptions.

Users should be clearly notified that the network is a light client, then they can connect to that one.


With this recommendation, the feature would be as follows:

  1. Add 2 built-in light clients, Light Client Mainnet and Light Client Testnet, to the network list; Light Client Mainnet and Light Client Testnet are not editable;
    image

  2. Make the network tags more detailed: Mainnet/Testnet/Devnet Node and Mainnet/Testnet/Devnet Light Client. The tag is shown in preference/settings -> network and in the sync progress at the bottom-left corner.

  3. If the built-in light client is connected

    1. preference/settings -> data acts on built-in light client
      1. For the path setting: CKB Node Config & Storage -> CKB Light Client Mainnet Config & Storage (CKB Light Client Testnet Config & Storage if connected to testnet); the tooltip should be updated too.
      2. For the cache:
        1. the cache-cleared dates of the built-in light clients are independent of each other and separate from that of the built-in full node, i.e. there are three cache-cleared dates, and the one displayed depends on the network type (CKB Node Mainnet, CKB Light Client Mainnet, CKB Light Client Testnet);
        2. the fully rebuild index option in the clear cache dialog is hidden
    2. menu -> tools -> clear all synchronized data works on the built-in light client, and the light client should re-sync; the light clients of mainnet and testnet work independently.
  4. Add the version of the built-in light client to About Neuron (macOS: menu -> neuron -> about neuron, Windows: menu -> help -> about neuron)
    image

  5. If an external light client is detected (port 9000), Neuron prompts users with the message "Failed to start the CKB light client, please check if there's an external one running" and a Dismiss button, and keeps retrying to start the built-in one.

  6. Light Client Mainnet network option is hidden for now because the light client feature is not activated on mainnet.
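The network list described above could be modeled roughly as follows. This is only an illustrative TypeScript sketch under stated assumptions: the `NetworkEntry` shape, the `editable` and `hidden` flags, and `networkTag` are hypothetical names, not Neuron's actual data model.

```typescript
// Hypothetical model of a network list entry and its detailed tag.
type Chain = "Mainnet" | "Testnet" | "Devnet";
type Service = "Node" | "Light Client";

interface NetworkEntry {
  name: string;
  chain: Chain;
  service: Service;
  editable: boolean; // built-in light clients are not editable
  hidden: boolean;   // Light Client Mainnet is hidden for now
}

// Builds tags such as "Mainnet Node" or "Testnet Light Client".
function networkTag(n: Pick<NetworkEntry, "chain" | "service">): string {
  return `${n.chain} ${n.service}`;
}

const builtInLightClients: NetworkEntry[] = [
  { name: "Light Client Mainnet", chain: "Mainnet", service: "Light Client", editable: false, hidden: true },
  { name: "Light Client Testnet", chain: "Testnet", service: "Light Client", editable: false, hidden: false },
];

// Only entries that are not hidden would be offered in the network list.
const visibleTags: string[] = builtInLightClients
  .filter((n) => !n.hidden)
  .map((n) => networkTag(n));
```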

@yanguoyu

Does the Light Client Mainnet have a release schedule?

@Keith-CY
Member Author

Does the Light Client Mainnet have a release schedule?

Not yet

@Keith-CY
Member Author

An optional parameter SetScriptCommand was added to the set_scripts API, which is for cases such as HD wallet derivation
Ref: https://github.com/nervosnetwork/ckb-light-client#set_scripts

@Keith-CY
Member Author

CKB Light [email protected] was released and could be used as the built-in one for development

@Keith-CY
Member Author

CKB Light [email protected] includes a portable version for macOS m1

@yanguoyu yanguoyu moved this from Todo to In Progress in Nervos Wallet/Explorer Jan 20, 2023
@yanguoyu

I have two problems with this:

  1. How should the remaining time be calculated when syncing with the light client? Is it the total remaining block numbers across scripts * a fixed speed, or should the speed be calculated as total synced block numbers / time spent?
  2. I found that when I add new addresses for the light client to sync, it pauses the scripts that are already synced to a higher block until the new scripts catch up to that block number. So maybe we need to create more addresses at a time, e.g. 60 receiving addresses and 30 change addresses.

@Keith-CY

@quake

quake commented Feb 23, 2023

I have two problems with this:

  1. How should the remaining time be calculated when syncing with the light client? Is it the total remaining block numbers across scripts * a fixed speed, or should the speed be calculated as total synced block numbers / time spent?

You can estimate the time by calculating the progress of min_block_number / tip_number (min_block_number = min(get_scripts.block_number), tip_number = get_tip_header.number)
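The progress rule above can be sketched in TypeScript as follows. The input shapes are simplified assumptions: `ScriptProgress` and `syncProgress` are hypothetical names, and block numbers are taken as plain numbers, whereas the RPC responses of `get_scripts` and `get_tip_header` would need to be parsed first.

```typescript
// Hypothetical, simplified view of one entry returned by get_scripts.
interface ScriptProgress {
  blockNumber: number; // block number this script has been synced to
}

// progress = min(get_scripts.block_number) / get_tip_header.number, clamped to [0, 1].
function syncProgress(scripts: ScriptProgress[], tipNumber: number): number {
  // Nothing registered or no tip yet: report no progress instead of dividing by zero.
  if (scripts.length === 0 || tipNumber <= 0) return 0;
  const minBlockNumber = scripts.reduce(
    (min, s) => Math.min(min, s.blockNumber),
    Infinity,
  );
  return Math.min(1, minBlockNumber / tipNumber);
}
```

Because the slowest script determines the minimum, registering a new script at a low block number pulls the overall progress back down.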

@Keith-CY
Member Author

I have two problems with this:

  1. How should the remaining time be calculated when syncing with the light client? Is it the total remaining block numbers across scripts * a fixed speed, or should the speed be calculated as total synced block numbers / time spent?

The remaining time can be estimated by the rule suggested by @quake, but the result would be a bit confusing because scripts are synced group by group, and whether the next group of scripts should be derived is unknown until the current one has fully synced. If the next group is derived, the sync takes longer. So the estimated time would go like almost done => need more time => almost done => need more time. Any ideas about this? @Danie0918

  1. I found that when I add new addresses for the light client to sync, it pauses the scripts that are already synced to a higher block until the new scripts catch up to that block number. So maybe we need to create more addresses at a time, e.g. 60 receiving addresses and 30 change addresses.

I didn't get the point. Do you mean that synchronization of groupA will stop once groupB is derived from groupA, until groupB syncs to the block that groupA has reached?

@yanguoyu

I have two problems with this:

  1. How should the remaining time be calculated when syncing with the light client? Is it the total remaining block numbers across scripts * a fixed speed, or should the speed be calculated as total synced block numbers / time spent?

You can estimate the time by calculating the progress of min_block_number / tip_number (min_block_number = min(get_scripts.block_number), tip_number = get_tip_header.number)

Got it. For example, if min_block_number = 1,000,000, tip_number = 9,000,000, and 1 hour has been spent, then the remaining time can be estimated as 8 hours.
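The arithmetic in this example can be written out as a small TypeScript sketch. It assumes the sync started from block 0 and continues at the average speed observed so far; `estimateRemainingHours` is a hypothetical helper name.

```typescript
// Linear extrapolation: remaining = (tip - min) / (min / elapsed).
// Assumes syncing started at block 0 and the average speed stays constant.
function estimateRemainingHours(
  minBlockNumber: number,
  tipNumber: number,
  elapsedHours: number,
): number {
  // No measurable speed yet, so no finite estimate can be given.
  if (minBlockNumber <= 0 || elapsedHours <= 0) return Infinity;
  const blocksPerHour = minBlockNumber / elapsedHours;
  const remainingBlocks = Math.max(0, tipNumber - minBlockNumber);
  return remainingBlocks / blocksPerHour;
}

// The numbers from the example: 1,000,000 of 9,000,000 blocks after 1 hour.
const remaining = estimateRemainingHours(1_000_000, 9_000_000, 1); // 8
```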

@yanguoyu

I didn't get the point. Do you mean that synchronization of groupA will stop once groupB is derived from groupA, until groupB syncs to the block that groupA has reached?

I guess so. I think it syncs the group whose block_number is smaller until all the groups have the same block_number, and then they sync at the same time.

@quake

quake commented Feb 23, 2023

2. I found that when I add new addresses for the light client to sync, it pauses the scripts that are already synced to a higher block until the new scripts catch up to that block number. So maybe we need to create more addresses at a time, e.g. 60 receiving addresses and 30 change addresses.

Are you setting the starting block number of the newly derived addresses to 0? You may set it to the block number at which the last change or receiving address transaction occurred.
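A minimal TypeScript sketch of this suggestion: build the entries for newly derived lock scripts with a starting block taken from the last known transaction instead of 0. The field names `script`, `script_type`, and `block_number` (hex-encoded) follow my reading of the ckb-light-client README for set_scripts and should be checked against it; `toScriptStatuses` is a hypothetical helper.

```typescript
// Shapes assumed from the ckb-light-client README; verify before relying on them.
interface Script {
  code_hash: string;
  hash_type: "type" | "data" | "data1";
  args: string;
}

interface ScriptStatus {
  script: Script;
  script_type: "lock" | "type";
  block_number: string; // hex-encoded block number to start syncing from
}

// Hypothetical helper: start new scripts at the block of the last change or
// receiving address transaction, falling back to 0 when none is known.
function toScriptStatuses(
  newLocks: Script[],
  lastTxBlockNumber: number | undefined,
): ScriptStatus[] {
  const start = lastTxBlockNumber === undefined ? 0 : lastTxBlockNumber;
  const block_number = "0x" + start.toString(16);
  return newLocks.map((script) => ({
    script,
    script_type: "lock" as const,
    block_number,
  }));
}
```

Starting from a later block skips history, so it only fits wallets whose new addresses cannot have older transactions, which is the trade-off debated in the replies.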

@yanguoyu

The remaining time can be estimated by the rule suggested by @quake, but the result would be a bit confusing because scripts are synced group by group, and whether the next group of scripts should be derived is unknown until the current one has fully synced. If the next group is derived, the sync takes longer. So the estimated time would go like almost done => need more time => almost done => need more time. Any ideas about this? @Danie0918

The estimate may be inexact when a new group of addresses is created, and the same goes for the synced block number:
it may drop from a large value to a small one when set_scripts is called with a new group.

@Keith-CY
Member Author

I didn't get the point. Do you mean that synchronization of groupA will stop once groupB is derived from groupA, until groupB syncs to the block that groupA has reached?

I guess so. I think it syncs the group whose block_number is smaller until all the groups have the same block_number, and then they sync at the same time.

As we designed at #52 (comment)

Working with the light client, Neuron could sync each key separately because each key has its own cursor/progress. That means it's safe to postpone key derivation until the derived keys are all processed totally. With this characteristic, Neuron could divide synchronization into groups of keys instead of each block, which is efficient.

Derivation only occurs once the last group of addresses is fully synced, i.e. it has reached the tip block number. So it's fine to stop synchronization of the last group temporarily.
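The gating rule could be sketched like this in TypeScript. `GroupStatus` and `shouldDeriveNextGroup` are hypothetical names, and the extra condition that the last group must contain used addresses is my assumption (a usual HD-wallet gap rule), not something stated in this thread.

```typescript
// Hypothetical per-group sync state.
interface GroupStatus {
  blockNumber: number;      // block the whole group has been synced to
  hasTransactions: boolean; // whether any address in the group has been used
}

// Derive the next group only once the last group has reached the tip.
function shouldDeriveNextGroup(groups: GroupStatus[], tipNumber: number): boolean {
  // No groups yet: the first group has to be derived (assumption).
  if (groups.length === 0) return true;
  const last = groups[groups.length - 1];
  const fullySynced = last.blockNumber >= tipNumber;
  // Assumption: an unused last group means no further derivation is needed.
  return fullySynced && last.hasTransactions;
}
```

With this rule the check runs once per group rather than once per detected transaction.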

@yanguoyu

Derivation only occurs once the last group of addresses is fully synced, i.e. it has reached the tip block number. So it's fine to stop synchronization of the last group temporarily.

Group A synced to header block number -> derived Group B -> Group B synced to header block number
Do you mean like this? If so, syncing may be slow, because it means A goes from 0 to max and then B goes from 0 to max,
but I think A from 0 to block_number_A, B from 0 to block_number_A, and then A+B from block_number_A to the header tip is faster.

@yanguoyu

yanguoyu commented Feb 23, 2023

Are you setting the starting block number of the newly derived addresses to 0? You may set it to the block number at which the last change or receiving address transaction occurred.

Yes, I set the start block number of the newly derived addresses to 0.
Is it possible for a newly derived address to have a transaction that occurred before the block number of the last change or receiving address transaction? @Keith-CY

I think it's a good idea. If users use the wallet through Neuron, no transactions will exist for the derived addresses before they are created.

@Keith-CY
Member Author

Derivation only occurs once the last group of addresses is fully synced, i.e. it has reached the tip block number. So it's fine to stop synchronization of the last group temporarily.

Group A synced to header block number -> derived Group B -> Group B synced to header block number Do you mean like this? If so, syncing may be slow, because it means A goes from 0 to max and then B goes from 0 to max, but I think A from 0 to block_number_A, B from 0 to block_number_A, and then A+B from block_number_A to the header tip is faster.

The previous workflow would be simple and a bit more performant, because the check for whether the next group of scripts should be derived is executed once instead of every time a transaction is detected.

@Keith-CY
Member Author

Are you setting the starting block number of the newly derived addresses to 0? You may set it to the block number at which the last change or receiving address transaction occurred.

Yes, I set the start block number of the newly derived addresses to 0. Is it possible for a newly derived address to have a transaction that occurred before the block number of the last change or receiving address transaction? @Keith-CY

I think it's a good idea. If users use the wallet through Neuron, no transactions will exist for the derived addresses before they are created.

But users may use the same seed outside of Neuron too. It's possible that an address has been used before it is derived in Neuron.

@Keith-CY
Member Author

Keith-CY commented Feb 23, 2023

The progress/estimated time could be improved later, as mentioned at #52 (comment)

What's more, each script could have its own progress bar and refresh itself independently. If the user declares address A has an invisible asset, he/she could resync the group only.

Each address could have its progress bar, and the incremental progress could be computed from them.

For example, if 3 groups are fully synced and a new group is generated, the progress drops from 100% to 75%. The more addresses are in use, the smaller the fallback will be.
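One way to read this example is that the overall progress is the mean of the per-group progress values. The TypeScript sketch below makes that assumption explicit; `overallProgress` is a hypothetical helper, and averaging is my interpretation of how the 75% comes about.

```typescript
// Overall progress as the mean of per-group progress values in [0, 1].
function overallProgress(groupProgress: number[]): number {
  if (groupProgress.length === 0) return 0;
  const sum = groupProgress.reduce((acc, p) => acc + p, 0);
  return sum / groupProgress.length;
}

// 3 fully synced groups plus 1 freshly derived group: 100% falls back to 75%.
const afterNewGroup = overallProgress([1, 1, 1, 0]);

// With 9 synced groups the fallback is smaller: 100% falls back to 90%.
const afterNewGroupWithMoreAddresses = overallProgress([1, 1, 1, 1, 1, 1, 1, 1, 1, 0]);
```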

@quake

quake commented Feb 23, 2023

But users may use the same seed outside of Neuron too. It's possible that an address has been used before it is derived in Neuron.

This use case exists only in theory. Do you know of any actual cases? Let's keep it simple and ignore the use case of sharing the same seed while using a derivation strategy different from Neuron's.

@yanguoyu

For example, if 3 groups are fully synced and a new group is generated, the progress drops from 100% to 75%. The more addresses are in use, the smaller the fallback will be.

Showing the progress of every address is good, and if we do so we should hide the total block number at the bottom left of Neuron, because it's difficult to calculate.
But after all groups have synced to the header tip, we still need to check whether a new group should be created whenever a transaction has been synced. And I think checking whether we need to derive new addresses is not a performance bottleneck.
The performance bottleneck is the node's sync speed.

On the other hand, should we derive more addresses at once to speed up the sync?

@Keith-CY
Member Author

Keith-CY commented Feb 23, 2023

But users may use the same seed outside of Neuron too. It's possible that an address has been used before it is derived in Neuron.

This use case exists only in theory. Do you know of any actual cases? Let's keep it simple and ignore the use case of sharing the same seed while using a derivation strategy different from Neuron's.

Feasible. A refresh button could be added (in the future) next to each address to update the transactions of that specific address, as mentioned at #52

What's more, each script could have its own progress bar and refresh itself independently. If the user declares address A has an invisible asset, he/she could resync the group only.

So it would be easy to fix missing data in Neuron

cc @yanguoyu

@Keith-CY
Member Author

For example, if 3 groups are fully synced and a new group is generated, the progress drops from 100% to 75%. The more addresses are in use, the smaller the fallback will be.

Showing the progress of every address is good, and if we do so we should hide the total block number at the bottom left of Neuron, because it's difficult to calculate. But after all groups have synced to the header tip, we still need to check whether a new group should be created whenever a transaction has been synced. And I think checking whether we need to derive new addresses is not a performance bottleneck. The performance bottleneck is the node's sync speed.

Got it

On the other hand, should we derive more addresses at once to speed up the sync?

I would prefer to keep the number of addresses to generate unchanged. The first goal is to enable the light client in Neuron; user experience comes after that.

@yanguoyu

@Danie0918 Danie0918 added this to Neuron Feb 26, 2023
@Danie0918 Danie0918 moved this to 👀 In Review in Neuron Feb 26, 2023
@yanguoyu yanguoyu moved this from In Progress to QA in Nervos Wallet/Explorer Feb 27, 2023
@Danie0918
Contributor

@yanguoyu

Projects
Archived in project
Status: QA
Development

No branches or pull requests

5 participants