Deal Making Pain Points And Enhancements #7084
---
Originally from @stuberman in #6861 (comment):

This is a general discussion on m1 and the deal-making function of Filecoin, based on my experiences last night. For the first time (running m1.3.5) I was able to see the system at full load, and it was not pretty. This ended up being an all-nighter as I attempted to resolve the core issue of high deal flow. I also encountered what appears to be a memory leak, but that is not the core issue.

Context: I run a dedicated market node (m1.3.5) with bidbot (automated offline deals) and use CID-Gravity to manage deal prices and capacity. The sealing system has been able to process and seal up to 768 GiB/day of 32 GiB sectors when clients arrange large batches (100+) of offline deals. Download speed can be as fast as 32 GiB in 4-5 minutes. My verified deal price is set to zero and my unverified price is 0.000000012.

Assumption: Storage providers, Filecoin, and clients all require maximum efficiency from the deal-making pipeline, so that deals can be ingested as affordably as possible at near hardware and network capacity. While offline deals can be planned, online deals are very unpredictable; this calls for mechanisms that help client systems manage deal flow, so that they send valid deals in a way that will not fail, time out, or be rejected for other reasons.

Problem #1 - Batch published deals (or PreCommit or Commit batches)

Problem #2 - Deal throttling required

As I monitored the market node last night, within just 6 hours I saw more than 100 fstmp* files in my TMPDIR and more than 60 transfers showing active under "Receiving Channels" when running `lotus-miner data-transfers list`. Under the best of conditions that deal load would take about 12 hours to process and seal (my `MaxSealingSectorsForDeals` setting is 18). If this deal load continued without stopping, it would amount to 48 sectors per day, more than double my sealing capacity. The result would be a large backlog of downloaded deals that would ultimately expire and fail due to start-epoch math (a worked example follows below). (In this case, a memory leak caused the market node to crash as the deal load grew and grew, or after numerous restarts as the deal processes tried to catch up.) I deleted all fstmp files and lowered various parameters to restrict deal flow. For instance, CID-Gravity allows me to limit the number of deals (not bytes) per client per hour; however, it currently does not allow me to limit deal flow across all clients in aggregate.
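To make that start-epoch math concrete, here is a minimal sketch in Go (not lotus code). The sealing throughput and inflow rate come from the comment above; the 14-day start window is an assumed figure, since the real deadline depends on each deal's start epoch. It shows how sustained inflow above sealing capacity eventually pushes queued deals past their start epochs:

```go
package main

import "fmt"

func main() {
	// Sealing and inflow figures from the comment above;
	// the start window is an assumed example value.
	const (
		sectorGiB     = 32.0
		sealGiBPerDay = 768.0 // observed sealing throughput
		inflowPerDay  = 48.0  // sectors/day of incoming deals seen overnight
		startWindow   = 14.0  // assumed days between deal acceptance and start epoch
	)

	sealPerDay := sealGiBPerDay / sectorGiB // 24 sectors/day of capacity
	if inflowPerDay <= sealPerDay {
		fmt.Println("no backlog grows; deals seal before their start epochs")
		return
	}

	// With FIFO processing, a deal accepted on day t waits behind roughly
	// inflowPerDay*t sectors and is sealed around day inflowPerDay*t/sealPerDay.
	// It misses its start epoch once the extra wait exceeds the start window:
	//   t*(inflowPerDay/sealPerDay - 1) > startWindow
	daysUntilFailures := startWindow * sealPerDay / (inflowPerDay - sealPerDay)
	fmt.Printf("backlog grows %.0f sectors/day; deals accepted after day %.1f expire before sealing\n",
		inflowPerDay-sealPerDay, daysUntilFailures)
}
```

Under these assumptions, with inflow at double the sealing rate, every deal accepted after roughly day 14 is already past its start epoch by the time the sealer reaches it, which matches the "expire and fail" behaviour described above.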
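On the lotus side, a handful of existing config options bear on this throttling problem. The excerpt below is only a sketch of a miner config.toml: option names, sections, and defaults vary across lotus versions, and the values shown are illustrative, not recommendations:

```toml
# Illustrative excerpts from a miner config.toml; option names and
# availability depend on the lotus version you run.

[Dealmaking]
  # Cap concurrent storage-deal data transfers (the "Receiving Channels"
  # shown by `lotus-miner data-transfers list`).
  SimultaneousTransfers = 20
  # Batch deals into fewer PublishStorageDeals messages.
  PublishMsgPeriod = "1h0m0s"
  MaxDealsPerPublishMsg = 8

[Sealing]
  # Upper bound on sectors sealing deals at once (the comment above uses 18).
  MaxSealingSectorsForDeals = 18
```

Note that none of these cap aggregate deal inflow (e.g. in bytes or sectors per day) across all clients, which is the gap the comment above points at; they only bound concurrency and publish batching.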
---
From @SBudo in #6861 (reply in thread): I totally agree with all the points made above.
---
@jennijuju I have put in a request on lotus-farcaster for @s0nik42 to take deal monitoring to the next level. While we need handles in lotus to control the inflow of deals in a more granular way, to ensure efficiency and avoid overload, the monitoring itself should probably live somewhere else. I would like to push for farcaster to be upgraded so it can give overviews that are simply not possible with CLI commands. I put it in here: I know this does not solve the core of the problem experienced by @stuberman, but it could offload the part that does not make sense to try to do in a CLI.
---
This is a discussion for collecting the current pain points in deal-making for storage providers using lotus, and for brainstorming ideas on how to make improvements.
Related thread #6861 (comment)