
Provide simple option for limiting total memory usage #1268

Closed
gonzojive opened this issue Mar 21, 2020 · 17 comments
Labels
area/documentation Documentation related issues. kind/enhancement Something could be better. status/accepted We accept to investigate or work on it.

Comments

@gonzojive

What version of Go are you using (go version)?

$ go version
go version go1.14 linux/amd64

What version of Badger are you using?

v1.6.0

Does this issue reproduce with the latest master?

As far as I know, yes.

What are the hardware specifications of the machine (RAM, OS, Disk)?

32 GB RAM
AMD Ryzen 9 3900X 12-core, 24-Thread
1 TB Samsung SSD

What did you do?

Used the default options to populate a table with about 1000 key/val pairs where each value is roughly 30MB.

The badger database directory is 101GB according to du. There are 84 .vlog files.

When I start my server up, it quickly consumes 10 GB of RAM and dies due to OOM. dmesg output:

[654397.093709] Out of memory: Killed process 15281 (taskserver) total-vm:20565228kB, anon-rss:12610116kB, file-rss:0kB, shmem-rss:0kB
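For reference, the write path was essentially the following (a minimal sketch; the path, keys, and value generation are illustrative, using the v1.6 API):

package main

import (
	"crypto/rand"
	"fmt"
	"log"

	"github.com/dgraph-io/badger"
)

func main() {
	// Open with default options, as in the report above.
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger-repro"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Write ~1000 values of roughly 30 MB each.
	val := make([]byte, 30<<20)
	if _, err := rand.Read(val); err != nil {
		log.Fatal(err)
	}
	for i := 0; i < 1000; i++ {
		key := []byte(fmt.Sprintf("key-%04d", i))
		if err := db.Update(func(txn *badger.Txn) error {
			return txn.Set(key, val)
		}); err != nil {
			log.Fatal(err)
		}
	}
}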

What did you expect to see?

I would expect the database to provide a simple option to limit memory usage to an approximate cap.

What did you see instead?

  1. The recommended mechanism of tweaking a many-dimensional parameter space is confusing and hasn't worked for me.

  2. The memory-related parameters are not explained in much detail. For example, the docstring for options.MemoryMap doesn't indicate roughly how expensive MemoryMap is compared to FileIO.

  3. I haven't managed to successfully reduce memory usage using the following parameters:

func opts(dbPath string) badger.Options {
	return badger.DefaultOptions(dbPath).
		WithValueLogLoadingMode(options.FileIO).
		WithTableLoadingMode(options.FileIO).
		WithNumMemtables(1)
}

I can create an example program if the issue is of interest.

@jarifibrahim jarifibrahim added the area/documentation Documentation related issues. label Mar 22, 2020
@Kleissner

Kleissner commented Mar 26, 2020

Agree. We use it for a simple key-value lookup with a couple of billion records (database directory is 700 GB).

It uses about 200 GB of RAM, which is unacceptable. The culprits are memory-mapped files, according to Process Explorer.
Good thing we have a lot of RAM, but there should be an easy, well-defined way to set a maximum memory limit.

@gonzojive
Author

I am attempting to restore from backup and running out of memory. Here's a memory profile:
[memory profile screenshot: memprofile25s]

latest options:

func badgerOpts(dbPath string) badger.Options {
	return badger.DefaultOptions(dbPath).
		WithValueLogLoadingMode(options.FileIO).
		WithTableLoadingMode(options.FileIO).
		WithNumMemtables(1).
		WithCompression(options.Snappy).
		WithKeepL0InMemory(false).WithLogger(&gLogger{})
}

@jarifibrahim
Contributor

@gonzojive I will have a look at the high memory usage. What is the size of your data directory and the size of the backup file?

@gonzojive
Author

gonzojive commented Mar 26, 2020

The backup file is 100GB (99,969,891,936 bytes)

In this case, backup.go's Load function seems to be a major offender. It does not account for the size of the values at all. Added logging shows huge key/value accumulation and no flushing:

I0326 09:16:43.884962    5195 taskstorage.go:147] not flushing with 1269 entries, 73.2K key size, 3.2G combined size, 9.6M limit

I'm guessing there are many places where value size is not accounted for when making memory management decisions.

@gonzojive
Author

gonzojive commented Mar 26, 2020

I modified backup.go to flush when accumulated key + value size exceeds 100 MB. I can send a pull request for this at some point.
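Roughly, the change looks like the sketch below (the threshold constant and helper names are illustrative, not the actual backup.go code):

// flushThresholdBytes is illustrative; the change hard-codes a similar value
// rather than exposing a configurable option.
const flushThresholdBytes = 100 << 20 // 100 MB

// loadAndFlush sketches the accumulate-then-flush loop: entries are buffered
// until their combined key+value size crosses the byte threshold, then the
// batch is written out and the counters reset.
func loadAndFlush(entries []entry, flush func([]entry) error) error {
	var batch []entry
	var batchBytes int
	for _, e := range entries {
		batch = append(batch, e)
		batchBytes += len(e.key) + len(e.value) // count values, not just keys
		if batchBytes >= flushThresholdBytes {
			if err := flush(batch); err != nil {
				return err
			}
			batch, batchBytes = nil, 0
		}
	}
	if len(batch) > 0 {
		return flush(batch)
	}
	return nil
}

type entry struct{ key, value []byte }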

Before backup.go modifications the process consumes memory until the OS kills it:

[screenshot from 2020-03-26 09-32-01: memory climbs until the OS kills the process]

After:
(it's basically flat at 21 GB), but I didn't manage to grab a screenshot because of Ubuntu/GNOME flakiness.

When I set the threshold to 500 MB instead of 100 MB, memory usage still causes a crash for some reason.

gonzojive added a commit to gonzojive/badger that referenced this issue Mar 27, 2020
The previous behavior only accounted for key size. For databases where keys are
small (e.g., URLs) and values are much larger (megabytes), OOM errors were easy
to trigger during restore operations.

This change does not set the threshold used for flushing elegantly: a const is
used instead of a configurable option.

Related to dgraph-io#1268, but not a full fix.
@stale

stale bot commented Apr 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/stale The issue hasn't had activity for a while and it's marked for closing. label Apr 25, 2020
@gonzojive
Author

Although I have a fix for the backup restoration issue, this issue as a whole has not been addressed.

I'm not aware of what causes badger to take up the amount of memory that it does. Understanding that seems like the first step towards introducing a flag for setting a fixed memory limit. Could someone from the badger team weigh in?

@stale stale bot removed the status/stale The issue hasn't had activity for a while and it's marked for closing. label Apr 25, 2020
@jarifibrahim
Contributor

I'm not aware of what causes badger to take up the amount of memory that it does. Understanding that seems like the first step towards introducing a flag for setting a fixed memory limit. Could someone from the badger team weigh in?

The amount of memory being used depends on your DB options. For instance, each table has a bloom filter and these bloom filters are kept in memory. Each bloom filter takes up 5 MB of memory. So if you have 100 GB of data, that means you have (100 * 1000 / 64) = 1562 tables, and 1562 * 5 MB is about 7.8 GB. So your bloom filters alone would take up 7.8 GB of memory. We have a separate cache in badger v2 to reduce the memory used by bloom filters.

Another thing that might affect memory usage is the table loading mode. If you set the table loading mode to FileIO, memory usage should drop, but your reads will be very slow.
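To make the arithmetic above concrete, here is the same estimate as a tiny helper (the ~64 MB-per-table and ~5 MB-per-bloom-filter figures are the assumptions stated above):

// bloomFilterMemoryMB estimates bloom-filter memory for a given data size,
// using the figures from the comment above: roughly one table per 64 MB of
// data and roughly 5 MB of in-memory bloom filter per table.
func bloomFilterMemoryMB(dataGB int) int {
	tables := dataGB * 1000 / 64 // 100 GB of data => ~1562 tables
	return tables * 5            // ~1562 tables * 5 MB => ~7810 MB (~7.8 GB)
}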

@stale

stale bot commented May 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/stale The issue hasn't had activity for a while and it's marked for closing. label May 27, 2020
@jarifibrahim jarifibrahim added the status/accepted We accept to investigate or work on it. label May 27, 2020
@stale stale bot removed the status/stale The issue hasn't had activity for a while and it's marked for closing. label May 27, 2020
@jarifibrahim jarifibrahim added the kind/enhancement Something could be better. label May 27, 2020
@gonzojive
Author

Perhaps something else to keep in mind when tracking down memory-hogging issues: the Go memory profile doesn't seem to capture the full extent of memory usage.

Here is a screenshot that shows the system's accounting (12.7 GB) vs Go's accounting (84.34 MB).
[screenshot: system vs. Go memory accounting]

@gonzojive
Author

gonzojive commented May 30, 2020

Here are the runtime.Memstats for a similar process to the screenshot above.

edit: It could be that the OS is not reclaiming memory freed by Go as discussed in this bug: golang/go#14521. However, I'm not sure how to confirm this. Badger also makes low-level system calls that might not be tracked by the above memory profiles (mmap).

# runtime.MemStats
# Alloc = 8994773352
# TotalAlloc = 142559328392
# Sys = 19054750096
# Lookups = 0
# Mallocs = 173259
# Frees = 161722
# HeapAlloc = 8994773352
# HeapSys = 18450841600
# HeapIdle = 9454788608
# HeapInuse = 8996052992
# HeapReleased = 3498221568
# HeapObjects = 11537
# Stack = 4063232 / 4063232
# MSpan = 135320 / 180224
# MCache = 8680 / 49152
# BuckHashSys = 1528578
# GCSys = 596795800
# OtherSys = 1291510
# NextGC = 4058104544
# LastGC = 1590858449129344940
# PauseNs = [720764 12044 12475 7990262 7809221 10238422 11080910 26676777 11483433 50231078 15615060 6761507 9593387 21339990 30935701 48671278 33504211 28017768 14602732 6955500 39479912 9759023 76460498 15589806 25668442 15236919 15399 8833874 12794857 58970453 10943138 13950377 16293396 32542175 8410993 13622020 12043529 32008031 12635226 15306547 27373405 13418150 23685828 68681901 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# PauseEnd = [1590858409431780297 1590858409480216799 1590858414430519349 1590858419259323466 1590858419275574116 1590858419335061131 1590858419470989711 1590858419820272840 1590858419912902151 1590858420513245024 1590858422213604632 1590858423000299952 1590858424082382764 1590858424291966196 1590858424705682993 1590858425123361202 1590858425954499123 1590858427235619883 1590858427997664669 1590858429465166747 1590858429700411478 1590858429977301044 1590858430767177012 1590858432254219118 1590858432726467896 1590858434046992645 1590858434430874744 1590858434750260245 1590858434836953495 1590858435420921176 1590858436777906262 1590858437837892960 1590858438434330473 1590858439224690949 1590858439618659050 1590858440508708633 1590858441142277899 1590858442053297406 1590858443890553903 1590858444739211063 1590858446473432441 1590858447066636466 1590858447807950895 1590858449129344940 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# NumGC = 44
# NumForcedGC = 7
# GCCPUFraction = 0.034290922233429534
# DebugGC = false

@gonzojive gonzojive changed the title Reducing memory usage is complicated Provide simple option for limiting total memory usage May 30, 2020
@gonzojive
Author

On the other hand, sometimes memory usage is quite high and there is a lot of allocation activity.

I can't find a way to force the OS to reclaim the memory freed by Go, which seems to use MADV_FREE on recent Linux versions (https://golang.org/src/runtime/mem_linux.go). It would be helpful to force the OS to reclaim such memory to get a more accurate picture of what's going on.

in use: [profile screenshot]

allocs: [profile screenshot]

@gonzojive
Author

gonzojive commented May 31, 2020

In my case, it would help if PrefetchValues had an option to restrict prefetches based on value byte size, not number of values. Perhaps IteratorOptions could become something like:

// IteratorOptions is used to set options when iterating over Badger key-value
// stores.
//
// This package provides DefaultIteratorOptions which contains options that
// should work for most applications. Consider using that as a starting point
// before customizing it for your own needs.
type IteratorOptions struct {
	// Indicates whether we should prefetch values during iteration and store them.
	PrefetchValues bool
	// How many KV pairs to prefetch while iterating. Valid only if PrefetchValues is true.
	PrefetchSize int
	// If non-zero, specifies the maximum number of bytes to prefetch while
	// prefetching iterator values. This will overrule the PrefetchSize option
	// if the values fetched exceed the configured value.
	PrefetchBytesSize int
	Reverse           bool // Direction of iteration. False is forward, true is backward.
	AllVersions       bool // Fetch all valid versions of the same key.

	// The following option is used to narrow down the SSTables that iterator picks up. If
	// Prefix is specified, only tables which could have this prefix are picked based on their range
	// of keys.
	Prefix      []byte // Only iterate over this given prefix.
	prefixIsKey bool   // If set, use the prefix for bloom filter lookup.

	InternalAccess bool // Used to allow internal access to badger keys.
}

Even better would be a database-wide object for restricting memory use to a strict cap.
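To illustrate the intent, usage might look something like this (hypothetical only; PrefetchBytesSize is the field proposed above and does not exist in the released badger API):

// Hypothetical sketch: PrefetchBytesSize is the proposed field from the
// struct above, not part of the released badger API.
func iterateWithByteCap(db *badger.DB) error {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = true
	opts.PrefetchSize = 100            // still capped by entry count...
	opts.PrefetchBytesSize = 256 << 20 // ...and, under the proposal, by total value bytes

	return db.View(func(txn *badger.Txn) error {
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			// process it.Item() here
		}
		return nil
	})
}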

@jarifibrahim
Contributor

jarifibrahim commented Jun 1, 2020

@gonzojive How big are your values? The memory profile you shared shows that y.Slice was holding 15 GB of data. That's unusual unless you have a big value.

I can't find a way to force the OS to reclaim the memory freed by Go, which seems to use MADV_FREE on recent Linux versions (https://golang.org/src/runtime/mem_linux.go). It would be helpful to force the OS to reclaim such memory to get a more accurate picture of what's going on.

debug.FreeOSMemory() (https://golang.org/pkg/runtime/debug/#FreeOSMemory) is what you're looking for.

From https://golang.org/pkg/runtime/,

    // HeapIdle minus HeapReleased estimates the amount of memory
    // that could be returned to the OS, but is being retained by
    // the runtime so it can grow the heap without requesting more
    // memory from the OS. If this difference is significantly
    // larger than the heap size, it indicates there was a recent
    // transient spike in live heap size.
    HeapIdle uint64

So HeapIdle - HeapReleased in your case is

>>> (9454788608-3498221568) >> 20
5680

which is about 5.6 GB. That's the amount of memory the Go runtime is holding on to.
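For completeness, a small sketch of reading those numbers at runtime and asking the runtime to hand memory back to the OS:

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	// Heap memory the Go runtime is retaining but could return to the OS.
	fmt.Printf("retained: %d MB\n", (ms.HeapIdle-ms.HeapReleased)>>20)

	// Ask the runtime to return as much memory to the OS as possible.
	debug.FreeOSMemory()

	runtime.ReadMemStats(&ms)
	fmt.Printf("after FreeOSMemory: %d MB retained\n", (ms.HeapIdle-ms.HeapReleased)>>20)
}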

@gonzojive
Author

In this case, the values are 25 MB or more. The memory usage was from prefetching 100 values for each request, and many requests are run in parallel. Limiting prefetching fixed the specific issue I was having, but the general feature request remains open.
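For anyone hitting the same problem, the workaround amounts to shrinking the prefetch window with the existing options, e.g. (numbers illustrative):

// With ~25 MB values, the default PrefetchSize of 100 can pin gigabytes per
// iterator across parallel requests; a small prefetch window bounds that.
func newSmallPrefetchIterator(txn *badger.Txn) *badger.Iterator {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = true
	opts.PrefetchSize = 4 // instead of the default 100; tune to value size
	return txn.NewIterator(opts)
}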

@jarifibrahim
Contributor

Ah, that makes sense. Thanks for debugging it, @gonzojive. The feature request still remains open.

@minhaj-shakeel

GitHub issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.

