
Make allocator configurable #264

Open
webmaster128 opened this issue Oct 21, 2021 · 15 comments

@webmaster128
Member

There are reports of memory consumption increasing over time, a problem that disappears when a different allocator is used. The problem also disappears when using memory tracking tools like https://github.com/iovisor/bcc/blob/master/tools/memleak.py

https://stackoverflow.com/questions/59008623/memory-leak-disappears-when-using-valgrind discusses cases where this can happen.

Terra reports that the use of tcmalloc or jemalloc avoids the problem.

It can be fixed by using the jemallocator crate, as shown in v0.16.1...terra-money:release/v0.16.1.

It would be great to upstream that fix, at least as an opt-in feature. We should also better understand why it helps.
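A minimal sketch of what such an opt-in feature could look like, assuming a Cargo feature named `jemalloc` (the feature name and version are hypothetical) that pulls in the jemallocator crate. The default build would keep the system allocator:

```rust
// In Cargo.toml (sketch):
//   [dependencies]
//   jemallocator = { version = "0.3", optional = true }
//
//   [features]
//   jemalloc = ["jemallocator"]

// In lib.rs: swap the global allocator for jemalloc only when the
// feature is enabled, so default builds are unchanged.
#[cfg(feature = "jemalloc")]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
```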

@maurolacy
Contributor

maurolacy commented Oct 21, 2021

This could be a case of heap fragmentation, as mentioned by user phd in the comments of the SO entry linked above. Memory fragmentation should impact performance too, which could be one way to detect it.

Changing the allocator solves the problem because the new allocator is better at dealing with memory fragmentation. Using a memory tracking tool probably forces the process to behave better in terms of allocations, and/or to release previously allocated chunks more often.

See https://engineering.linkedin.com/blog/2021/taming-memory-fragmentation-in-venice-with-jemalloc.

@webmaster128
Member Author

> See https://engineering.linkedin.com/blog/2021/taming-memory-fragmentation-in-venice-with-jemalloc.

Nice read, thank you. So the reason might be that memory allocations vary too much in size when caching is enabled, making it hard to keep fragmentation low.

@maurolacy
Contributor

Yes. Small regions are allocated in larger free areas, and then new blocks are needed for big regions, and so on.

@yun-yeo

yun-yeo commented Oct 30, 2021

An interesting factor we found: when we use a 50–100 MB cache size, the memory is definitely stable.

https://github.com/terra-money/core/issues/592#issuecomment-955711739

@ethanfrey
Member

In addition to configuring the allocator (which seems like a useful enhancement even if we improve the cache memory usage), I think it would be good to reduce the cache memory leak potential.

My first idea is to occasionally refresh pinned contracts. By that, I mean every N runs (say 100) the module is dropped from the memory cache (freeing all related memory) and a new one is rebuilt from an artifact loaded from the disk cache. This could be applied to all contracts in the cache, both pinned and LRU: like a max TTL, starting at 100 when the module is added and decreasing on each use until it gets dropped and reloaded from disk.

Thinking of gas cost... if loading an instance from disk costs 60k gas, a TTL of 100 means 600 gas amortised, which is quite low. Not 0, but less work than reading one value from state, so quite a win for performance. If that is what it takes to keep memory usage stable under long-time heavy use, it seems a decent compromise.

(This proposes a separate change, but it is related to the finding that triggered this issue.)

@hanjukim

We found that there is a significant difference between using the shared and the static library. When using the static library (libwasmvm_muslc.a), memory leakage occurs noticeably.

@ethanfrey
Member

Is this due to the different default memory allocators of the glibc (shared) and Alpine/musl (static) builds?

Curious if this still occurs if jemalloc is explicitly set for both builds.

@webmaster128
Member Author

webmaster128 commented Dec 28, 2021

According to the article, the default allocator is implemented in the C standard library. This means different behaviour between the shared lib (glibc) and the static lib (musl libc) is expected.

> When using a static library (libwasmvm_muslc.a), memory leakage occurs noticeably.

So you are saying the memory growth (leak or fragmentation) with the static lib is worse than with the shared lib?

> Curious if this still occurs if jemalloc is explicitly set for both builds.

True, we need 4 builds for shared/static × default allocator/jemalloc.

@hanjukim

hanjukim commented Dec 28, 2021

> Is this due to default memory allocator between glibc (shared) builds and alpine (static) builds?
>
> Curious if this still occurs if jemalloc is explicitly set for both builds.

With both the static and shared libraries plus the jemalloc patch (4 versions), I haven't witnessed memory leakage so far (ran it for about 6 hours).

@hanjukim

hanjukim commented Dec 28, 2021

> So you are saying the memory growth (leak or fragmentation) using the static lib is worse than the shared lib?

Yes, that's what I found. I could see a significant difference on a node with a huge amount of inbound Wasm queries.

@ethanfrey
Member

My vote is to make the allocator configurable and possibly make jemalloc the default now, using the default allocator just in tests to hunt down the true cause.

@webmaster128
Member Author

I'm concerned about gnzlbg/jemallocator#173 and thus hesitate to make jemallocator the default right now. Let's see if there is some progress within a few weeks.

@ethanfrey
Member

ethanfrey commented Dec 28, 2021

Or some volunteers from the ecosystem to maintain it... (not Confio)

@maurolacy
Contributor

> Or some volunteers from the ecosystem to maintain it... (not Confio)

I think it's an interesting project to maintain. But yes, significantly out of Confio's focus.

@edjroz

edjroz commented Apr 5, 2022

There are multiple community-maintained jemalloc forks.

As noted in a comment on gnzlbg/jemallocator#173, rustc has moved to https://github.com/tikv/jemallocator:

rust-lang/rust#83152

It should be possible to set this fork as the default.
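If the opt-in feature lands, switching to the maintained fork would presumably just mean pointing the dependency at it. A sketch, assuming the same hypothetical `jemalloc` feature name (the version number is illustrative):

```rust
// In Cargo.toml (sketch):
//   [dependencies]
//   tikv-jemallocator = { version = "0.4", optional = true }
//
//   [features]
//   jemalloc = ["tikv-jemallocator"]

// In lib.rs: same wiring as with the original jemallocator crate, only
// the crate (and thus the type path) changes.
#[cfg(feature = "jemalloc")]
#[global_allocator]
static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
```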

6 participants