
RAM usage grows until it "hits" some limit #113

Closed

johan-bjareholt opened this issue Apr 3, 2020 · 13 comments

Comments

@johan-bjareholt
Member

When aw-server-rust initially starts, its RAM usage is very low (9MB resident on my machine, less than the 30-40MB of aw-watcher-afk/aw-watcher-window), but after a while it grows to a bit above 100MB and stays there.

To reproduce: start aw-server-rust and see it use 9MB of RAM, then go to the web-ui activity view and click "refresh" a few times until you see aw-server-rust grow in RAM usage. For me it stopped growing at around 113MB of RAM usage.

@johan-bjareholt
Member Author

If I look at the "Year" activity view (which is disabled in the web-ui by default) and refresh it a few times, it grows to around 1.1GB of RAM usage and stops there...

@johan-bjareholt
Member Author

johan-bjareholt commented Apr 3, 2020

Seems to be SQLite doing some caching; it looks possible to configure it to cache less in memory with different PRAGMAs (see the sketch after the trace below).

==659099== 2,105,376 bytes in 482 blocks are possibly lost in loss record 285 of 285
==659099==    at 0x483977F: malloc (vg_replace_malloc.c:309)
==659099==    by 0x3FF26A: sqlite3MemMalloc (sqlite3.c:23057)
==659099==    by 0x3FFB61: mallocWithAlarm (sqlite3.c:26889)
==659099==    by 0x3FFC39: sqlite3Malloc (sqlite3.c:26919)
==659099==    by 0x40EAA7: pcache1Alloc (sqlite3.c:49204)
==659099==    by 0x40ECEA: pcache1AllocPage (sqlite3.c:49301)
==659099==    by 0x40F7CB: pcache1FetchStage2 (sqlite3.c:49768)
==659099==    by 0x40F932: pcache1FetchNoMutex (sqlite3.c:49872)
==659099==    by 0x40F97B: pcache1FetchWithMutex (sqlite3.c:49887)
==659099==    by 0x40F9D4: pcache1Fetch (sqlite3.c:49910)
==659099==    by 0x40DBF5: sqlite3PcacheFetch (sqlite3.c:48353)
==659099==    by 0x4155DF: getPageNormal (sqlite3.c:56269)
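
For reference, the page cache those pcache1 allocations come from can be bounded with a PRAGMA. A minimal sketch, assuming a rusqlite-style connection (the actual connection setup in aw-server-rust may differ):

use rusqlite::Connection;

fn open_with_bounded_cache(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    // A negative cache_size is interpreted as a limit in KiB rather
    // than pages, so this caps SQLite's page cache at roughly 2 MiB.
    conn.execute_batch("PRAGMA cache_size = -2048;")?;
    Ok(conn)
}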

@johan-bjareholt
Member Author

I don't think it's SQLite anymore; that was only a possible loss of 2MB, which is nothing.
Seems to be something that happens when cloning Events.

@johan-bjareholt
Member Author

This is likely the issue:

https://doc.rust-lang.org/nomicon/leaking.html#drain
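
For context, the leak described there occurs when a Drain iterator is leaked rather than dropped; a minimal sketch of the failure mode from that chapter:

use std::mem;

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    // drain() sets the Vec's length to 0 up front; Drain's destructor
    // is what normally drops the drained elements.
    let d = v.drain(..);
    // Leaking the Drain means that destructor never runs, so the two
    // Strings are never freed even though v now looks empty.
    mem::forget(d);
    assert_eq!(v.len(), 0);
}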

@ErikBjare
Member

ErikBjare commented Apr 16, 2020

I've read that and think I somewhat understand the issue, but what would a fix look like in our case?

@johan-bjareholt
Member Author

johan-bjareholt commented Apr 16, 2020

Alright, I just realized that it's not drain either.

I made this example program to reproduce the issue:

use std::io::Write;

trait New {
    fn new() -> Self;
}

// Element type with no heap allocation of its own; the whole Vec is
// one contiguous allocation.
#[derive(Clone)]
struct StackType {
    num: i64,
}

impl New for StackType {
    fn new() -> Self {
        Self { num: 0 }
    }
}

// Element type that owns a separate heap allocation per element.
#[derive(Clone)]
struct HeapType {
    num: Box<i64>,
}

impl New for HeapType {
    fn new() -> Self {
        Self { num: Box::new(0) }
    }
}

const N: usize = 10_000_000;

fn alloc<T: New + Clone>() {
    // The Vec is dropped as soon as this function returns.
    let _vec: Vec<T> = vec![T::new(); N];
}

pub fn wait_for_enter(prompt: &str) {
    print!("{}", prompt);
    std::io::stdout().flush().unwrap();
    let mut input = String::new();
    std::io::stdin().read_line(&mut input).unwrap();
}

fn main() {
    println!("Alloc stack");
    alloc::<StackType>();
    wait_for_enter("Dropped vec with stack allocated variables, press enter to continue");
    println!("Alloc heap");
    alloc::<HeapType>();
    wait_for_enter("Dropped vec with heap allocated variables, press enter to quit");
}

The memory for the stack-only elements is freed back to the OS instantly, while the memory from the per-element heap allocations is not returned.
Apparently this is due to how malloc in glibc works:

rust-lang/rust#60114

Not sure what we should do about it though, or if we should do anything about it at all.
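
One glibc-specific escape hatch that comes up in that context is malloc_trim, which asks the allocator to hand free arena memory back to the kernel. A hedged sketch, assuming the libc crate (glibc-only; not something adopted here):

#[cfg(all(target_os = "linux", target_env = "gnu"))]
fn release_free_heap_memory() {
    // glibc-specific: return as much free heap memory as possible to
    // the kernel. A pad of 0 trims as aggressively as possible.
    unsafe {
        libc::malloc_trim(0);
    }
}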

@ErikBjare
Member

Shouldn't be a problem then? And it looks like we can't really do anything about it anyway.

@johan-bjareholt
Member Author

It would be interesting to see how Windows and macOS handle this.

@johan-bjareholt
Member Author

johan-bjareholt commented May 23, 2020

#118 seems to improve it slightly and #119 would likely pretty much fix it.

It seems we are simply allocating and deallocating so much memory in a short period of time when running queries that the allocator decides it's a good idea not to release the memory back to the kernel. Reducing the amount of memory copying seems to improve things, as illustrated below.
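
To illustrate the kind of copy reduction involved (hypothetical Event type and function names, not the actual changes in #118/#119):

use std::sync::Arc;

#[derive(Clone)]
struct Event {
    data: String,
}

// Before: every caller receives its own deep copy of all events,
// producing the alloc/dealloc churn described above.
fn events_cloned(events: &[Event]) -> Vec<Event> {
    events.to_vec()
}

// After: callers share a single allocation; cloning the Arc is just
// a reference-count increment.
fn events_shared(events: &Arc<Vec<Event>>) -> Arc<Vec<Event>> {
    Arc::clone(events)
}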

@johan-bjareholt
Member Author

johan-bjareholt commented Jul 5, 2020

Another way to solve this is to use jemalloc (the default allocator for Rust back in 2018) instead of malloc (the default for Rust today in 2020).

It works a bit differently: it grows quickly to around 200-300MB when I run many month-long queries, but after that it doesn't grow anymore and releases the memory back to the kernel.
That it reaches 200-300MB is probably down to how many events I'm querying, and since I'm not on the computer I spend the most time on, I'm sure it will go higher on databases with more events in a month.

Maybe we should do that as a temporary solution until the root cause is fixed? To do that we only need to add the jemallocator dependency and three lines of code.
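
Roughly, those three lines would look like this (a sketch, assuming the jemallocator crate is added to Cargo.toml):

use jemallocator::Jemalloc;

// Route all Rust heap allocations through jemalloc instead of the
// system allocator.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;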

@ErikBjare
Member

@johan-bjareholt Sounds good to me.

If you are consistently measuring significantly improved memory management with jemalloc, then it sounds like a good idea in general.

@johan-bjareholt
Member Author

Well, that didn't go as planned...

#144

warning: "jemalloc support for x86_64-pc-windows-msvc is untested"

We could test whether the slow freeing of memory also occurs on Windows; if it doesn't, we could enable jemalloc only on Linux. I'd like to avoid platform-specific dependencies, but I still think it's worth it.

@johan-bjareholt
Member Author

I merged it only for Linux for now. #144
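
Gated to Linux, the override looks roughly like this (a sketch; see #144 for the actual change, with jemallocator declared as a target-specific dependency in Cargo.toml):

// Only override the global allocator on Linux, where glibc's malloc
// shows the retention behaviour described above.
#[cfg(target_os = "linux")]
#[global_allocator]
static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;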

We'll see if it's an issue just for Linux or for the default allocators in Windows/macOS as well.

I'll close this issue and create a new one for simply lowering the RAM usage.
