What are the best practices for ord as a service? #3967

Open
ArthurQiuys opened this issue Sep 29, 2024 · 9 comments

Comments

@ArthurQiuys

In case of errors, database files often get corrupted and need to be re-indexed.

@raphjaph
Collaborator

We use systemd; you can have a look at our service files in the deploy directory. Just make sure to shut down ord properly before attempting to move or rename the index.redb file.

@emilcondrea
Contributor

I can confirm that it crashes and that recovery takes a very long time. The weird thing is that even if it's at the tip, doing nothing but querying the bitcoin node, it corrupts the db if it crashes.

I am wondering if it's related to how the index is opened even when there is nothing to do.

@raphjaph
Collaborator

raphjaph commented Oct 1, 2024

How does it crash? How are you shutting it down? Or does it just randomly crash?

@emilcondrea
Contributor

It's not the shutdown scenario. If I recall correctly, it crashed because of OOM, which definitely means memory settings need to be tweaked on my end. But the point I wanted to raise is that the indexer should not corrupt the database if it's just no-op-ing and querying the bitcoin node.

@ArthurQiuys
Author

What is the appropriate memory size to set? I want to know how many concurrent queries an ord with 32g memory can handle.

@raphjaph
Collaborator

raphjaph commented Oct 2, 2024

> What is the appropriate memory size to set? I want to know how many concurrent queries an ord with 32g memory can handle.

Only the initial indexing takes a lot of memory. Once the index is built, it can handle quite a lot of requests. We have a server that handles 5 million requests per day and normally sits at less than 5% CPU usage and about 70% RAM usage with 128 GB of RAM.
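
To put those figures in perspective, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and the results are plain averages that say nothing about peak load or about how a 32 GB machine would behave.

```rust
/// Average requests per second for a given number of requests per day.
pub fn avg_requests_per_second(requests_per_day: f64) -> f64 {
    // 24 h * 60 min * 60 s = 86,400 seconds in a day.
    requests_per_day / 86_400.0
}

/// Memory in use, in GB, given total RAM in GB and a utilisation fraction (0.0 to 1.0).
pub fn memory_in_use_gb(total_gb: f64, fraction: f64) -> f64 {
    total_gb * fraction
}
```

With the numbers quoted above, `avg_requests_per_second(5_000_000.0)` comes out at roughly 58 requests per second, and `memory_in_use_gb(128.0, 0.70)` at roughly 90 GB.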

@raphjaph
Collaborator

raphjaph commented Oct 2, 2024

> It's not the shutdown scenario. If I recall correctly, it crashed because of OOM, which definitely means memory settings need to be tweaked on my end. But the point I wanted to raise is that the indexer should not corrupt the database if it's just no-op-ing and querying the bitcoin node.

To build the full index with --index-sats you need at least 64 GB of memory. Once the initial indexing is done, corruption is very unlikely, since it can only happen while flushing the cache to disk, and in idle mode (following the chain tip) that flush takes less than a second.

@ArthurQiuys
Author

> > What is the appropriate memory size to set? I want to know how many concurrent queries an ord with 32g memory can handle.
>
> Only the initial indexing takes a lot of memory. Once the index is built, it can handle quite a lot of requests. We have a server that handles 5 million requests per day and normally sits at less than 5% CPU usage and about 70% RAM usage with 128 GB of RAM.

Can systemd automatically restart ord after unexpected failures such as OOM? Currently, the db cannot be recovered after the unexpected exits we have encountered and can only be re-indexed.

@victorkirov
Contributor

Regarding @emilcondrea's point about the index being corrupted: this has happened to us a few times. The issue is that the index updater requires a write transaction to run the update function:

pub(crate) fn update_index(&mut self, mut wtx: WriteTransaction) -> Result

This is required even if the ord index is already at the tip. Since the updater runs quite often, a critical crash of any kind will corrupt the index because that write transaction is open.

Instead, the updater could read the indexed tip height from the index and the current tip height from the bitcoin node (plus the block hashes, to check for a reorg) before it opens the write transaction. If the index is already at the tip and no reorg has occurred, the loop can simply continue without ever opening the write transaction and risking index corruption.
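
A minimal sketch of that check, not ord's actual code: the types `Tip` and `Action` and the function `decide` are hypothetical names, and the block hash is a plain string to keep the example dependency-free. It only shows the decision that could be made from read-only data before any write transaction is opened.

```rust
/// Height and block hash of the tip recorded in the index (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tip {
    pub height: u32,
    pub hash: String,
}

/// What the update loop should do on this iteration (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Index and node agree and there are no new blocks: no write transaction needed.
    Skip,
    /// The node has blocks beyond the indexed tip (or the index is empty).
    IndexNewBlocks,
    /// The node reports a different block, or none, at the indexed height.
    HandleReorg,
}

/// Decide what to do using read-only information only.
///
/// * `indexed`: tip stored in the index, `None` if nothing is indexed yet.
/// * `node_height`: the node's current tip height.
/// * `node_hash_at_indexed_height`: the hash the node reports for the
///   indexed height, `None` if the node has no block at that height.
pub fn decide(
    indexed: Option<&Tip>,
    node_height: u32,
    node_hash_at_indexed_height: Option<&str>,
) -> Action {
    let indexed = match indexed {
        Some(tip) => tip,
        // Empty index: everything still has to be indexed.
        None => return Action::IndexNewBlocks,
    };

    match node_hash_at_indexed_height {
        // Same block at the indexed height: either new blocks exist or we are at the tip.
        Some(hash) if hash == indexed.hash => {
            if node_height > indexed.height {
                Action::IndexNewBlocks
            } else {
                Action::Skip
            }
        }
        // Different hash or missing block: the indexed chain is no longer the node's chain.
        _ => Action::HandleReorg,
    }
}
```

In an update loop, the write transaction would be opened only when `decide` returns something other than `Action::Skip`. Whether this removes the corruption risk in practice depends on how redb and ord behave, which this sketch does not model.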
