-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Disk and memory-backed cache with Foyer 0.10. #2890
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
The lfu/lru/weighted_lfu caches didn't initialize a hybrid cache. This PR rewrites the chroma_cache crate to hide implementation details of the cache as much as possible. Noteworthy callouts from this PR: - We don't currently support cleaning up of tempfiles evicted from cache. This is status-quo. - cops-disk-cache-config-writer and cops-memory-cache-config-writer provide command-line arguments to generate the YAML for the Foyer cache. - Cache<K, V> is a trait. PersistentCache<K, V> allows for requiring persistence. All PersistentCache<K, V> can be cast to Cache<K, V>.
dae0120
to
a1b7cf3
Compare
fn try_from(b: &Block) -> Result<Self, Self::Error> { | ||
Ok(CachedBlock { | ||
id: b.id, | ||
data: b.to_bytes()?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think from_bytes/to_bytes will result in copies of the data, do we care?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do. I'm debating on a two-level cache for persistent and non-persistent cache to avoid this problem, but I don't know how we'll do that.
@@ -34,8 +34,8 @@ impl ArrowBlockfileProvider { | |||
pub fn new( | |||
storage: Storage, | |||
max_block_size_bytes: usize, | |||
block_cache: Cache<Uuid, Block>, | |||
sparse_index_cache: Cache<Uuid, SparseIndex>, | |||
block_cache: Box<dyn Cache<Uuid, CachedBlock>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that all cache accesses will use to_bytes/from_bytes? That seems not super great since its in the hot path?
} | ||
|
||
pub(super) async fn get(&self, id: &Uuid) -> Result<Option<Block>, GetError> { | ||
let block = self.block_cache.get(id); | ||
let block = self.block_cache.get(id).await.ok().flatten(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means every access to the cache results in a copying of the data in the block, which means basically all queries copy potentially 100's of MB of data around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I agree that it seems like every cache access converts bytes to blocks which copies data (a block is 8MB so 8MB per access) and vice versa (unless I am missing something which is also likely)
@@ -196,8 +197,9 @@ impl BlockManager { | |||
pub(super) fn new( | |||
storage: Storage, | |||
max_block_size_bytes: usize, | |||
block_cache: Cache<Uuid, Block>, | |||
block_cache: Box<dyn Cache<Uuid, CachedBlock>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Why take a Box and turn it into an Arc instead of just taking an Arc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface for creating creates a box. Consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad for not noticing
|
||
use chroma_cache::{CacheConfig, FoyerCacheConfig}; | ||
|
||
fn main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment that this takes the default config and generates the yaml would be helpful
|
||
use chroma_cache::{CacheConfig, FoyerCacheConfig}; | ||
|
||
fn main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit / thought: might be nicer to have one binary that takes a flag for which config it generates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to make it this way as it's clear from the tool which output you'll get. This allows us to have one tool per enum variant with exactly the options said variant needs. I'm not wedded to this idea, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good with me
use super::{CacheError, Weighted}; | ||
|
||
/// A zero-configuration cache that doesn't evict. | ||
pub struct NopCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unused? If so, should we remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's wired up. I think being able to stub out the cache is useful, so I'd prefer to leave it. You have to set cache config to nop to use it, so the risk of inadvertently using it is low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good with me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good to me, my primary concern is to do with to_bytes/from_bytes on the CachedBlock impl. @sanketkedia wdyt?
pub(super) fn cached(&self, id: &Uuid) -> bool { | ||
self.block_cache.get(id).is_some() | ||
pub(super) async fn cached(&self, id: &Uuid) -> bool { | ||
self.block_cache.get(id).await.ok().is_some() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this panic in case of errors? Should we instead catch the error and surface it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.ok().is_some()
should be written .is_ok()
. It will not panic.
/// Deserialize. | ||
#[derive(Clone, Debug)] | ||
#[repr(transparent)] | ||
pub struct RecordBatchWrapper(pub RecordBatch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on nits / potentially unused code that would be good to look at. Seems great otherwise! Thank you for chasing the solution
Description of changes
The lfu/lru/weighted_lfu caches didn't initialize a hybrid cache. This PR rewrites the chroma_cache crate to hide implementation details of the cache as much as possible. Noteworthy callouts from this PR:
Test plan
How are these changes tested?
pytest
for python,yarn test
for js,cargo test
for rustDocumentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?