Hey everyone,
I recently ran into an interesting challenge with micro-partition pruning on SHA-hashed keys in Snowflake and wanted to get your thoughts.
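The issue is that micro-partition pruning can become less effective when filtering or joining on SHA hash keys. Snowflake prunes using per-partition metadata such as the range of values in each column, and because hash values are effectively uniformly distributed, nearly every micro-partition ends up covering almost the whole hash range. String-based keys with no natural load order can behave the same way. As a result, Snowflake can rarely rule out a partition from its metadata alone.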
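To make the effect concrete, here's a rough Python sketch I'd use to reason about it. It is a toy model, not how Snowflake works internally: "partitions" are just fixed-size chunks in load order with a min and max, and the row counts and key values are made up for illustration.

```python
import hashlib

def build_partitions(keys, rows_per_partition):
    """Split keys, in load order, into chunks and record (min, max) for each."""
    partitions = []
    for start in range(0, len(keys), rows_per_partition):
        chunk = keys[start:start + rows_per_partition]
        partitions.append((min(chunk), max(chunk)))
    return partitions

def partitions_to_scan(partitions, value):
    """Count partitions whose [min, max] range could contain the value."""
    return sum(1 for lo, hi in partitions if lo <= value <= hi)

# Hypothetical data: 100,000 sequential business keys loaded in order,
# plus the SHA-1 hex digest of each key as the hash key.
business_keys = list(range(100_000))
hash_keys = [hashlib.sha1(str(k).encode()).hexdigest() for k in business_keys]

numeric_parts = build_partitions(business_keys, 1_000)
hashed_parts = build_partitions(hash_keys, 1_000)

# Point lookup for the same row, once by business key and once by hash key.
target = 54_321
numeric_scanned = partitions_to_scan(numeric_parts, target)
hashed_scanned = partitions_to_scan(hashed_parts, hash_keys[target])

print(f"numeric key: {numeric_scanned} of {len(numeric_parts)} partitions scanned")
print(f"hashed key:  {hashed_scanned} of {len(hashed_parts)} partitions scanned")
```

In this toy model the lookup by numeric key touches a single partition, while the lookup by hash key has to consider nearly all of them, because every chunk's min and max sit close to the two ends of the hash space.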
One potential workaround is to include the business key in the satellite table if it’s numeric and likely to prune better, for example a sequence-like ID that roughly follows load order. To avoid redundancy, this should only be done if the satellite is frequently joined or is causing performance issues. Another idea is to define clustering keys on frequently filtered columns, so that rows with similar values are stored together and more partitions can be skipped.
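Here's a similar toy sketch for the clustering idea, again in Python and again only a model: sorting the data before chunking stands in for what a clustering key would do, and the column name, row counts, and filter range are all hypothetical.

```python
import random

ROWS_PER_PARTITION = 1_000

def min_max_ranges(values):
    """Chunk values in storage order and record (min, max) per chunk."""
    ranges = []
    for i in range(0, len(values), ROWS_PER_PARTITION):
        chunk = values[i:i + ROWS_PER_PARTITION]
        ranges.append((min(chunk), max(chunk)))
    return ranges

def scanned(ranges, lo, hi):
    """Count partitions whose range overlaps the filter lo <= value <= hi."""
    return sum(1 for p_lo, p_hi in ranges if p_lo <= hi and p_hi >= lo)

# Hypothetical filter column whose arrival order is unrelated to its value.
rng = random.Random(42)
order_ids = list(range(100_000))
rng.shuffle(order_ids)

unclustered = min_max_ranges(order_ids)        # stored in arrival order
clustered = min_max_ranges(sorted(order_ids))  # stored sorted by the filter column

# Range filter: 20_000 <= order_id <= 20_999
unclustered_scanned = scanned(unclustered, 20_000, 20_999)
clustered_scanned = scanned(clustered, 20_000, 20_999)

print(f"unclustered: {unclustered_scanned} of {len(unclustered)} partitions scanned")
print(f"clustered:   {clustered_scanned} of {len(clustered)} partitions scanned")
```

The sorted layout lets the range filter skip almost everything, which is the behavior I'd hope a clustering key on a frequently filtered column would approximate. The sketch ignores the cost of keeping the data clustered, so it says nothing about whether the trade-off is worth it on a real table.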
Snowflake’s performance is generally solid, so this might not be a major problem, but I’m curious: have you encountered this? What solutions have you tried?
Looking forward to your thoughts!