- Reduce auto-suspend to 60 seconds
- Reduce virtual warehouse size
- Ensure minimum clusters are set to 1
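The first three warehouse settings can be applied in a single ALTER WAREHOUSE statement. A sketch, assuming a warehouse named my_wh (adjust the name and size to your environment; the cluster-count options apply only to multi-cluster warehouses):

```sql
ALTER WAREHOUSE my_wh SET
  AUTO_SUSPEND      = 60        -- suspend after 60 seconds idle
  WAREHOUSE_SIZE    = 'XSMALL'  -- start small; resize up only when queries demand it
  MIN_CLUSTER_COUNT = 1         -- let a multi-cluster warehouse scale down fully
  MAX_CLUSTER_COUNT = 3;
```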
- Consolidate warehouses
- Separate warehouses by workload and its requirements, not by business domain
- Reduce query frequency
- At many organizations, batch transformation jobs run hourly simply because that is the default. But do the downstream use cases actually need that latency? Check with the business before setting the frequency.
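If those batch jobs are Snowflake tasks, the frequency lives in the task schedule. A sketch moving a hypothetical task from hourly to every six hours (a task must be suspended before its schedule can be changed):

```sql
ALTER TASK transform_orders SUSPEND;
ALTER TASK transform_orders SET SCHEDULE = 'USING CRON 0 */6 * * * UTC';
ALTER TASK transform_orders RESUME;
```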
- Only process new or updated data
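A common pattern for this in Snowflake is a stream feeding a MERGE, so each run touches only the delta. Table, stream, and column names here are illustrative:

```sql
-- Track changes on the source table.
CREATE OR REPLACE STREAM orders_stream ON TABLE raw.orders;

-- Each run consumes only rows changed since the stream was last read.
MERGE INTO analytics.orders AS tgt
USING orders_stream AS src
  ON tgt.order_id = src.order_id
WHEN MATCHED THEN UPDATE SET tgt.amount = src.amount
WHEN NOT MATCHED THEN INSERT (order_id, amount) VALUES (src.order_id, src.amount);
```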
- Ensure tables are clustered correctly
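Before adding (and paying to maintain) a clustering key, it is worth checking how the table is clustered today. Table and column names below are placeholders:

```sql
-- Inspect clustering quality on the candidate column.
SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.orders', '(order_date)');

-- Define a clustering key only for large tables filtered or joined on that column.
ALTER TABLE analytics.orders CLUSTER BY (order_date);
```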
- Drop unused tables
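Candidates for dropping can be found from access history (requires Enterprise edition). This is only a sketch, and it reports last-read times for tables that have been read at least once, so never-read tables need a separate check:

```sql
SELECT obj.value:"objectName"::string AS table_name,
       MAX(ah.query_start_time)       AS last_read
FROM snowflake.account_usage.access_history AS ah,
     LATERAL FLATTEN(input => ah.base_objects_accessed) AS obj
GROUP BY 1
HAVING MAX(ah.query_start_time) < DATEADD('day', -90, CURRENT_TIMESTAMP());
```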
- Lower data retention
- The Time Travel (data retention) setting can add storage cost, since Snowflake must keep copies of every change made to a table for the length of the retention period.
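Retention is set per object; for a large staging table that can be rebuilt from source, one day (or zero) is often enough. The table name here is a placeholder:

```sql
ALTER TABLE staging.events SET DATA_RETENTION_TIME_IN_DAYS = 1;
```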
- Use transient tables
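Transient tables skip the 7-day Fail-safe period and cap Time Travel at one day, which reduces storage cost for reloadable data. An illustrative example:

```sql
CREATE TRANSIENT TABLE staging.daily_load (
  id      NUMBER,
  payload VARIANT
) DATA_RETENTION_TIME_IN_DAYS = 0;  -- no Time Travel on this scratch table
```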
- Avoid frequent DML operations
- Ensure files are optimally sized
- To keep data loading cost-effective, a best practice is to keep files around 100-250 MB (compressed).
- To see why file size matters:
- A Small warehouse used for loading provides 16 parallel threads. Loading a single 1 GB file saturates only 1 of those 16 threads.
- Split that file into ten 100 MB files instead, and 10 of the 16 threads are used. That level of parallelization makes much better use of the compute resources you are already paying for.
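Splitting an export before staging it needs nothing beyond standard tools. A sketch with an illustrative file name (a small stand-in file is created first so the commands run as-is; on a real export you would use `-b 100m`):

```shell
# Create a small stand-in for the big export so the example is runnable.
head -c 250 /dev/zero > big_export.csv

# Split into fixed-size pieces with numeric suffixes (part_00, part_01, ...).
# For a real export: split -b 100m -d big_export.csv part_
split -b 100 -d big_export.csv part_

ls part_*   # lists part_00 part_01 part_02
```

Each piece can then be uploaded with PUT and loaded in parallel by COPY INTO.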
- Leverage access control
- Enable query timeouts
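The timeout is a parameter that can be set at the account, warehouse, or session level; set on a warehouse, it caps every query that runs there. The warehouse name is a placeholder:

```sql
-- Cancel anything running longer than 1 hour (the default is 172800 seconds = 2 days).
ALTER WAREHOUSE my_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 3600;
```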
- Configure resource monitors
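A monitor with notify-then-suspend triggers is a reasonable baseline; the quota and names below are illustrative:

```sql
CREATE OR REPLACE RESOURCE MONITOR monthly_budget
  WITH CREDIT_QUOTA = 100
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 75  PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_budget;
```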
Wednesday, 10 September 2025
Snowflake - Cost Optimization