Check and set analyze statistics on tables-columns in Vertica

It is important to update statistics in any database for better query runs or query optimizations.  Vertica provides ANALYZE_STATISTICS and ANALYZE_HISTOGRAM to collect stats and in big data warehouse environment it is sometime possible that some tables may not have stats collected.  In that case you can run following query to check and add the tables/ columns to collect stats.



SELECT T1.table_schema, T1.table_name, T2.row_ct, T2.wos_ct, T2.ros_ct
FROM
(
-- Get list of tables that don't have 'FULL' stats
SELECT distinct table_schema, table_name
FROM projection_columns
WHERE (statistics_type = 'ROWCOUNT' or statistics_type = 'NONE')
-- and table_schema not like 'v_%'
AND (table_schema = 'schema_1' OR table_schema = 'schema_2' )
ORDER BY 1,2
) T1
JOIN
(
-- Get list of tables with storage details
SELECT anchor_table_schema, anchor_table_name,
sum(row_count) row_ct, sum(wos_row_count) wos_ct, sum(ros_row_count) ros_ct
FROM projection_storage
WHERE (anchor_table_schema = 'schema_1' OR anchor_table_schema = 'schema_2' )
GROUP BY 1,2
) T2
ON T1.table_schema = T2.anchor_table_schema
AND T1.table_name = T2.anchor_table_name
ORDER BY 3 desc;

From the list generated above add those tables/ columns to your regular stats collecting script which you can run during off peak hours.

Notes:
Vertica’s ANALYZE_HISTOGRAM returns similar data as ANALYZE_STATISTICS except that one can set the percent of data that is sampled to generate statistics metrics.  In the case of latter it uses fixed 10% sampling.

WOS: Write Optimized Store is a memory resident data structure used to support fast data loads without data compression and indexing.
ROS: Read Optimized Store is disk-based storage with data compressed and sorted.  Tuple Mover moves data out of WOS and into ROS in batches.

References:

3 thoughts on “Check and set analyze statistics on tables-columns in Vertica

  1. In any case, today's dynamic business environment with association of various IT outsourcing organizations, vast measure of computerized information exchange and capacity requires a skilled data security framework that would include the components of hazard examination and hazard administration. https://goo.gl/KS7Bj0

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s