Cleaning OpenResty Edge Log Server Database Data

The OpenResty Edge Log Server database primarily stores error logs, WAF logs, and dynamic metric data. Among these, WAF logs and dynamic metric data typically occupy significant space. All these data are stored using a time-series database. This article will detail how to adjust timescale database policies and perform data cleanup.

Default Time-Series Table Overview

Table NamePurposeChunk IntervalData Retention Period
errlog_tsdbError logs7 days1 month
waf_request_tsdbWAF hit requests7 days1 month
waf_matches_tsdbWAF hit details7 days2 months
eventsEvent records7 days1 month
cc_log_tsdbRate limiting request information1 day7 days
_dymetrics_Dynamic metric data3 days1 month

Database Operation Steps

Logging into the Database

/usr/local/openresty-postgresql12/bin/psql -h 127.0.0.1 -p 5432 -U postgres -d or_edge_log_server

Querying Time-Series Table Size

Basic query:

SELECT table_name, table_size, total_size
FROM timescaledb_information.hypertable
WHERE table_schema='public' AND total_size IS NOT NULL;

Enhanced readability query:

SELECT table_name, table_schema, pg_size_bytes(total_size) / 1024 / 1024 AS total_size_mb
FROM timescaledb_information.hypertable
WHERE total_size IS NOT NULL
ORDER BY total_size DESC;

Viewing Time-Series Database Cleanup Policies

SELECT * FROM timescaledb_information.drop_chunks_policies;

Example output interpretation:

FieldExample ValueDescription
hypertableerrlog_tsdbName of the hyper table to which the cleanup policy is applied
older_than(t,“1 mon”,)Delete data chunks older than 1 month
cascadefDo not cascade delete dependent objects
job_id1000Associated background job ID
schedule_interval1 dayPolicy execution frequency
max_runtime00:05:00Maximum runtime for a single job
max_retries-1Maximum retry attempts after failure (-1 means unlimited retries)
retry_period00:05:00Retry interval
cascade_to_materializationsfDo not cascade delete materialized views

Querying Chunk Intervals

SELECT b.table_name,
 (a.interval_length / 24 / 60 / 60 / 1000000)::text || ' day(s)' AS interval
FROM _timescaledb_catalog.dimension AS a
LEFT JOIN _timescaledb_catalog.hypertable AS b ON b.id = a.hypertable_id;

Optimizing Time-Series Database Strategies

Note: The chunk interval should be at most half of the data retention time. Using errlog_tsdb as an example, here’s how to adjust the policy to retain 14 days of data with a 2-day chunk interval.

Steps:

  1. Remove existing cleanup policy:
  SELECT remove_drop_chunks_policy('errlog_tsdb', if_exists => TRUE);
  1. Add new cleanup policy:
  SELECT add_drop_chunks_policy('errlog_tsdb', INTERVAL '2 week', cascade_to_materializations => FALSE);
  1. Modify chunk interval:
  SELECT set_chunk_time_interval('errlog_tsdb', INTERVAL '2 day');
  1. Clean up expired data:
  SELECT drop_chunks(table_name => 'errlog_tsdb', older_than => INTERVAL '2 week', cascade_to_materializations => FALSE);

By following these steps, you can effectively reduce the disk space occupied by the OpenResty Edge Log Server database, ensuring the availability of the Log Server service.