Plan for Deleting Old Data from subscribers_subscriber Table

Given the large number of records and the configured logical replication, the best approach is to delete the unnecessary data from the old table in place, without locking the table or degrading the system’s performance. This reduces the table size with minimal impact on resources and on logical replication.

1. Create a Backup of the Table

Before deleting data, it is essential to create a backup of the table to avoid data loss:

Create a Backup Table:

CREATE TABLE subscribers_subscriber_backup (LIKE subscribers_subscriber INCLUDING ALL EXCLUDING INDEXES);

Note: INCLUDING ALL on its own would also copy the indexes under auto-generated names; EXCLUDING INDEXES skips them so they can be created explicitly with predictable names.

Create indexes on the backup table (if needed) based on the original table’s indexes. For a faster bulk copy, these can instead be created after the INSERT below:

CREATE INDEX subscribers_subscriber_pkey_backup ON subscribers_subscriber_backup (id);
 
CREATE INDEX subscribers_channel_c230ff_idx_backup ON subscribers_subscriber_backup (channel_id, is_active);
 
CREATE INDEX subscribers_is_acti_0636ea_idx_backup ON subscribers_subscriber_backup (is_active, owner_id);
 
CREATE INDEX subscribers_is_acti_1611fa_idx_backup ON subscribers_subscriber_backup (is_active, channel_id, apns_app);
 
CREATE INDEX subscribers_is_acti_731e4f_idx_backup ON subscribers_subscriber_backup (is_active, deliveries_idle_days);
 
CREATE INDEX subscribers_is_acti_95aeb0_idx_backup ON subscribers_subscriber_backup (is_active, clicks_idle_days);
 
CREATE INDEX subscribers_is_acti_ab1e1c_idx_backup ON subscribers_subscriber_backup (is_active, channel_id, city);
 
CREATE INDEX subscribers_is_acti_de05b6_idx_backup ON subscribers_subscriber_backup (is_active, channel_id, firebase_app);
 
CREATE INDEX subscribers_is_acti_f9f94e_idx_backup ON subscribers_subscriber_backup (is_active, channel_id, country);
 
CREATE INDEX subscribers_subscri_f857df_idx_backup ON subscribers_subscriber_backup (subscription_hash_str, channel_id);
 
CREATE INDEX subscribers_subscriber_apns_app_970b22ce_backup ON subscribers_subscriber_backup (apns_app);
 
CREATE INDEX subscribers_subscriber_channel_id_27deaa63_backup ON subscribers_subscriber_backup (channel_id);
 
CREATE INDEX subscribers_subscriber_channel_id_is_active_deleted_idx_backup ON subscribers_subscriber_backup (channel_id, is_active, deleted);
 
CREATE INDEX subscribers_subscriber_deleted_93c811c4_backup ON subscribers_subscriber_backup (deleted);
 
CREATE INDEX subscribers_subscriber_firebase_app_880e1e53_backup ON subscribers_subscriber_backup (firebase_app);
 
CREATE INDEX subscribers_subscriber_id_channel_id_is_active_idx_backup ON subscribers_subscriber_backup (id, channel_id, is_active);
 
CREATE INDEX subscribers_subscriber_is_active_6a4002a0_backup ON subscribers_subscriber_backup (is_active);
 
CREATE INDEX subscribers_subscriber_is_ghost_1bd828e9_backup ON subscribers_subscriber_backup (is_ghost);
 
CREATE INDEX subscribers_subscriber_is_iddle_a73f688c_backup ON subscribers_subscriber_backup (is_iddle);
 
CREATE INDEX subscribers_subscriber_owner_id_fc68769f_backup ON subscribers_subscriber_backup (owner_id);
 
CREATE INDEX subscribers_subscriber_subscription_hash_769f1d5e_backup ON subscribers_subscriber_backup (subscription_hash);
 
CREATE INDEX subscribers_subscriber_subscription_hash_str_c55c81dd_backup ON subscribers_subscriber_backup (subscription_hash_str);
 
CREATE INDEX subscribers_subscriber_subscription_hash_str_c55c81dd_like_backup ON subscribers_subscriber_backup (subscription_hash_str varchar_pattern_ops);
 
CREATE INDEX subscribers_subscriber_token_58c9e144_backup ON subscribers_subscriber_backup (token);
 
CREATE INDEX subscribers_subscriber_token_58c9e144_like_backup ON subscribers_subscriber_backup (token varchar_pattern_ops);
 
CREATE INDEX subscribers_subscriber_unsubscribed_date_768f1d7d_backup ON subscribers_subscriber_backup (unsubscribed_date);
 
CREATE INDEX subscribers_subscriber_uuid_b5412fac_backup ON subscribers_subscriber_backup (uuid);
 
CREATE INDEX subscribers_token_19b479_idx_backup ON subscribers_subscriber_backup (token, channel_id);

Copy Data to the Backup Table:

INSERT INTO subscribers_subscriber_backup
SELECT *
FROM subscribers_subscriber;

This will create a full backup of the table.
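As a sanity check that the copy succeeded, the row counts of the two tables can be compared (run this before any deletions begin):

```sql
-- Both counts should match immediately after the copy
SELECT
    (SELECT count(*) FROM subscribers_subscriber)        AS original_rows,
    (SELECT count(*) FROM subscribers_subscriber_backup) AS backup_rows;
```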

2. Delete Old Data in Batches (Batch Deletes)

To minimize locking and reduce database load, we should delete the data in batches rather than all at once.

Batch Deletion (deleting 10,000 records at a time):

DO $$
DECLARE
    batch_size INT := 10000;
    rows_deleted INT;
BEGIN
    LOOP
        -- Delete one batch, addressing rows by ctid so the
        -- LIMIT can be applied to the DELETE
        WITH del AS (
            DELETE FROM subscribers_subscriber
            WHERE ctid IN (
                SELECT ctid FROM subscribers_subscriber
                WHERE subscribed_date < '2024-01-01'
                  AND is_active = false
                LIMIT batch_size
            )
            RETURNING *
        )
        SELECT COUNT(*) INTO rows_deleted FROM del;
 
        EXIT WHEN rows_deleted = 0;
 
        -- Commit each batch so row locks are released and the dead
        -- tuples become reclaimable by autovacuum. Requires
        -- PostgreSQL 11+ and running the DO block outside an
        -- explicit transaction block.
        COMMIT;
 
        PERFORM pg_sleep(1);
    END LOOP;
END $$;

What will be done:

  • Deletes records in batches (batch_size = 10,000).
  • Loops until no matching records remain.
  • Uses pg_sleep(1) to introduce a short delay between batches to avoid system overload.
  • Small batches keep row locks short-lived and limit the burst of changes pushed through logical replication.
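Because logical replication is configured, it is worth watching subscriber lag between batches; a sketch using the standard pg_replication_slots view:

```sql
-- Bytes of WAL each logical subscriber has not yet confirmed
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

If lag_bytes keeps growing while the loop runs, pause the deletions or increase the sleep interval.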

3. Clean up the Table (VACUUM) (can be done later, after all data is deleted, during off-peak hours or a maintenance window)

After deleting old records, vacuum the table to reclaim the space occupied by the dead rows:

Option 1: Full Vacuum (locks the table):

VACUUM FULL subscribers_subscriber;

VACUUM FULL rewrites the entire table into a new file and returns the freed space to the operating system, but it holds an ACCESS EXCLUSIVE lock for the duration.

Option 2: Regular Vacuum (doesn’t lock the table):

VACUUM subscribers_subscriber;

A regular VACUUM won’t block reads or writes; it marks the dead rows’ space as reusable for future inserts rather than returning it to the operating system. It is less disruptive and can be run during normal operations.
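To gauge how much dead space the deletions left behind, and whether autovacuum has already dealt with it, the pg_stat_user_tables view can be consulted:

```sql
-- Dead vs. live tuples and the last (auto)vacuum times for the table
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'subscribers_subscriber';
```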

Benefits of this Approach:

  1. Minimal Locking: Deleting data in small batches keeps row locks brief and avoids a single long-running transaction that would block other sessions.
  2. Minimal Load on the Database: Using LIMIT and pg_sleep() reduces peak load and prevents system overload.
  3. Data Safety: A backup of the table ensures that data can be restored if necessary.
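If rows ever need to be restored from the backup, a minimal sketch (assuming id is the table’s primary key) that copies back only rows missing from the live table:

```sql
-- Re-insert rows that exist only in the backup table
INSERT INTO subscribers_subscriber
SELECT b.*
FROM subscribers_subscriber_backup b
WHERE NOT EXISTS (
    SELECT 1 FROM subscribers_subscriber s WHERE s.id = b.id
);
```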