Package org.jivesoftware.openfire.pubsub
Class PubSubSubscriptionMaintenance
java.lang.Object
org.jivesoftware.openfire.pubsub.PubSubSubscriptionMaintenance
Analyzes and (optionally) cleans up redundant rows in the
ofPubsubSubscription table.
Some installations have accumulated very large numbers of redundant subscription rows: rows that share the same node,
subscription JID, owner and subscription type, differing only by their generated subscription ID. On a node that does
not allow multiple subscriptions for the same subscription JID (XEP-0060 §6.1.6) - most notably PEP services
(XEP-0163) - at most one such subscription can be meaningful, so the surplus rows carry no functional value. In
extreme cases their sheer number exhausts the Java heap when the data is loaded into memory (OF-3306).
This utility is intended to be driven from an admin-console page. It offers three operations:
- analyze() - a read-only assessment of how much redundant data exists (safe to call on page load).
- startCleanup() - launches a batched, background deletion of the redundant rows.
- getProgress() - a thread-safe snapshot of an in-progress or completed cleanup, for a progress bar.
What is deleted
For each group of rows that share (serviceID, nodeID, jid, owner, subscriptionType), exactly one row is kept
(the one with the lexicographically greatest id); all others in the group are removed. Groups with only a
single row are never touched. The deletion is performed in bounded batches, each in its own transaction, so that it
can run against a live server without producing an unmanageably large transaction.
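The keep-one-per-group rule above can be illustrated in memory. This is only a sketch of the selection logic, not the utility's database-level implementation: `SubscriptionRow` and `RedundantRowSelector` are hypothetical names, all columns are assumed non-null, and `String.compareTo` stands in for whatever ordering the database applies to the id column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one row of ofPubsubSubscription (not an Openfire class). */
final class SubscriptionRow {
    final String serviceID, nodeID, jid, owner, subscriptionType, id;

    SubscriptionRow(String serviceID, String nodeID, String jid, String owner,
                    String subscriptionType, String id) {
        this.serviceID = serviceID;
        this.nodeID = nodeID;
        this.jid = jid;
        this.owner = owner;
        this.subscriptionType = subscriptionType;
        this.id = id;
    }

    /** Grouping key: every column except the generated subscription ID. */
    List<String> groupKey() {
        return List.of(serviceID, nodeID, jid, owner, subscriptionType);
    }
}

final class RedundantRowSelector {
    /** Returns the rows to delete: all but the greatest id within each group. */
    static List<SubscriptionRow> rowsToDelete(List<SubscriptionRow> rows) {
        // First pass: remember the lexicographically greatest id per group.
        Map<List<String>, String> keeper = new HashMap<>();
        for (SubscriptionRow row : rows) {
            keeper.merge(row.groupKey(), row.id, (a, b) -> a.compareTo(b) >= 0 ? a : b);
        }
        // Second pass: any row that is not its group's keeper is redundant.
        List<SubscriptionRow> redundant = new ArrayList<>();
        for (SubscriptionRow row : rows) {
            if (!row.id.equals(keeper.get(row.groupKey()))) {
                redundant.add(row);
            }
        }
        return redundant;
    }
}
```

A group with a single row is its own keeper, so it never appears in the result, which matches the statement that single-row groups are never touched.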
Safety with respect to multiple-subscription services
Same-key rows are only redundant on a service that does not allow multiple subscriptions for the same subscription JID (XEP-0060 §6.1.6). On a service that does allow them, such rows are legitimate and are differentiated by their subscription ID; deleting them would destroy live subscriptions. Whether multiple subscriptions are allowed is a service-wide setting (PubSubService.isMultipleSubscriptionsEnabled()): PEP
services always return false, while the main pubsub service is governed by the
xmpp.pubsub.multiple-subscriptions property.
Because this deletion runs at the database level, it cannot itself consult that in-memory, per-service setting.
Instead the caller - which runs inside the server and can enumerate the live services - must supply the set of
service IDs that permit multiple subscriptions via the constructor. Those services are excluded from both the
analysis and the deletion, so their rows are never counted as removable and never deleted. Inverting the dependency
this way keeps the authority for the safety decision with the code that can actually answer the question, rather than
having this utility guess.
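How a caller might assemble that set can be sketched as follows. `ServiceInfo` is a hypothetical stand-in for a live service; it is assumed here that a real service exposes its ID through an accessor such as `getServiceID()` alongside `isMultipleSubscriptionsEnabled()`. The resulting collection is what would be handed to the constructor or to initialize(...).

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for a live pubsub service; only the two accessors used below. */
final class ServiceInfo {
    private final String serviceID;
    private final boolean multipleSubscriptionsEnabled;

    ServiceInfo(String serviceID, boolean multipleSubscriptionsEnabled) {
        this.serviceID = serviceID;
        this.multipleSubscriptionsEnabled = multipleSubscriptionsEnabled;
    }

    String getServiceID() {
        return serviceID;
    }

    boolean isMultipleSubscriptionsEnabled() {
        return multipleSubscriptionsEnabled;
    }
}

final class ExcludedServices {
    /** Collects the IDs of services whose same-key rows must be left alone. */
    static Set<String> collect(Collection<ServiceInfo> liveServices) {
        Set<String> excluded = new LinkedHashSet<>();
        for (ServiceInfo service : liveServices) {
            if (service.isMultipleSubscriptionsEnabled()) {
                excluded.add(service.getServiceID());
            }
        }
        return excluded;
    }
}
```

When no service permits multiple subscriptions, the result is an empty set, which corresponds to the empty-collection case the constructor accepts.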
This utility performs no deletion until startCleanup() is explicitly invoked, and administrators should be
advised to take a database backup first.
Instances are not designed to run concurrent cleanups; startCleanup() guards against launching a second
cleanup while one is already running.
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static final class | | Read-only assessment of the redundant-row situation.
static enum | | Phase of a cleanup operation.
static final class | | Immutable snapshot of cleanup progress.
Constructor Summary
Constructors
Constructor | Description
PubSubSubscriptionMaintenance(Collection<String> multipleSubscriptionServiceIds) | Creates a maintenance utility that excludes the supplied multiple-subscription services from analysis and cleanup.
Method Summary
Modifier and Type | Method | Description
 | analyze() | Performs a read-only assessment of the redundant-row situation.
static PubSubSubscriptionMaintenance | initialize(Collection<String> multipleSubscriptionServiceIds) | Initializes the shared maintenance instance with the set of services that permit multiple subscriptions, if it has not already been created.
static boolean | isCleanupAdvisable() | Returns whether a cleanup is worth recommending, for an advisory on the admin index page.
static void | setCleanupAdvisable(boolean advisable) | Directly sets the cached advisability value and marks it freshly checked.
boolean | startCleanup() | Launches a cleanup on a background thread, unless one is already running.
-
Constructor Details
-
PubSubSubscriptionMaintenance
Creates a maintenance utility that excludes the supplied multiple-subscription services from analysis and cleanup.
- Parameters:
multipleSubscriptionServiceIds - the IDs of services for which isMultipleSubscriptionsEnabled() is true. Rows belonging to these services are never counted as removable and never deleted. Must not be null; pass an empty collection only when the deployment is known to have no service that permits multiple subscriptions. In practice this set is very small (often just the single main pubsub service); it is rendered into a SQL IN list, so it is not intended to hold thousands of entries.
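One way such an exclusion list could be rendered is sketched below. This is an assumption-laden illustration: the source does not say whether the utility binds the IDs as parameters or how it phrases the clause, and `ExclusionClause` is a hypothetical helper. The column name serviceID is taken from the grouping key described above. Using one `?` placeholder per ID keeps the values out of the SQL text and also shows why the list length grows with the collection size.

```java
import java.util.Collection;
import java.util.Collections;

final class ExclusionClause {
    /**
     * Returns an empty string for an empty collection, otherwise a fragment of the
     * form " AND serviceID NOT IN (?, ?, ...)" with one placeholder per service ID.
     */
    static String build(Collection<String> excludedServiceIds) {
        if (excludedServiceIds.isEmpty()) {
            return ""; // nothing to exclude, so no extra predicate is needed
        }
        String placeholders =
            String.join(", ", Collections.nCopies(excludedServiceIds.size(), "?"));
        return " AND serviceID NOT IN (" + placeholders + ")";
    }
}
```

The caller would then bind each service ID to its placeholder, in iteration order, on a `java.sql.PreparedStatement`.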
-
-
Method Details
-
isCleanupAdvisable
public static boolean isCleanupAdvisable()
Returns whether a cleanup is worth recommending, for an advisory on the admin index page. Non-blocking: returns the cached value immediately and, if that value is missing or stale (and no run is in progress), schedules a one-off background refresh. A full analyze() can take many seconds on a very large table, so it must never run on the page-rendering thread; consequently the first index view after startup returns false and the advisory may only appear on a later view, once the background check has completed.
- Returns:
- the cached advisability flag; false until the first background check has completed.
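The non-blocking behaviour described for isCleanupAdvisable() follows a common pattern: return the cached value at once and let at most one background task refresh it. The sketch below shows that pattern under stated assumptions; `AdvisabilityCache`, its fields, and the staleness threshold are hypothetical and are not taken from the Openfire source, and a `BooleanSupplier` stands in for the slow analysis.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

final class AdvisabilityCache {
    private final Executor executor;          // runs the slow check off the caller's thread
    private final BooleanSupplier slowCheck;  // stand-in for a full analysis
    private final long maxAgeMillis;          // hypothetical staleness threshold
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private volatile boolean cached = false;  // default until the first check completes
    private volatile long checkedAt = -1;     // -1 means "never checked"

    AdvisabilityCache(Executor executor, BooleanSupplier slowCheck, long maxAgeMillis) {
        this.executor = executor;
        this.slowCheck = slowCheck;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Returns the cached flag immediately; schedules one refresh if it is missing or stale. */
    boolean isAdvisable() {
        long now = System.currentTimeMillis();
        boolean stale = checkedAt < 0 || now - checkedAt > maxAgeMillis;
        // compareAndSet ensures only one refresh is in flight at any time.
        if (stale && refreshing.compareAndSet(false, true)) {
            executor.execute(() -> {
                try {
                    set(slowCheck.getAsBoolean());
                } finally {
                    refreshing.set(false);
                }
            });
        }
        return cached;
    }

    /** Sets the cached value directly and marks it freshly checked. */
    void set(boolean advisable) {
        cached = advisable;
        checkedAt = System.currentTimeMillis();
    }
}
```

The `set` method mirrors the role described for setCleanupAdvisable(boolean): a worker that already knows the outcome can update the cache without triggering another analysis.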
-
setCleanupAdvisable
public static void setCleanupAdvisable(boolean advisable)
Directly sets the cached advisability value and marks it freshly checked. Used by the cleanup worker on completion, when the outcome is already known, to avoid a redundant re-analysis.
- Parameters:
advisable - whether a cleanup is now worth recommending.
-
initialize
public static PubSubSubscriptionMaintenance initialize(@Nonnull Collection<String> multipleSubscriptionServiceIds)
Initializes the shared maintenance instance with the set of services that permit multiple subscriptions, if it has not already been created. Called by the pubsub module at startup, when the live services can be inspected.
- Parameters:
multipleSubscriptionServiceIds - service IDs to exclude from analysis and cleanup; see the constructor.
- Returns:
- the shared instance.
-
getInstance
- Returns:
- the shared maintenance instance, or null if it has not been initialized yet (the pubsub service has not started). Callers that need an instance before startup should treat null as "not yet available".
-
getExcludedServiceIds
- Returns:
- the service IDs excluded from analysis and cleanup (those permitting multiple subscriptions). Never null.
-
analyze
Performs a read-only assessment of the redundant-row situation. This issues a single aggregate query. On a very large table it can take some seconds (a full scan), but it neither locks rows for writing nor modifies any data, so it is safe to call when rendering an admin page.
- Returns:
- the analysis result, never null.
- Throws:
SQLException- if the database could not be queried.
-
startCleanup
public boolean startCleanup()
Launches a cleanup on a background thread, unless one is already running. The cleanup deletes redundant rows in batches (see DELETE_BATCH_SIZE), committing after each batch and updating getProgress() as it goes. Control returns to the caller immediately; the admin page should poll getProgress() to render a progress indicator.
- Returns:
- true if a new cleanup was started; false if one was already running.
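The "start unless already running" contract can be sketched with an atomic flag and a batch loop. Everything here is illustrative rather than the utility's actual code: `GuardedCleanup` is a hypothetical class, the `IntSupplier` stands in for deleting and committing one batch (returning the number of rows removed), and a plain counter stands in for the richer progress snapshot.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.IntSupplier;

final class GuardedCleanup {
    private final Executor executor;           // in a server this would be a background thread
    private final IntSupplier deleteOneBatch;  // hypothetical: deletes one batch, returns rows removed
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicLong deleted = new AtomicLong();

    GuardedCleanup(Executor executor, IntSupplier deleteOneBatch) {
        this.executor = executor;
        this.deleteOneBatch = deleteOneBatch;
    }

    /** Returns true if a new cleanup was started, false if one was already running. */
    boolean startCleanup() {
        // Only the caller that flips the flag from false to true may launch the worker.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        deleted.set(0);
        executor.execute(() -> {
            try {
                int removed;
                // Keep deleting until a batch removes nothing.
                while ((removed = deleteOneBatch.getAsInt()) > 0) {
                    deleted.addAndGet(removed);
                }
            } finally {
                running.set(false); // allow a later cleanup even if a batch failed
            }
        });
        return true;
    }

    /** Rows deleted so far by the current or most recent cleanup. */
    long getDeletedSoFar() {
        return deleted.get();
    }

    boolean isRunning() {
        return running.get();
    }
}
```

A polling page would call `getDeletedSoFar()` and `isRunning()` periodically, in the same way the admin page is expected to poll getProgress().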
-
getProgress
- Returns:
- a snapshot of the current (or most recently completed) cleanup progress. Never null.
-