Managing System Memory
Because the in-memory databases resize dynamically as needed, it is difficult to determine when an engine is almost out of free memory. Several statistics can indicate issues and you can take several actions to free memory.
Total Shared Memory
Check the total shared memory for a server with the print_blade_stats.py -Y command. In the following example, the maximum memory available is 21744 MB, 3705 MB is in use, and a threshold notification is sent when the available memory falls to 50 MB:
Sys Stats
---------
        Monitoring                Response Time    Memory Pool
        Interval    Processing    in millis        in use    Max      Threshold
NodeId  (seconds)   Errors        Avg     Max      (MB)      (MB)     (MB)
===============================================================================
1       5           0             70      500      3705      21744    50
You can change the threshold at which a notification is sent by editing the create_config.info question: SNMP:What is the system memory notification threshold in MB?.
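As a rough illustration of how the example row can be read, the following Python sketch parses the data row and computes the remaining headroom. The field names and both functions are hypothetical, the column order is assumed from the example output above, and treating available memory as Max minus in use is an assumption rather than documented behavior.

```python
# Hypothetical field names, in the column order of the example Sys Stats row.
FIELDS = [
    "node_id", "interval_s", "processing_errors",
    "resp_avg_ms", "resp_max_ms",
    "mem_in_use_mb", "mem_max_mb", "mem_threshold_mb",
]


def parse_sys_stats_row(row):
    """Split one whitespace-separated data row into named integer fields."""
    values = [int(v) for v in row.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


def memory_headroom(stats):
    """Return (available_mb, at_or_below_threshold).

    Assumption: available memory = Max (MB) - in use (MB).
    """
    available_mb = stats["mem_max_mb"] - stats["mem_in_use_mb"]
    return available_mb, available_mb <= stats["mem_threshold_mb"]


stats = parse_sys_stats_row("1 5 0 70 500 3705 21744 50")
available_mb, low = memory_headroom(stats)
print(f"node {stats['node_id']}: {available_mb} MB available, low={low}")
```

With the example row, the computed headroom is far above the 50 MB threshold, so under these assumptions no notification would be expected.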
Database Memory
Check the number of objects in the Subscriber, Activity, Balance Set, and Event databases. If many objects exist, it might mean that expired product offers, sessions, and balances have not been removed by the automatic cleanup operations. To view database statistics, use the following command:
print_blade_stats.py -B
If the number of active sessions is nearing the maximum, you can raise the maximum by changing the answer to this create_config.info question: What is the maximum number of active sessions?.
If many sessions are open, you can tune several system configuration settings to try to reduce the number. See Automatic Clean Up Operations.
Automatic Clean Up Operations
The following operations occur to free up memory on all servers:
- Garbage Collection — A configurable, automatic garbage collection process ensures that allocated memory is returned to the available free memory pool. As objects in the database are deleted, reduced in size, or moved because they increased in size, the holes left behind are tagged for cleanup. The cleanup process runs when the percentage of memory that is fragmented reaches a configured threshold, when the number of fragmentation holes reaches a configurable threshold, and at a scheduled, configurable interval. When garbage collection runs, the holes are consolidated into larger blocks and returned to the free memory pool. The create_config.info question that sets the garbage collection interval and batch size is: Do you want to use the default database garbage collection settings?. To verify the garbage collection settings, view the mtx_debg_log file or mtx_config.xml file.
By default, garbage collection is triggered when the size of fragmentation reaches 11 percent of the total database segment size.
- Session cleanup — The following Task Manager configuration parameters affect session cleanup:
- Global:How long after the last RAR retry should the session be torn down?
- TaskMgr:At what interval (in seconds) should the activity database be scanned for operations that require processing?
- TaskMgr:How many activity database task messages should be sent per second per blade?
- TaskMgr:How many Activity database task messages should be sent per second per engine?
- TaskMgr:How many outstanding Activity database cleanup requests should be sent per blade before pausing?
- TaskMgr:How many outstanding Activity database cleanup requests should be sent per engine before pausing?
For information about the last RAR retry parameter value, see the discussion about global system configuration in MATRIXX Installation and Upgrade. For information about the Task Manager parameter values, see the discussion about Task Manager configuration in MATRIXX Installation and Upgrade.
- Expired purchased offer cleanup — By default, product offers that have expired, either because they have been canceled or their validity period has ended, are removed from the owning subscription, group, or device 45 days after they expire. The create_config.info question that sets this value is: Global:How long (in seconds) should expired offers be retained before being purged from the system?.
- Expired balance cleanup — By default, expired balances are removed from the owner's wallet 45 days after they expire. The create_config.info question that sets the value is: Global:How long (in seconds) should expired balances be retained before being purged from the system?.
- Event cleanup — The Task Manager scans the event database to initiate the removal of old event detail records (EDRs). Several questions guide the cleanup operation:
- TaskMgr:How long (in micros) should the event cleanup scanner pause between each scan?
- TaskMgr:How many outstanding event cleanup messages should be sent per blade before pausing?
- TaskMgr:How many outstanding event cleanup messages should be sent in the engine before pausing?
- TaskMgr:What is the maximum number of events that can be deleted in a single transaction?
- TaskMgr:How large (in bytes) should the event database grow before events are deleted?
The last question, TaskMgr:How large (in bytes) should the event database grow before events are deleted?, sets the event cleanup threshold. During system configuration, the create_config.py script verifies that this threshold does not exceed the maximum event database size minus one extended data segment size. If the event cleanup threshold is more than the allowed value, create_config.py exits with an error similar to the following:
Error: based on event database configuration (maxSize=65127055360 bytes), event cleanup purge threshold cannot be more than 65063454720 bytes
In such cases, administrators must change the answer to the last question to a value that is less than or equal to the value indicated in the error message.
- EVENT_REQUEST record cleanup — EVENT_REQUEST records are stored in the activity database in case a one-shot usage event fails, for example, when an SMS cannot be sent due to a network failure. In such cases, the associated EVENT_REQUEST record is required to refund the account. By default, EVENT_REQUEST records are retained for one day. The create_config.info question that sets the value is: Global:How long (in seconds) should EVENT_REQUEST records be retained before being purged from the system?
For more information, see the discussion about the Task Manager configuration parameters in MATRIXX Installation and Upgrade.
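Several of the values above reduce to simple arithmetic, which the following Python sketch works through: converting the day-based defaults into the seconds that the questions expect, checking the default 11 percent fragmentation trigger, and checking an event cleanup threshold against its limit. All function names are hypothetical and this is not the create_config.py implementation. The extended data segment size used here (63600640 bytes) is not stated in this section; it is inferred as the difference between the two numbers in the example error message.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days):
    """Convert a retention period in days to the seconds the questions expect."""
    return days * SECONDS_PER_DAY


def gc_due(fragmented_bytes, segment_bytes, threshold_percent=11):
    """True when fragmentation reaches threshold_percent of the segment size."""
    return fragmented_bytes * 100 >= threshold_percent * segment_bytes


def max_event_cleanup_threshold(max_db_size_bytes, segment_size_bytes):
    """Largest allowed threshold: maximum database size minus one segment."""
    return max_db_size_bytes - segment_size_bytes


def check_event_cleanup_threshold(threshold, max_db_size_bytes, segment_size_bytes):
    """Raise ValueError if the threshold exceeds the allowed limit."""
    limit = max_event_cleanup_threshold(max_db_size_bytes, segment_size_bytes)
    if threshold > limit:
        raise ValueError(
            f"event cleanup threshold cannot be more than {limit} bytes"
        )
    return threshold


MAX_DB_SIZE = 65127055360  # maxSize from the example error message
SEGMENT_SIZE = 63600640    # assumption: inferred from the example error message

print(days_to_seconds(45))  # default retention for expired offers and balances
print(days_to_seconds(1))   # default retention for EVENT_REQUEST records
print(max_event_cleanup_threshold(MAX_DB_SIZE, SEGMENT_SIZE))
```

Running a proposed threshold through a check like this before editing create_config.info can catch a value that is too large without waiting for the configuration script to fail.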
Manual Cleanup Operations
If the available memory is nearing the low threshold, you can perform the following operations to increase the available amount:
- Restart servers on a regular basis to clean up stale processes and memory.
- Increase the size of the total system memory. The system uses shared memory for all databases, queues, and other internal structures. The create_config.info question that sets the maximum sizes of these structures is: What is the shared memory size in MB to use?