Expiring TADDM Data - TADDM/taddm-wiki GitHub Wiki
Update:
- taddm_archive.jy updated on 12/18/2015. Major changes:
  - Only supported on v7.3
  - Threading added for performance
  - Ability to skip systems that failed discovery recently, to avoid deleting systems that are on the remediation list
  - archive.ps1 added to support PowerShell on Windows
  - skip_zos argument added for customers running zDLA in delta mode
  - Added flag to print lastStoredTime
  - Added flag to use a specific epoch instead of Age (in days)
  - Updated views for 7.3 to include new views for db calls
  - Updated to use the upgraded Jython v2.5.3 in TADDM v7.3
Please ensure you download the correct version!
In most deployments, TADDM is considered to be a list of active systems. Keeping that list trustworthy requires a set of processes that keep the data accurate. One part is a discovery remediation process that resolves discovery problems; the other is the removal of dormant components.
Dormant Components in TADDM are machines that have not been discovered in a particular period of time, for example 90 days. Since TADDM collects and maintains all of the data it has discovered indefinitely, it is necessary to expire data in order to reduce the size and complexity of the database.
The script taddm_archive.jy can do this for you and will be discussed below – along with command line methods.
These scripts are minimally tested and provided as is, as an example of how this can be done. It is highly recommended that they be tested in your environment before you rely on them in production.
- Determine a time-frame. It should be a multiple of your discovery period. For example, if you discover your whole environment once a month, then expiring things after 90 days should mean that anything old was attempted at least 3 times and is most likely gone.
- Attempt a rediscovery: Use the '-s' option of taddm_archive.jy to export the systems older than your period (90 days in this example) to attempt a rediscovery:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 90 -C ComputerSystem -s OlderThan90
bash# ../bin/loadscope.jy -s "OldMachinesScope" load ./OlderThan90.txt
Now you should be able to go to your discovery server and run a discovery against this scope.
- If necessary, archive the remaining machines: If you are concerned about losing any data, archive the remaining old machines when you delete them.
- Delete the machines: Run taddm_archive.jy regularly (weekly) to handle your dormant machines. To archive and remove the dormant data, do the following:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 30 -C ComputerSystem -S -E export -D
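The time-frame rule in the first step is simple arithmetic: the expiry age is the discovery interval multiplied by the number of discovery attempts a system should miss before it counts as dormant. A minimal sketch, assuming a hypothetical helper named `expiry_age_days`; it only prints the listing command and does not run it.

```shell
# Hypothetical helper: expiry age in days.
# $1 = discovery interval in days, $2 = number of missed discovery attempts
expiry_age_days() {
    echo $(( $1 * $2 ))
}

# Monthly discovery, three missed attempts.
age=$(expiry_age_days 30 3)

# Print (do not run) the command that lists objects older than that age.
echo "./taddm_archive.jy -A $age -C ComputerSystem"
```

Listing without -D first is a cheap way to see what a given age would remove before committing to it.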
Objects in the TADDM database maintain a 'LastModifiedTime' which indicates the last time the object was discovered or any components of the object were changed. This attribute can be used to determine what objects have not been modified in a certain amount of time.
LastModifiedTime is in milliseconds since January 1, 1970 (the epoch). To compute a value to compare against, you can use:
bash# now=$(date +%s) # The current time in seconds
bash# echo $now
1273777585
bash# days_ago_30=$((now-30*24*3600)) # 30 days ago in seconds
bash# echo $days_ago_30
1271185585
The values above are in seconds, not milliseconds. Multiply them by 1000 before using them in the following query.
An example using the api.sh script to find all ComputerSystems that have not been modified in 30 days would be:
bash# $COLLATION_HOME/bin/sdk/api.sh -u user -p password find "select * from ComputerSystem where lastModifiedTime<'1271185585000'"
You could also use AppServer or some other ClassName in the above Query rather than ComputerSystem
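The cutoff arithmetic and the query text can be wrapped in two small functions. This is a minimal sketch: `cutoff_ms` and `build_query` are hypothetical names, and the query string simply follows the api.sh example above. Passing the reference time as a second argument makes the result reproducible.

```shell
# Hypothetical helper: epoch time in milliseconds for N days before a reference time.
# $1 = age in days, $2 = reference time in epoch seconds (defaults to now)
cutoff_ms() {
    local days="$1"
    local now="${2:-$(date +%s)}"
    # Subtract the age in seconds, then convert seconds to milliseconds.
    echo $(( (now - days * 24 * 3600) * 1000 ))
}

# Hypothetical helper: query text for use with api.sh find.
# $1 = class name, $2 = cutoff in milliseconds
build_query() {
    printf "select * from %s where lastModifiedTime<'%s'" "$1" "$2"
}

# Reproduces the worked numbers above (now = 1273777585, age = 30 days).
build_query ComputerSystem "$(cutoff_ms 30 1273777585)"
echo
```

The printed string can then be passed as the last argument to `$COLLATION_HOME/bin/sdk/api.sh -u user -p password find`.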
To list the objects in a class that are older than a specific age in days, run:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 30 -C ComputerSystem
This will return a list of the GUIDs and DisplayNames of objects that have not been modified in 30 days.
It may be useful to save the old systems in XML format before deleting them.
Using the lastModifiedTime cutoff from above, the following api.sh command exports all ComputerSystems that have not been modified in 30 days:
bash# $COLLATION_HOME/bin/sdk/api.sh -u user -p password find --depth 3 "select * from ComputerSystem where lastModifiedTime<'1271185585000'" >> export.xml
You could also use AppServer or some other ClassName in the above Query rather than ComputerSystem
Using the taddm_archive.jy script, you have two archive options:
-S – This saves each GUID at Depth 3 in an XML file in the current directory
-E – Gives you the option to specify the directory where you want this saved.
For example:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 30 -C ComputerSystem -S -E export
This will save all ComputerSystems not modified in 30 days as .XML in ./export
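The usage text for -E states that the export directory must already exist, so it is worth creating it before the run. A minimal sketch, assuming a hypothetical helper named `prepare_export`; it creates the directory and prints the archive command rather than running it.

```shell
# Hypothetical helper: make sure the directory passed to -E exists,
# then print (not run) the matching archive command.
# $1 = age in days, $2 = class name, $3 = export directory
prepare_export() {
    mkdir -p "$3" || return 1
    echo "./taddm_archive.jy -A $1 -C $2 -S -E $3"
}

# Example (run from $COLLATION_HOME/custom):
#   prepare_export 30 ComputerSystem export
```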
Using the taddm_archive.jy script, you can also generate a scope file for re-discovery:
-s ScopeName – This attempts to get the IP for all objects to be deleted and creates a file named ScopeName.txt that can be loaded with loadscope.jy
-E – Gives you the option to specify the directory where you want this saved.
For example:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 30 -C ComputerSystem -s Olderthan30 -E scope
This will save all ComputerSystems not modified in 30 days into a file Olderthan30.txt in ./scope that can be loaded as a scope for re-discovery using dist/bin/loadscope.jy
Note that if disableDNSLookups is set to true (default is false) in collation.properties on the TADDM server where you are executing the script, then the -s scope generation will not work.
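A quick pre-flight check can catch this before a scope run. The sketch below makes assumptions: `dns_lookups_disabled` is a hypothetical helper, the location `etc/collation.properties` under `$COLLATION_HOME` is assumed, and the match is on any uncommented key that ends in `disableDNSLookups` with the literal value `true`.

```shell
# Hypothetical helper: exit status 0 when the properties file sets an
# uncommented key ending in disableDNSLookups to true.
# $1 = path to a properties file
dns_lookups_disabled() {
    grep -Eq '^[[:space:]]*[^#[:space:]]*disableDNSLookups[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Assumed location of collation.properties; adjust for your installation.
props="${COLLATION_HOME:-/opt/IBM/taddm/dist}/etc/collation.properties"

if [ -f "$props" ] && dns_lookups_disabled "$props"; then
    echo "disableDNSLookups is true: -s scope generation will not work" >&2
fi
```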
In order to delete the systems that are old you would do the following:
Using the xml file that was saved in the export, you would need to parse all of the GUIDs out and then delete them one by one using the following command:
bash# $COLLATION_HOME/bin/sdk/api.sh -u user -p password delete GUID
It is beyond the scope of this document to describe how to parse the xml and retrieve the GUIDs
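Once the GUIDs have been extracted, the one-by-one delete is a simple loop. The sketch below assumes the GUIDs are already in a plain text file, one per line (a hypothetical file such as `guids.txt`); `print_delete_commands` is a hypothetical helper that only prints the api.sh delete commands so they can be reviewed before anything is removed.

```shell
# Hypothetical helper: print one api.sh delete command per GUID.
# Blank lines and lines starting with '#' are skipped.
# $1 = file containing one GUID per line
print_delete_commands() {
    local guid
    # The extra test handles a last line that has no trailing newline.
    while IFS= read -r guid || [ -n "$guid" ]; do
        case "$guid" in
            ''|'#'*) continue ;;
        esac
        printf '%s -u user -p password delete %s\n' \
            '$COLLATION_HOME/bin/sdk/api.sh' "$guid"
    done < "$1"
}

# Example:
#   print_delete_commands guids.txt > delete_commands.sh
```

Reviewing the generated file before executing it gives one last chance to catch a GUID that should not be deleted.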
The taddm_archive.jy script can delete the old objects using the following syntax:
bash# cd $COLLATION_HOME/custom
bash# ./taddm_archive.jy -A 30 -C ComputerSystem -D -S -E export
It is not necessary to save (-S -E export) the data prior to deleting it. However, it is recommended if you are NOT doing regular database backups.
This will save all ComputerSystems not modified in 30 days as .XML in ./export and then also delete them from the system.
Download the following files to $COLLATION_HOME/custom. You may need to create the directory 'custom'. $COLLATION_HOME is by default /opt/IBM/taddm/dist.
- taddm_archive.jy
- Windows only: taddm_archive.bat
For non-Windows platforms, make the script executable by issuing the following command:
chmod a+rx taddm_archive.jy
For Windows, you need to execute the taddm_archive.bat wrapper script for jython to work properly.
bash$ ./taddm_archive.jy --help
usage: script_name [options]
or: script_name [options] -H displayName|-A days [ -S -D -C -E dir]
Return Model Objects that have not been modifed in AGE Days. Optionally
Save and Delete these objects
Options:
-u userid User required to login to TADDM Server
Defaults to 'administrator'
-p password Password for TADDM Server user
Defaults to 'collation'
-h print this message
Arguments:
-A age Use Age in DAYS as time to consider something old for
ClassName (defaults to ComputerSystem)
An age of '0' will return all objects in the class.
-S Save Objects that are 'Old' in XML format
-s scope_name Create a scope named 'scope_name' of old machines
-H displayName -- Ignore age and use this specific displayname
-C Classname to examine -- Defaults to ComputerSystem -- a good
alternative would be AppServer
-L num Limit results to a maximum of num
-D Delete the objects that are considered OLD
-E dir Export Directory (must exist) defaults to current directory
-c Limit to Classname only and do not include the sub-classes --
For example, query ComputerSystem instances but not
WindowsComputerSystem instances
--l2orphans Use all L2Interface instances that are orphans irrespective of
age. -C option must be 'L2Interface'. -A option must be 0.
--chk_sups Check the superiors table to ensure that the instances are
not part of a naming rule
--skip_zos Skip all zOS components, recommended if running z/OS DLA in
delta mode
-q Display the SQL query used
--print_laststored Display LastStored in addition to GUID and DisplayName
--epoch=<time> Use specific epoch formated time instead of -A age option
--skip_failed=<weeks>
Skip deletion of components if there is a failed session sensor
matching the contexIp in the last <weeks> week(s).
-t num Number of threads to use for deletion, ignored if -D is not used,
Defaults to single thread
It is an important best practice to run data expiration on a regular basis. The following section provides tips and tools to help you get started with automating data expiration.
- classes.txt - This file lists all of the CI classes to expire. The classes listed by default are best practice. You may need to add or remove classes to fit the requirements for your environment. Use # to add comments or comment out classes for specific runs.
- archive.ps1/archive.sh - This script executes taddm_archive.jy for all of the CI classes in classes.txt according to best practices. It automatically creates sub-directories under %COLLATION_HOME%\custom for the classes and timestamp of the execution.
The default AGE variable is 182 and the default LIMIT variable is 1000000.
At the end of execution, the script sends an e-mail notification that the job has finished (Notification.ps1 is called on Windows and mailx on Unix). The script may need to be tweaked to get e-mail notification to work.
This script should not be considered final and will most likely require some tweaking for your requirements and environment. It is a good starting point for automating data expiration.
archive.sh has only been tested on Linux and may not work on AIX. Ensure that archive.sh is executable after downloading (chmod a+rx archive.sh).
- archive.properties - Edit this file to set properties for archive.sh/ps1. Make sure that you set archive.skipzos=true if you are running zDLA in delta mode.
- Notification.ps1 - WINDOWS ONLY. This is a PowerShell script to send e-mail notifications via SMTP. Edit the script to set the default $from variable and $smtpServer variable for your environment.
For notification to work on Windows, you may need to install PowerShell if you are running on Windows 2003. Also, make sure that you run 'Set-ExecutionPolicy Unrestricted' for both the 32-bit and 64-bit PowerShell at the command prompt. This only needs to be done once.
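To make the role of classes.txt concrete, the loop below shows one way a wrapper could turn that file into one taddm_archive.jy call per class. It is an illustrative sketch, not the shipped archive.sh: `print_archive_commands` is a hypothetical helper, the per-class export directory layout and the choice of -S -D are assumptions, and the commands are printed rather than executed.

```shell
# Hypothetical sketch (not the shipped archive.sh): print one
# taddm_archive.jy command per class listed in a classes file.
# $1 = classes file, $2 = age in days, $3 = result limit, $4 = base export dir
print_archive_commands() {
    local file="$1" age="$2" limit="$3" base="$4" line class
    while IFS= read -r line || [ -n "$line" ]; do
        # Drop everything from '#' onward, then remove whitespace.
        class=$(printf '%s' "${line%%#*}" | tr -d '[:space:]')
        [ -n "$class" ] || continue
        echo "./taddm_archive.jy -A $age -C $class -L $limit -S -D -E $base/$class"
    done < "$file"
}

# Example with the default AGE and LIMIT values mentioned above:
#   print_archive_commands classes.txt 182 1000000 export
```

Because -E requires an existing directory, each `export/<class>` directory in this layout would have to be created before the printed commands are run.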
Now simply use the Windows task scheduler or cron to schedule execution of archive.ps1 or archive.sh. It should be run as often as a full discovery is run.
When scheduling via the Windows task scheduler, you need to run the powershell command and pass archive.ps1 as the file argument.
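On Unix, the equivalent is a crontab entry. The sketch below prints such an entry instead of installing it; `archive_cron_line` is a hypothetical helper, the log file name `archive_cron.log` is an assumption, and the schedule should match how often a full discovery runs in your environment.

```shell
# Hypothetical helper: print a crontab entry that runs archive.sh.
# $1 = five-field cron schedule, $2 = TADDM dist directory
archive_cron_line() {
    printf '%s cd %s/custom && ./archive.sh >> archive_cron.log 2>&1\n' "$1" "$2"
}

# Example: 02:00 on the first day of every month.
archive_cron_line '0 2 1 * *' /opt/IBM/taddm/dist
```

The printed line can be added to the crontab of the user that runs TADDM with `crontab -e`.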
Known Issues
- In some TADDM distributions, the log tracing for the script might not work. If that happens, check tomcat.log.
- Processing BindAddress may hang the process or take too long to delete. Refer to the Dormant Data cleanup in TADDM blog entry section titled "TABLES TO USE DATABASE SQL TO CLEANUP" for a discussion of how to handle the initial clean up of the BindAddress table.
- zDLA deltas: The archive process assumes that all CI components are being discovered on a regular basis. If you are running the zDLA in delta mode, then it is not guaranteed that all CI components are being updated regularly and some components will appear dormant when they actually are not. For this reason, ensure that you use the skip_zos feature to skip all zOS components.
- It is possible that the -S option, if used, can cause OutOfMemoryException to occur. If this is the case then remove that option and ensure that you are doing proper database backups.
- Manually created components: If you have manually created components in TADDM that do not get updated regularly, those will eventually get cleaned up by this script. Keep that in mind.
- archive.sh fails on AIX: The script has only been tested on Linux so there may be some syntax errors on AIX. Try installing bash (and readlink) to see if that resolves the problem.
- Errors may occur when running on Windows due to a bug in dist\bin\jython_coll_253.bat. To resolve, look for %JYTHON_HOME%\lib\jython.jar and remove the lib reference so that it reads %JYTHON_HOME%\jython.jar.
- It is possible to see OutOfMemoryException, even without the -S option enabled. In this case, you may need to increase the memory available to the JVM. This can be done by adding -Xmx1024m to the last line of dist/bin/jython_coll_253[.bat].