| DBAmon |
|
Home |
Index / DBAmon Documentation |
|
|
Event Message Format - Example: DBA210W
Full Error Fail Consistency Check failed PANIC
dynamically allocated new shared memory segment
ALTER TABLE OWNER.TABLE STORAGE ( NEXT ?M );... so that the next extent size is less than the largest freespace area.
-Or
SQL Server has encountered n occurrence(s) of I/O requests taking longer than 15 seconds to complete on file... that were found during the most recent batch of SQL Log messages that were scanned.in database
A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 32992, committed (KB): 64696, memory utilization: 50%.This means that Lock Pages in Memory is not enabled.
Some good articles on TEMPDB:
Corrective Action:
Add TEMPDB files to get to the optimum DB file count mentioned in this message. Make
sure that they are of a uniform size and that the size is preallocated.
A high number of SQL Virtual Log Files (VLFs) can cause DB update problems and can elongate DB recovery time. Having a high number of VLFs is caused
by having a GROWTH setting on the TLOG with a low growth increment.
Here are some good articles:
A good script to consolidate VLFs can be found in the ADVENTURESINSQL article in the URLs above.
The steps:
This event is telling you that you are using the out-of-the-box default instance setting for "Max Server Memory (MB)".
This means that MSSQL is free to grab as much memory as it needs from the OS for the MSSQL Buffer Pool. Although it
may be a matter of opinion, it is a good practice to deliberately set the MSSQL Max Server Memory setting, rather than
let it default. As with any DBAmon event, you can suppress it with the Suppress_Events:
DBC parameter if you disagree.
Corrective Action:
I suggest Googling "sql server max server memory". You will see MANY hits.
Our pratice on a server that is ONLY running MSSQL is to leave 500MB-750MB for Windows, and up to 1GB for other MSSQL
memory (The max server memory parameter only specifies the size of the buffer pool), and then give what's left to max server memory.
Of course if your server is running other applications in addition to MSSQL, your mileage will vary (you will have to reserve some
memory for your app).
DBA858 - MSSQL Instance SIGNAL_WAIT_PCT=$a (Threshold=$b%),Signal-Wait-${this_signal_wait_pct}%
Description:
This performance event indicates that the current SQL SIGNAL WAIT % value exceeds the threshold (defaults to 25%).
A good internet article explaining the concept can be found:
Here
(Pinal Dave's SQLAuthority WWW site).
Corrective Action:
A sustained high SQL SIGNAL WAIT % indicates a need for additional CPUs.
DBA859 - $zvlf_event_cnt MSSQL Database(s) VLF Count >= DBC Threshold: $t_mssql_vlf_count[$thishost]","VLF-Count"
Description:
Corrective Action:
DBA860 - "MSSQL Instance Max Memory ($F3) is set to INFINITE ($sql_config_msm)","MaxMemINFINITE"
Description:
DBA861 - Database(s) Found With GOOD-For-Performance Options Set to OFF
Description:
DBAmon has found that there are one or more MSSQL databases which do not have OPTIONS specified which are good for MSSQL Performance.
For example, the AUTO CREATE STATISTICS option is known to be good for performance. So, if this OPTION is not specified, then this
event will occur.
Corrective Action:
Set the options mentioned in this event to ON or TRUE. If you have a good reason to have these turned off, then of course you can
suppress this event by specifying it in the "Suppress_Events" parameter of the DBC file for this instance.
DBA901 - "Oracle/NT Not Active - Status=$dbstatus"
Description:
Oracle is down.
Corrective Action:
Start Oracle.
DBA903 - "Oracle/NT Listener Not Active - Status=$lsnr_status"
Description:
DBAmon attempted to run 'tnsping $ORACLE_SID'. It failed.
Corrective Action:
Determine the cause of the problem and fix it!
DBA905 - "Connect Logic Error - Probe dbamon_orant.pl Did Not Finish"
Description:
A DBAmon probe module failed.
Corrective Action:
Examine accompanying messages. Contact
DBAmon Support.
DBA909 - "DBAMON.TIMESTAMP Has > $timestamp_rowlimit rows ($timestamp_rows[$thishost])"
Description:
The number of rows in this table exceeds the threshold. The purge process must not be working.
Corrective Action:
Contact BB.
DBA910 - "db_block_buffer Read Hit Ratio of $bufhitratio[$thishost] < threshold of $t_readhit[$thishost]"
Description:
The Oracle db_block_buffer hit ratio (specifed in message) is below the threshold specified for this instance by
the T_Read_Hit: DBC parameter (or the default). This DB is performing poorly.
Corrective Action:
Increase the db_block_buffers init.ora parameter.
DBA911 - "db_block_buffer Read Hit Ratio of $bufhitratio[$thishost] is invalid (< 0 or > 100)"
Description:
While examining the Oracle db_block_buffer hit ratio, the value was found to be
invalid. Check the accompanying text to see if some error occured while querying
the Oracle dictionary.
Corrective Action:
Solve the problem which was causing the query to return invalid data.
DBA912 - RMANHUNG: Found rman process (pid=$) that has been running for $ days (threshold=2 days) !!!
Description:
A Unix process containing the string rman was found to have been running for
more than the THRESHOLD-DAYS number of days long. It is probably a dead process
which may be consuming resources even though it is not doing anything useful.
Corrective Action:
Kill the hung process at the UX level. If RMAN was invoked from a backup script,
also make sure that you kill the script.
DBA913 - $num_otrace[$thishost] ORACLE_HOME/otrace/admin/*.dat Files Found - OTRACE Is ON Which Causes Performance Problems!!!
Description:
OTRACE is on for this DB because .dat files were found in ORACLE_HOME/otrace/admin.
OTRACE can be bad for performance, so it should be turned off. You turn it off by
rm'ing the .dat files in ORACLE_HOME/otrace/admin and restarting Oracle.
Corrective Action:
Turn off OTRACE. You turn it off by rm'ing the .dat files in ORACLE_HOME/otrace/admin and restarting Oracle.
DBA914 - Instance SQL_TRACE=TRUE - This Causes Performance Problems!!!
Description:
SQL_TRACE set to true at the instance level will cause serious performance problems.
Corrective Action:
Turn off SQL_TRACE. You turn it off by running ALTER SYSTEM SET SQL_TRACE=FALSE And/Or removing this setting
from init.ora.
DBA915 - $dfltsys_cnt DB Users Found With DEFAULT TABLESPACE Set To SYSTEM !!!
Description:
DBAmon found DB users whose default tablespace is SYSTEM. This is very bad for performance.
Corrective Action:
Alter the user so that their default tablespace is not SYSTEM.
DBA916 - $tempsys_cnt DB Users Found With TEMPORARY TABLESPACE Set To SYSTEM !!!
Description:
DBAmon found DB users whose temporary tablespace is SYSTEM. This is very bad for performance.
Corrective Action:
Alter the user so that their temporary tablespace is not SYSTEM.
DBA917 - $tempperm_cnt DB Users Found Whose TEMPORARY TABLESPACE Is a PERMANENT Tablespace !!!
Description:
DBAmon found DB users whose temporary tablespace is permanent tablespace. This is very bad for performance.
Corrective Action:
Alter the user so that their temporary tablespace is a TEMP tablespace, or alter their temp
tablespace to be a type=TEMP tablespace.
DBA918 - *** oraUp() DBA_REGISTRY Shows that component (Oracle9i Catalog Views ) is at version 9.2.0.2.0 which is less than DB engine version 9.2.0.3.0(64) - rg_status=VALID ***
Description:
Starting in 9i, the Oracle dictionary catalog contains components that are registered
products. This event can also occur for DB internal components like Java. The event
means that the version of the product mentioned is lower than the version of the
DB engine. What probably happened is that the DB was upgraded within the same version
(9i for example) and catpatch was not run.
Corrective Action:
Run:
Next, verify that the versions of the internal components match the DB engine version.
Run orastat -rg.
DBA919 - MTS is being used for this Non-RAC/OPS instance - Bad for performance - mts_queue=$mts_queue[$thishost]
Description:
This instance is not using RAC or OPS, but MTS is being used. This can cause major
performance problems. So, it would be best to only use DEDICATED SERVER connections.
Corrective Action:
An easy way to effectively disable MTS is to set USE_DEDICATED_SERVER=ON in sqlnet.ora.
DBA920 - TEMP Tablespace is a PERMANENT Tablespace (It Should Be TEMPORARY) !!!
Description:
DBAmon found that your tablespace named TEMP is a permanent tablespace.
This can be very bad for performance. Any user which has this tablespace
specified for its TEMPORARY TABLESPACE will perform poorly when doing disk
sorts.
Corrective Action:
Alter the TEMP tablespace so that it is a type=TEMP tablespace. SQL:
ALTER TABLESPACE TEMP TEMPORARY;
DBA921 - Library Cache Hit Ratio is $libhitratio[$thishost]% (Should Be >= 90%) - Increase shared_pool_size !!!
Description:
The Oracle library cache hit ratio was found to be < 90%.
Corrective Action:
Increase the init.ora shared_pool_size parameter.
DBA922 - Dictionary Cache Hit Ratio is $dicthitratio[$thishost]% (Should Be >= 90%) - Increase shared_pool_size !!!
Description:
The Oracle dictionary cache hit ratio was found to be < 90%.
Corrective Action:
Increase the init.ora shared_pool_size parameter.
DBA923 - Rollback Segment Header Waits Are Too High (Gets/Waits Ratio > 1.00%) - Add Rollback Segments
Description:
The total number of Rollback Segments Header Waits exceed 1% of the total number of Rollback
Segment Header Gets.
Corrective Action:
Add more rollback segments.
DBA924 - There are fewer than 10 Free DB Cache Buffers (free_buffers=$free_buffers[$thishost]) - It would be beneficial to increase DB Cache Size
Description:
The total number of FREE DB Cache buffers is less than 10. So, this DB would probably benefit
from you specifying a large DB buffer cache.
Corrective Action:
If you have enough unused memory, increase the size of the DB Buffer Cache.
DBA925 - UNDO_MANAGEMENT Is not set to AUTO - It should be - undo_mgmt=$undo_mgmt[$thishost]
Description:
In Oracle 9i and higher, SMU (System Managed UNDO) is a GOOD THING. It should always be on.
For this 9i+ instance, it is not turned on.
Corrective Action:
Turn on SMU. This will require DB downtime.
DBA926 - FORCE LOGGING Should be turned on - force_logging=$force_logging[$thishost]
Description:
There is a very nice feature in Oracle 9 and higher called FORCE LOGGING. If this DB option is ON, then
NOLOGGING operations are all automatically disallowed.
Corrective Action:
Run: ALTER DATABASE FORCE LOGGING
DBA927 - ? Dictionary Objects have been ANALYZED - This is bad for performance
Description:
Analyzing the Oracle Dictionary can cause some very serious and hard to diagnose performance problems.
One symptom is high "recursive CPU" in a statspack report. If any SYS objects other than DUAL or
PLAN_TABLE have been analyzed, this event will occur.
Corrective Action:
Remove the analyze data for the SYS objects.
Run: execute dbms_stats.delete_schema_stats('SYS');
DBA928 - DB Cache is only 1 granule in size - It was probably underspecified - granule_size=$granule_size[$thishost] db_cache _size=$db_cache_size[$thishost]?
Description:
It was determined that this 9i or higher instance has a DB Buffer Cache that is only 1 granule
in size. This can occur when Oracle sees that you have specifed a db_cache_size that is less
than 1 granule. In this case Oracle will round up to 1 granule.
Corrective Action:
It is better to intentionally specify the cache size. And the default is 48M, so a good
minimum db_cache_size is 50M. Change the init.ora file and bounce the instance.
DBA929 - Server Memory is $hw_memory_pct_full[$thishost] used (t=$hw_memory_threshold_w/$hw_memory_threshold_c physmem=$h
w_memory_size_gb[$thishost](gB) memfree=$hw_memory_free[$thishost](gB))
Description:
This server has very little free memory. On HP-UX, this is bad for performance (paging
and swapping increase).
Corrective Action:
Reduce memory usage or increase the amount of memory. If you have any Oracle
instances with overallocated SGA memory, reduce memory consumption.
DBA930 - The instance default PERM tablespace is set to SYSTEM - Could cause performance problems
Description:
The DEFAULT TEMPORARY or PERMANENT tablespace is set to SYSTEM. If non-dictionary
objects are create in SYSTEM, performance problem probably will result.
Corrective Action:
ALTER DATABASE DEFAULT TABLESPACE tsname; (10g+ only)-or-
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE tsname; (9i+ only)
DBA931 - RMAN Process $F5 was automatically KILLED
Description:
An RMAN process was found running on this server which:
This is indicative of an orphan RMAN process which is consuming CPU resources and is not accomplishing anything useful. DBAmon
has issed the OS kill command against this process.
Corrective Action:
(None)
DBA932 - UX NUSERPROC (HP-UX maxuprc Kernel Value) $os_nuserproc_pct[$thishost]% Used - MAXUPRC=$os_maxuprc[$thishost] OSUserProcCount=$os_ nuserproc[$thishost] (Thresholds: $t_nuserproc_w[$thishost]%/$t_nuserproc_c[$thishost]%)
Description:
This event is unique to HP-UX. There is a UX kernel parameter maxuprc. This parameter
controls the number of OS processes that any 1 UX userid can have running concurrently. If
this is exceeded, the OS will not fork any new processes until the process count is reduced
below this value. This can be VERY BAD for a running DB. DBAmon monitors the current OS process
count against the configured maxuprc kernel value as a percentage.
Corrective Action:
DBA934 - Complex DB User Passwords Enforced
Description:
Information only event. This DB has a UTLPWDMG routine active (password_verify_funtion)
in the DEFAULT profile.
Corrective Action:
No action required.
DBA940 - DB Block Corruption - $F3 Block segments
Description:
Rows were found in V$DATABASE_BLOCK_CORRUPTION.
Corrective Action:
Restore the corrupted blocks or datafiles or drop corrupt datafile.
DBA941 - DB Has $autoextend_cnt[$thishost] AUTOEXTEND=YES datafiles
Description:
This DB has at least 1 datafile with Autoextend set to YES. This makes it impossible for DBAmon to monitor for full tablespaces.
Corrective Action:
This event does not indicate a problem with your DB, but DBAmon will only monitor for
tablespace full if you disable this attribute for all tablespaces.
DBA942 - MSSQL Error=18456 Invalid Login(s) Found in Error Log
Description:
DBAmon has detected one or more 18456 SQL Login Failure events during this DBAmon iteration.
Corrective Action:
Depends on your shop. For details of where the failed login attempt originated and the login name, look at the SQL Error Log.
If you're looking for a good guide on how to troubleshoot MSSQL 18456 errors, try:
Aaron Bertrand's Excellent Article.
DBA943 - Found $lr_job_cnt Long Running SQL Job(s)
Description:
DBAmon monitors MSSQL for long running SQL Jobs. This event means that one or more SQL Jobs has been running for at least
12 hours.
Corrective Action:
If this job should be running for more than 12 hours, then add all or part of the Job name to the DBC parameter:
MSSQL_Job_LR_Check_Name_Excl_String: .
DBA944 - Large DB Transaction(s): $tr_reason
Description:
DBAmon monitors MSSQL for long running SQL Transactions. This event means that one or both of these conditions have occurred:
Corrective Action:
If this job should be running for more than 12 hours, then add all or part of the Job name to the DBC parameter:
MSSQL_Job_LR_Check_Name_Excl_String: .
DBA945 - MSSQL SvcAcct=$mssql_service_account[$thishost] Exception - Password Will Expire in $svcacct_return Days at: $svcacct_ts (LE $svcacct_t_crit)
Description:
DBAmon monitors the number of days until the Password for the MSSQL AD Service Account expires. This event
means that either the WARNING or CRITICAL thresholds have been met.
Corrective Action:
You must change the MSSQL Service Account password before it expires, but you also must change the SERVICE Login Credentials.
DBA948 - STACK DUMP(s) Found in SQL Log
Description:
DBAmon found occurrence(s) of STACK DUMP in the SQL Log. This indicates a serious problem with your MSSQL instance.
Corrective Action:
I would first try a Google search. If you don't find a good hit, then open a support case with the vendor.
DBA949 - COMPATIBLE Version ($this_compat) is < DBMS software version ($this_ver)
Description:
The COMPATIBLE parameter is set a full version lower than the version of the DBMS software.
Corrective Action:
Set COMPATIBLE to the same version as the DBMS software.
DBA955 - "IO Slave Count Of $numioslave Is >= $maxioslave_pct% of $maxioslave_cnt Maximum"
Description:
The number of I/O slave processes is approaching 40. This is probably caused by hung RMAN processes, or dbwr_io_slaves set near 40 (the maximum).
Corrective Action:
If 40 is reached, you will not be able to run any RMAN backups. In that
case, bounce the instance.
DBA956 - "Server cron Daemon does not appear to be running - num_cron=$num_cron[$thishost]"
Description:
There is no cron daemon owned by root running on this server.
Corrective Action:
The cron daemon needs to be started. Own this ticket to the OS group.
DBA990 - ","Invalid Data: Drive=($dr_drive) shows a total size of ZERO - Something may be wrong with WMI?","DriveNG","DBA990"
Description:
This is a critical event which is telling you about a near-fatal data problem. The message will vary depending on the problem.
Corrective Action:
Depends. :@) The error should give you a starting place.
DBA998 - "DB=$tl_dbname - TLog with UNLIMITED growth - Drive $tl_drive is $tl_drivepct full ($event_sev_long Threshold of $event_pct% exceeded - TLSize=$tl_tlsize(mB) TLPath=$tl_filename)"
Description:
The transaction log is full or almost full. This particular TLOG does not have a size
limit. So, it is >= 90% full internally, and the disk where the TLOG resides is
full or almost full.
Corrective Action:
Backup the transaction log.
This Document:
http://dbamon.com/errors/index.shtml