2022-09 PROFILE Table usage Part One

This month I begin a two-part topic because it is just too large to do in one blog entry!

In the Beginning

The DSN_PROFILE_TABLE was introduced sometime in DB2 V8, but it was not until DB2 9 that it started to be used for system profiling when IBM introduced three new commands: DISPLAY PROFILE, START PROFILE and STOP PROFILE. This first appearance of PROFILES was a bit limited and could control only a few ZPARMs – and four of those just for EXPLAIN purposes.

How does/did it look?

To get it working, you must first create all the required tables and indexes. (The DDL is in the db2hlq.SDSNSAMP member DSNTIJSG.) In bold and italics are the DB2 10 and higher versions:

SYSIBM.DSN_PROFILE_TABLE
CREATE TABLE SYSIBM.DSN_PROFILE_TABLE 
      ( "AUTHID"                VARCHAR(128)
       ,"PLANNAME"              VARCHAR(24)
       ,"COLLID"                VARCHAR(128)
       ,"PKGNAME"               VARCHAR(128)
       ,"LOCATION" "IPADDR"     VARCHAR(254)
       ,"PROFILEID"             INTEGER       NOT NULL
           PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
       ,"PROFILE_TIMESTAMP"     TIMESTAMP     NOT NULL WITH DEFAULT
       ,"PROFILE_ENABLED"       CHAR(1)       NOT NULL DEFAULT 'Y'
        ,"GROUP_MEMBER"          VARCHAR(24)                       
       ,"REMARKS"               VARCHAR(762)                       
       ,"ROLE"                  VARCHAR(128)                      
       ,"PRDID"                 CHAR(8)                           
       ,"CLIENT_APPLNAME"       VARCHAR(255)                      
       ,"CLIENT_USERID"         VARCHAR(255)                      
       ,"CLIENT_WRKSTNNAME"     VARCHAR(255)                      
      );                                                          
CREATE UNIQUE INDEX SYSIBM.DSN_PROFILE_TABLE_IX_ALL
      ON SYSIBM.DSN_PROFILE_TABLE                 
      ( "PROFILEID"                               
      );                                          
CREATE        INDEX SYSIBM.DSN_PROFILE_TABLE_IX2_ALL
      ON SYSIBM.DSN_PROFILE_TABLE                 
      ( "PROFILE_ENABLED"                         
       ,"AUTHID"                                  
       ,"PLANNAME"                                
       ,"COLLID"                                  
       ,"PKGNAME"                                 
       ,"LOCATION" "IPADDR"                        
       ,"PRDID"                                   
       ,"ROLE"                                    
       ,"CLIENT_APPLNAME"                         
       ,"CLIENT_USERID"                            
       ,"CLIENT_WRKSTNNAME"                       
       ,"GROUP_MEMBER"                            
       ,"PROFILE_TIMESTAMP" DESC                  
      );                                          

SYSIBM.DSN_PROFILE_HISTORY – Same columns as DSN_PROFILE_TABLE apart from

REMARKS -> STATUS VARCHAR(254) and no index.
SYSIBM.DSN_PROFILE_ATTRIBUTES
CREATE TABLE SYSIBM.DSN_PROFILE_ATTRIBUTES                        
      ( "PROFILEID"             INTEGER       NOT NULL            
           REFERENCES SYSIBM.DSN_PROFILE_TABLE ON DELETE CASCADE  
       ,"KEYWORDS"              VARCHAR(128)  NOT NULL            
       ,"ATTRIBUTE1"            VARCHAR(1024)                     
       ,"ATTRIBUTE2"            INTEGER                           
       ,"ATTRIBUTE3"            FLOAT                             
       ,"ATTRIBUTE_TIMESTAMP"   TIMESTAMP     NOT NULL WITH DEFAULT
       ,"REMARKS"               VARCHAR(762)                       
      );                                                          
CREATE UNIQUE INDEX SYSIBM.DSN_PROFILE_ATTRIBUTES_IX_ALL          
      ON SYSIBM.DSN_PROFILE_ATTRIBUTES                            
      ( "PROFILEID"                                                
       ,"ATTRIBUTE_TIMESTAMP"   DESC                              
       ,"KEYWORDS"                                                
       ,"ATTRIBUTE1"                                              
       ,"ATTRIBUTE2"                                              
       ,"ATTRIBUTE3"                                              
      );                                                           

SYSIBM.DSN_PROFILE_ATTRIBUTES_HISTORY same columns as DSN_PROFILE_ATTRIBUTES apart from

REMARKS -> STATUS VARCHAR(254)  and no index.

Notice the RI between the DSN_PROFILE_ATTRIBUTES and DSN_PROFILE_TABLE keyed on PROFILEID. Also notice that there are no indexes on the HISTORY tables and also no RI.

So What Could You Do?

With this new functionality you could use a profile to override four ZPARMs, namely NPGTHRSH, OPTIOWGT, STARJOIN and SJTABLES. To do so, you first inserted a row into the DSN_PROFILE_TABLE with some sort of filter, at this time only COLLID and PKGNAME, and then one or more inserts in the DSN_PROFILE_ATTRIBUTES table using the PROFILEID that you either just used, or got generated for you, in the DSN_PROFILE_TABLE using the KEYWORDS column and ATTRIBUTEn column(s).

Always on?

The column PROFILE_ENABLED in the DSN_PROFILE_TABLE informs Db2 whether or not to consider this profile when the START PROFILE command is issued. Setting it to N puts all of this profile’s records “to sleep”.

Not Just ZPARMs

It also enabled three global changes (no filters allowed) for BPname, MAX_RIDBLOCKS and SORT_POOL_SIZE. All of these are just for modelling production systems in test to then get a better, more accurate, EXPLAIN result and have *no* effect on the actual system at all.

Finally, IBM added some Accelerator-only support which had to be done with IBM involved.

Interestingly enough, there was a complete chapter about using profiles to monitor and report on SQL but there was also an update to the docu:

Important: The use of profile tables to monitor and capture information about the performance of SQL statements is deprecated, and not recommended.

So, I will not even bother going into detail about the monitor settings.

What Was the Difference?

The major difference between the SQL and ZPARM settings, was the ability to use different filter column values like AUTHID or IPADDR/LOCATION.

The DSN_PROFILE_HISTORY has the same columns as the DSN_PROFILE_TABLE, except that REMARKS is called STATUS and gets a value set by the START PROFILE command. Basically, a string that starts with REJECTED – or ACCEPTED – and then a text string describing why the profile was, or was not, accepted for use.

What’s in an ATTRIBUTE?

The DSN_PROFILE_ATTRIBUTES table contains the option that should be overridden when the Profile is active and the filtering allows it. The columns of interest are KEYWORDS and the three ATTRIBUTEn columns.

BUFFERPOOL Modelling

BPname (where name is any of the valid names like 0 through 49 or 32K1 through 32K9 etc.) in the KEYWORDS column. ATTRIBUTE1 and ATRIBUTE3 are set to NULL and ATTRIBUTE2 contains a positive integer value for the size of the BUFFERPOOL (for production modelling).

RIDPOOL Modelling

MAX_RIDBLOCKS in the KEYWORDS column, ATTRIBUTE1 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE2 contains a value from 0 to the maximum value that you can set MAXRBLK in that subsystem (for production modelling).

STARJOIN Control

STAR JOIN in the KEYWORDS column, ATTRIBUTE2 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE1 set to DISABLE or ENABLE.

MIN STAR JOIN TABLES in the KEYWORDS column, ATTRIBUTE1 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE2 contains a value from 3 to 225.

INDEX ACCESS Control

NPAGES THRESHOLD in the KEYWORDS column, ATTRIBUTE1 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE2 contains one of the following values:

 -1 use index access if possible

 0 access path based on cost, the normal way Db2 works

 1 to nnnn Db2 should use index access on tables for which the total number of pages (NPAGES) is less than nnnn. Make sure that your Db2 Catalog statistics are up to date before you specify a value of 1 or greater.

IO Control

IO WEIGHTING in the KEYWORDS column, ATTRIBUTE2 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE1 is set to DISABLE or ENABLE (deprecated in DB2 10).

SRTPOOL Modelling

SORT_POOL_SIZE in the KEYWORDS column, ATTRIBUTE1 and ATTRIBUTE3 are set to NULL, and ATTRIBUTE2 set to a positive integer up to the maximum value of SRTPOOL. That is the new SRTPOOL (for production modelling).

Production Modelling

In this case, the EXPLAIN output got changed to output which PROFILE value was active at the time of the EXPLAIN. The REASON column in the DSN_STATEMNT_TABLE gets set to “PROFILEID nnnn” for the profile number that was active at the time of the EXPLAIN.

When Was this Done?

The DSN_PROFILE_ATTRIBUTES_HISTORY has the same columns as the DSN_PROFILE_ATTRIBUTES_TABLE except that REMARKS is called STATUS and gets a value set by the START PROFILE command. Basically, a string that starts with REJECTED – or ACCEPTED – and then a text string describing why the profile was, or was not, accepted for use.

So that was it for DB2 9 – not that much but a very good start if you ask me!

System Profile Monitoring

Then in DB2 10 came “system profile monitoring”, which is where this system got very useful indeed! It then got the ability to Monitor Connections, Monitor Threads and Monitor Idle Threads.

New Keywords for Connections and Threads!

MONITOR CONNECTIONS in the KEYWORDS column, ATTRIBUTE1 is a “two part” column value. The first part is either WARNING or EXCEPTION. A warning causes a console message every five minutes depending on the diagnosis level. An exception issues the diagnosis level and rejects any new incoming connection requests. The second part is either not there or it is _DIAGLEVEL1 which issues a DSNT771I console message and _DIAGLEVEL2 which issues a DSNT772I console message with more details. ATTRIBUTE2 is a positive integer to indicate the threshold for the maximum number of remote connections. It must be less than or equal to CONDBAT. ATTRIBUTE3 is NULL. Filtering is only by the LOCATION column.

MONITOR THREADS in the KEYWORDS column, ATTRIBUTE1 is a “two part” column value. The first part is either WARNING or EXCEPTION. A warning causes a console message every five minutes depending on the diagnosis level. An exception issues the diagnosis level and can cancel the thread depending on the filtering criteria otherwise the thread is queued. The second part is either not there or it is _DIAGLEVEL1 which issues a DSNT771I console message and _DIAGLEVEL2 which issues a DSNT772I console message with more details. ATTRIBUTE2 is a positive integer to indicate the threshold for the maximum number of server threads. It must be less than or equal to MAXDBAT. ATTRIBUTE3 is NULL. Filtering on nearly all columns is allowed.

Db2 11 Docu Update

In Db2 11, an extra bit of documentation was added when filtering by Collection identifier, package name, client user name, client application name or client workstation name. When the total number of queued and suspended threads exceeds the threshold, Db2 fails subsequent SQL statements and returns SQLCODE -30041 to the client.

For example, suppose that a profile for a package is started. That profile uses ATTRIBUTE2=2. If five threads request to run the package, two threads run concurrently, two threads are queued and suspended, and Db2 fails the SQL statements for the fifth thread.

And Finally IDLE?

MONITOR IDLE THREADS in the KEYWORDS column, ATTRIBUTE1 is a “two part” column value. The first part is either WARNING or EXCEPTION. A warning causes a console message every five minutes, depending on the diagnosis level. An exception issues the diagnosis level and cancels the idle thread. The second part is either not there or it is _DIAGLEVEL1 which issues a DSNT771I console message or _DIAGLEVEL2 which issues a DSNT772I console message with more details or WARNING_MESSAGE_FOR_IDLE_TIMEOUT (only for WARNING) which issues DSNT771I and/or DSNT773I. ATTRIBUTE2 is a positive integer to indicate the threshold for the maximum number of seconds an active server thread can stay idle.

That’s all for this month, next month I will go into detail about the Filters, the new stuff In Db2 11, 12 and 13 as well as examples of different things you can do nowadays.

As always I would love to hear any comments or criticism about this topic!

TTFN

Roy Boxwell

2022-08 IDUG Boston Review

This month is a quick run through of the z/OS presentations from the IDUG NA22 Boston – the first in-person event in three years!

It was great to actually meet and greet real people again! The only problem I had, was the extreme cold of the Hotel rooms: The expo was set to be like a freezer and, for a European person whose normal air-conditioning is an open window, it was a pretty uncomfortable experience.

If you were there physically, or even virtually, you can now download all the PDFs from all the tracks, so I have grabbed all the A, B, E Tracks and half of the F and G Tracks (Only the z/OS relevant stuff for me!)

Off we Go in Alphabetical Sequence

A01 Create value from data and where the DBA counts is an excellent overview of the modern world and where data and DBAs sit. It also contains a bunch of very nice SQL that you can indeed simply run in your shops as cut-and-paste. (I did!)
access content at IDUG.org (appropriate IDUG access required)

A02 was a very good intro into all the performance changes in Db2 12 and 13 (Check out my Db2 13 blog post as well for that matter!) and also on the hardware side with z15 and the brand new z16 box!
access content at IDUG.org (appropriate IDUG access required)

A03 & A04 were all about AI, including a bunch of example SQLs for the three new AI BiFs in Db2 13 and how to get it all working, as well as Distributed Connection control.
access content at IDUG.org (appropriate IDUG access required)
access content at IDUG.org (appropriate IDUG access required)

A05 was all about getting “value” from your Db2. Are you really using all the “newest” functionality that you could?
access content at IDUG.org (appropriate IDUG access required)

A06 from Haakon Roberts was an excellent update all around Utilities and latest APARs. The highlight being, at least for me, the ICLIMIT TAPE for REORG which finally enables easy migration to UTS PBR RPN Tablespaces. The heads up about the LOAD FORMAT DELIMITED was also good!
access content at IDUG.org (appropriate IDUG access required)

A07 concerned The Trilogy of originating SQLs and how to measure and tune them. At the end was an extra part called “Additional Tuning” that is well worth a read, as it fully explains the internal Db2 data flow from SQL to RDS to DM to Media manager.
access content at IDUG.org (appropriate IDUG access required)

A08 was all about SWAT Tales and had some great interaction! The best bit was one attendee stated that his Db2 system is going through 93 x 768 GB log datasets in less than 6 hours… The important take-away was to keep up to date with PTFs, especially HIPERs (Here you can subscribe to my monthly APAR reports to aid in this), and make sure you have enough logs! Plus, take care with high performance DBAT usage. Finally: Watch out for over- and mis-use of PBG spaces!
access content at IDUG.org (appropriate IDUG access required)

A10 was all about migrating to UTS and to Db2 12 – Do not underestimate the time and effort required to do this!
access content at IDUG.org (appropriate IDUG access required)

A11 contained details about using the RTS to work as a “monitor” enabling you to get a different view of GETPAGES for example.
access content at IDUG.org (appropriate IDUG access required)

A12 tied in with the A01 presentation and was all about Dynamic SQL problems and solutions including a nice way to “purge” single SQLs from the DSC! Included some very interesting SQLs to calculate your DSC KPIs.
access content at IDUG.org (appropriate IDUG access required)

A13 was all about configuring Data Sharing as well as solving some common issues with it. It was a great intro to everything DS. Plus, it contained a list of new and improved things that came along in Db2 12 and 13.
access content at IDUG.org (appropriate IDUG access required)

A14 was all about inactive data impacting your performance. A very interesting topic that all sites probably have an issue with without really knowing it! Archive Enabled Tables could be very useful… Towards the end were a couple of nice features in Db2 with UNION ALL.
access content at IDUG.org (appropriate IDUG access required)

A15 was a journey through TRACES, SMF and IFCIDs. If you ever wanted to know about any of these things then here’s the best starting place!
access content at IDUG.org (appropriate IDUG access required)

A16 launched Python at the DBAs and sysprogs! Scary stuff! All about installing Python on z/OS and latest bug fixes etc etc. My favorite bit was the “disable auto commit” and “remember to commit before disconnect”!
access content at IDUG.org (appropriate IDUG access required)

B-Track

B01 was all about getting prepared for Db2 13. Starting with a review of all the FLs of Db2 12 right up to FL510 which is the major prereq for Db2 13, of course!
access content at IDUG.org (appropriate IDUG access required)

B02 contained a ton of details all about the “Black hole” of Db2 statistics – Page Latch Suspensions, plus a very handy list of how to fix these suspensions – if at all possible…
access content at IDUG.org (appropriate IDUG access required)

B03 took you into the first steps of the Machine Learning (ML) world. Started off with penquins and then I got sort of lost … 🙂
access content at IDUG.org (appropriate IDUG access required)

B04 was another migration session about getting from Db2 12 FL501 to Db2 13, this time incorporating Deprecated functions and Incompatible changes etc.
access content at IDUG.org (appropriate IDUG access required)

B05 gave us four different ways to migrate away from multi-table tablespaces to PBGs. From Unload/Drop/Create/Load, MOVE TABLE (Db2 12 FL508), create “%_new” tables => INSERT from original => rename original to “%_old” => rename “%_new” to original => drop “%_old”, and lastly, using a vendor tool to do the work for you!
access content at IDUG.org (appropriate IDUG access required)

B06 was all about the pre-migration query DSNTIJPE and what you do, or don’t do, with the resulting 23 odd reports.
access content at IDUG.org (appropriate IDUG access required)

B07 showed how MasterCard monitors any and all Db2 Alerts to take proactive actions before things go pear-shaped. This includes disk space, messages, sql codes, access path changes, memory, storage, DDF, physical media limits (size, extents, volumes etc.) There is a very handy full list of “things to monitor” at the end of the presentation as well. Check out the RTDX SAX tool timings!
access content at IDUG.org (appropriate IDUG access required)

B08 Explain explained – Complete introduction as to how the Db2 Optimizer makes its cost-based decision. At the end were a couple of nice “best practices” slides summing it all up very well.
access content at IDUG.org (appropriate IDUG access required)

B10 Db2 for z/OS housekeeping. This was all about a methodology for REORG/RUNSTATS/REBIND. The interesting take away here was the idea to *never* run a RUNSTATS based soley on RTS counters from the last RUNSTATS. In other words, just do a RUNSTATS when you are doing a REORG.
access content at IDUG.org (appropriate IDUG access required)

B11 was all about client configuration and was a cross-platform presentation (naturally!) It contained all you need to know about the setup and installation and use of the db2cli among many other things!
access content at IDUG.org (appropriate IDUG access required)

B12 had Tips for DBAs and programmers to help reduce costs – Always a good topic! In here was also a nice tip about keeping up to date with your COBOL compiler!
access content at IDUG.org (appropriate IDUG access required)

B13 got secure on us by using Multi-factor Authentication for Db2 z/OS. This included setting up MFA and examples of when it works or does not work.
access content at IDUG.org (appropriate IDUG access required)

B14 carried on the security theme by going into detail about how to protect yourself from Ransomware attacks. Here multi-layer protection is the best – MFA, Pervasive encryption, Separation of duties (SECADM usage…), Controlling access to Db2 datasets etc. etc.
access content at IDUG.org (appropriate IDUG access required)

B15 came back to more “normal” territory about stopping runaway applications by using the RLF tables DSNRLSTxx and/or DSNRLMTxx, including a nice selection of examples to give you a head start.
access content at IDUG.org (appropriate IDUG access required)

B16 presented a way to use MS Excel to help in analyzing performance data. A nice introduction into getting data down to the PC and then using advanced plug-ins like ToolPak.
access content at IDUG.org (appropriate IDUG access required)

E-Track

E01 was all about the Optimizer and its various access path and resultant performance. Tons of notes all about access paths make this well worth a read!
access content at IDUG.org (appropriate IDUG access required)

E02 was a recap of Continuous Delivery, going over the why’s and how’s including vendor responses, and then ran through all the FL levels that we have so far had.
access content at IDUG.org (appropriate IDUG access required)

E03 SQL Performance for application developers was an introduction, with examples, about what an application developer should know about SQL at a minimum!
access content at IDUG.org (appropriate IDUG access required)

E04 was one of my presentations all about esoteric Db2 functions – Db2 stuff that is rarely used or not well understood. Covering FIT/FTB, Spatial Indexes, REGEX, Clones and scrollable cursors. All good fun!
access content at IDUG.org (appropriate IDUG access required)

E05 was all about IBM Db2 Developer Extension and Db2 Administration Foundation – Obviously the live demos are missing but it gives you a good idea!
access content at IDUG.org (appropriate IDUG access required)

E06 Advanced Db2 Performance Tuning for Beginners – the title says it all. Six objectives done great by Joe – It covered both LUW and z/OS and contained a “Steps to Solve the Crime” section.
access content at IDUG.org (appropriate IDUG access required)

E07 was a run through of all good stuff we got in Db2 12 including comparisons between 11 and 12 and an introduction to RESTful calls.
access content at IDUG.org (appropriate IDUG access required)

E08 was a plea for testing. How to generate test data and how to actually test and measure. Included examples of PLSQL to generate test data, and proposes the mantra to Measure and Monitor what you are doing and what you have done.
access content at IDUG.org (appropriate IDUG access required)

E10 was another one of my presentations where I go into detail about all currently deprecated features of Db2 12 and 13. It gave pages of SQL that you can use to check your own Db2 subsystem, or you can download our freeware MHC2 Migration HealthCheck program that does it all for you. (This is continually updated whenever anything new is deprecated, by the way!)
access content at IDUG.org (appropriate IDUG access required)

E11 all about “Things your DBAs hear”. A very good, light-hearted look at the “normal craziness” of being a DBA these days!
access content at IDUG.org (appropriate IDUG access required)

E12 A DBA’s epic journey covered how to deal with SLOW SQL and then the taming of four common SQL “problem statements”.
access content at IDUG.org (appropriate IDUG access required)

E13 was a very apt Session code! All about the usage and requirement of RECOVER these days. It covered why you should be able to do it and preparing for it as it will be required at some time…
access content at IDUG.org (appropriate IDUG access required)

E14 was a modernization call for Db2 stored procedures and RESTful services. Examples were included as well as Hints & Tips especially around DSNULI, Parameters and File usage of existing stored procs.
access content at IDUG.org (appropriate IDUG access required)

E15 covered how to fall back from a schema change as quickly as possible! Use of high speed flash copies to a clone show you a way to handle this.
access content at IDUG.org (appropriate IDUG access required)

E16 An overview of a “true” HTAP system. This showed how using an accelerator processing the logs you can indeed get to the Holy Grail of Transactional and Analytical processing happening at the same time on the same data.
access content at IDUG.org (appropriate IDUG access required)

F-Track

F01 was all about JAVA performance – and we all need better JAVA performance these days! Kudos for the callouts on Spring Batch and Hibernate.
access content at IDUG.org (appropriate IDUG access required)

F04 Back to basics with Db2 Buffer Pools – Covered everything you would ever need to know about Db2 Buffer Pools! Set-up, Monitor, Configure and Tune.
access content at IDUG.org (appropriate IDUG access required)

F06 explained the use of Indexes, how they look internally and all about performance, including when to REORG them at the optimal moment.
access content at IDUG.org (appropriate IDUG access required)

F07 SQL went crazy using Pivot and Transpose, some for z/OS some for LUW – a real smorgasbord of SQLs!
access content at IDUG.org (appropriate IDUG access required)

F11 contained a ton of detail about connecting Clients to Servers which is not quite as straightforward as some people think…
access content at IDUG.org (appropriate IDUG access required)

F12 Ran through the Db2 Catalog and Directory as it was, as it is and how to migrate to Db2 13.
access content at IDUG.org (appropriate IDUG access required)

F13 covered how to use the TRACE facility of Db2 including all the information you could ever want to know about which Trace is which class is which IFCID…
access content at IDUG.org (appropriate IDUG access required)

F14 was all about the perennial problem of Db2 logging and Commit frequency including full information about what is logged, what is written in the BSDS and adding/removing Active Logs.
access content at IDUG.org (appropriate IDUG access required)

F16 was DSC (Dynamic Statement Cache) usage, how it actually works, how to improve it and a quick glimpse into using the IDAA (Accelerator).
access content at IDUG.org (appropriate IDUG access required)

G-Track

G02 discussed the requirement for a Next Generation DBA. Having fewer people with the skills drives the demand for AI to help out.
access content at IDUG.org (appropriate IDUG access required)

G03 was very interesting as it was all about setting up Encryption through the SECPORT which is becoming standard these days. Full of configuration Hints & Tips. Also contained a full example of running NETSTAT and loading the output up into a Db2 table every few minutes so you can analyze who is accessing using just the TCPPORT – Heaven!
access content at IDUG.org (appropriate IDUG access required)

G05 AI again but this time protecting your systems from bad DBAT problems.
access content at IDUG.org (appropriate IDUG access required)

G06 Running through old Db2 releases up to current with special regard to the problem of RECOVERY and availability as they have changed over the years.
access content at IDUG.org (appropriate IDUG access required)

G12 went into depth about cutting back-up costs by using a hybrid-cloud multi-temperature storage system. Using Db2 for z/OS Data Gate delivered through IBM Cloud Pak for Data enables all of this. The big idea here, was to take your rarely used archive data and move it into the cloud.
access content at IDUG.org (appropriate IDUG access required)

G13 brought up the use of Redirected Recovery to ease your fears of recovery. You can simply validate, with no system interruptions of any kind, whether or not you are indeed even recoverable and, most importantly, how long it really takes.
access content at IDUG.org (appropriate IDUG access required)

G14 went into the IBM Cloud Pak world again, this time with virtualization being the main theme. The fact that the data lake has “dried up” due to various problems (GDPR being amongst them!) leads to virtualization being the way forward. DaaS – Data as a Service.
access content at IDUG.org (appropriate IDUG access required)

G16 and finally… we get to the last one, and it is a *very* big one all about Db2 Security Best Practices from David Beulke. An absolute treasure trove of Do’s and Do Not’s all related to the world of Audit. Our WLX Audit also gets a shout out so well done for that!
access content at IDUG.org (appropriate IDUG access required)

Summary

All in all it was a vast amount of information to try and take-in. IDUGs are always places of learning and I always learn stuff – I am now really looking forward to the IDUG EMEA 2022 in Edinburgh coming up from October the 22nd through to the 26th.

I hope to see you there!

TTFN,

Roy Boxwell

2022-07 IBM problem data requests…

Most of us have been there … something somewhere goes wrong … things are checked, changes are undone, tests are re-run and in the end you have no idea why a failure happens.

Who You Gonna Call?

Yep, it is time to open a Case at IBM technical support … So you open a Case and you type in as much detail as possible about when and what happened but it is *never* enough! In the world of Db2, the first question that *always* comes back is “Please supply us with further information”, like:

  • SYSLOG
  • Master Log
  • MEPL
  • Detailed EREP
  • Complete SVC dump

WTF? (“What’s That For” before anyone complains)

SYSLOG

The syslog is the console of a z/OS system and any and all interesting, and sometimes not so interesting, messages from *all* running “things” are in here – it is normally enormous! The problem begins when IBM Technical Support asks “please provide us with the SYSLOG from 06:00 to 06:30 on the day of the event”.

SDSF

SDSF is your friend here and I really mean it! All you do is go to SDSF and then enter primary command LOG. From this panel you enter three primary commands, one after another, and you are done!

  • PT ODSN ‘your.dataset.name’ * NEW
  • PT 06.00.00 22/06/2022 06.30.00 22/06/2022
  • PT CLOSE

That is it! Your dataset will then just have the data from between those times. This is *extremely* handy! Note that the date format is locale-dependent and, as I am in Europe, we have DD/MM/YYYY. I am sure you know your own date format!

Master LOG

This is the first SDSF dataset in your ssidMSTR STC. So, once more in SDSF, using *MSTR as a prefix and then putting line command ? next to the sub-system in question shows you three DDNAMEs. The first one, JESMSGLG, is the one they normally need. Here you use line command XDC to get an SDSF Open Print Data Set window:

xxxxMSTR STC09394           SDSF Open Print Data Set                         
COMMAND INPUT ===>                                         SCROLL ===> CSR
                                                                             
                                                                             
Data set name  ===> 'xxxxxxx.SYSLOG.PRINT'                                   
Member to use  ===>                                                          
Disposition    ===> NEW        (OLD, NEW, SHR, MOD)                          
                                                                             
Management class     ===>           (Blank for default management class)     
Storage class        ===>           (Blank for default storage class)        
  Volume serial      ===>           (Blank for authorized default volume)    
  Device type        ===>           (Generic unit or device address)         
Data class           ===>           (Blank for default data class)           
  Space units        ===> CYLS      (BLKS, TRKS, CYLS, BY, KB, or MB)        
  Primary quantity   ===> 19        (In above units)                         
  Secondary quantity ===> 19        (In above units)                         
  Directory blocks   ===>           (Zero for sequential data set)           
  Record format      ===> FBA                                                
  Record length      ===> 121                                                
  Block size         ===>                                                    
Data set name type   ===>           (LIBRARY, blank, ... See Help for more)  
Extended attributes  ===>           (NO, OPT, or blank)                      

Here you can see I choose type FBA, LRECL 121 and a disposition of NEW for a new dataset. Hit ENTER and SDSF tells you how many lines it just wrote to that file:

PRINT CLOSED  23025 LINE

View the file and max down to the bottom:

023019 0------ JES2 JOB STATISTICS ------        
023020 -  17 MAY 2022 JOB EXECUTION DATE         
023021 -            2 CARDS READ                 
023022 -       28,616 SYSOUT PRINT RECORDS       
023023 -            0 SYSOUT PUNCH RECORDS       
023024 -        3,099 SYSOUT SPOOL KBYTES        
023025 -    50,488.70 MINUTES EXECUTION TIME     

So we know we are in the correct file! Here you can do some updating of “sensitive” data like IP address, User Name etc. Remember to just change the data, not blindly delete it! Naturally, you can delete stuff *after* the event of interest and probably a ton of stuff from *before* but be careful what you delete!

MEPL

Say what? MEPL is the Module Entry Point List and IBM need it to see which PTFs and APARs have been applied in the application address space and the Db2 system. To get a MEPL I use a normal Utility job jcl with DIAGNOSE and a DISPLAY MEPL like this:

//MEPL     EXEC PGM=DSNUTILB,REGION=32M,       
//         PARM=(ssss,'DIAGNOSEMEPL')          
//STEPLIB  DD DISP=SHR,DSN=DSNsss.SDSNEXIT.ssss
//         DD DISP=SHR,DSN=DSNsss.SDSNLOAD     
//CEEDUMP  DD SYSOUT=*                         
//SYSUDUMP DD SYSOUT=*                         
//SYSPRINT DD SYSOUT=*                         
//SYSIN    DD *                                
 DIAGNOSE                                      
    DISPLAY MEPL                               
 DIAGNOSE END                                  
/*                                             

This will output the MEPL to SYSPRINT which starts like this:

DSNU000I    173 12:57:08.82 DSNUGUTC - OUTPUT START FOR UTILITY, UTILID = DIAGNOSEMEPL                               
DSNU1044I   173 12:57:08.83 DSNUGTIS - PROCESSING SYSIN AS EBCDIC                                                      
DSNU050I    173 12:57:08.83 DSNUGUTC -  DIAGNOSE DISPLAY MEPL                                                          
DSNU861I    173 12:57:08.84 DSNUDIAG - DISPLAY MEPL FOR SUBSYSTEM xxxx                                                 
    0000 20B92820 C2C5D7D3 0140D4C5 D7D360D3  C9D2C540 C6D6D940 C4E2D5E4 E3C9D3C2    *....BEPL. MEPL-LIKE FOR DSNUTILB*
    0020 28100000 C4E2D5C1 C1404040 F0F761F1  F461F1F6 E4C9F3F9 F3F9F340 00000000    *....DSNAA   07/14/16UI39393 ....*
    0040 28100100 C4E2D5C1 D7D9C840 F1F261F2  F361F1F5 F1F34BF4 F6404040 00000000    *....DSNAPRH 12/23/1513.46   ....*
    0060 28100200 C4E2D5C6 D4D5C6D4 F1F061F1  F761F1F8 E4C9F5F8 F8F4F040 00000000    *....DSNFMNFM10/17/18UI58840 ....*
    0080 28100240 C4E2D5C6 D7D4E2C7 F1F061F1  F761F1F8 E4C9F5F8 F8F4F040 00000000    *... DSNFPMSG10/17/18UI58840 ....*
.
.
.

It is quite long! Here in my test system nearly 5000 lines are written to SYSPRINT. Then, like with the ssidMSTR, I use ? against the job and then XDC against the SYSPRINT DD card this time to create another file with type FBA and LRECL 133 to get your.mepl.list.

Detailed EREP

Now it gets interesting… The EREP (Environmental Record Editing and Printing Program) is the API to the system LOGREC dataset where all “events of interest” on a z/OS LPAR are recorded. It contains far less than the console log but is a treasure trove of data for the IBM Technical Support.

Here’s my job to simply do a Detailed EREP as per IBM standards:

//*------------------------------------------------------------------*/
//*  EREP: DETAILED REP PRINT                                        */
//*------------------------------------------------------------------*/
//EREP     EXEC PGM=IFCEREP1,PARM='CARD'                               
//SERLOG   DD DISP=SHR,DSN=xxxxxxxx.LOGREC                            
//DIRECTWK DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(50,50))         
//EREPPT   DD SYSOUT=*,DCB=BLKSIZE=133                                 
//TOURIST  DD SYSOUT=*,DCB=BLKSIZE=133                                 
//SYSIN    DD *                                                        
ACC=N                                                                  
HIST=N                                                                 
ZERO=N                                                                 
PRINT=PS                                                               
TYPE=S                                                                 
/*                                                                     
//* IF REQUIRED YOU CAN ADD DATE, TIME RANGES TO FILTER DOWN           
//* WITHIN THE SYSIN LIKE:                                             
//* DATE=(YYDDD,YYDDD)                                                 
//* TIME=(HHMM-HHMM)                                                   

Do not forget to give your LOGREC DSN for the SERLOG DD. Most of the time I get just a few rows of output and then create another file using XDC from the EREPPT DD name but this time with type FB and LRECL 133 to get the.erep.list

Complete SVC Dump

If your Db2 system receives a dump, for whatever reason, it normally writes out an SVC dump to a special dataset that can be used to analyze what went wrong. It is very important that the SVC dump is complete and *not* partial …

Default Size

The default size is only 500MB which is way too small for a halfway decent production Db2 sub-system these days. It must normally be increased to at least 16000MB. To change this you issue a console command like:

CHNGDUMP SET,SDUMP,MAXSPACE=16000M

But make sure you have enough local page datasets space to handle your normal load PLUS the size of this dump dataset…auxilliary swapping (paging) while dumping is a painfully slow experience you do not want to suffer!

If successful, the SVC dump will be COMPLETE and then you are nearly done …

File Transfer

Most of the files I have described so far are quite small but the SVC dump is a monster. You must TERSE it using JCL like:

//AMATERSE  EXEC PGM=AMATERSE,PARM='SPACK'
//SYSPRINT  DD SYSOUT=*                   
//SYSUT1    DD DISP=SHR,                  
//             DSN=xxxxxxxx.xxxxxxxx       
//SYSUT2    DD DISP=(,CATLG),UNIT=SYSDA,             
//             DSN=xxxxxxxx.xxxxxxxx.TRS,  
//             SPACE=(CYL,(99,99),RLSE)   

I use the SPACK parameter which is, according to the documentation, much better at compression than the PACK parameter. Fun factoid of the day: SPACK is the “complex” format whereas PACK is the “simple” format – Gotta love IBM for that! IBM do prefer the TERSE style of compression, and please do *not* change the file ending! Then doing a ZIP has no real bonus and just confuses the automatic systems at IBM. Leave “.TRS” at the end and they know it has been TERSED.

Then download the xxxxx.xxxxx.TRS file as BINARY to the PC and all the other files as TEXT to the PC. Then simply upload by drag-and-drop to your IBM Case and you are ready for the next question!

Have you Switched it Off and On again?

I wish I never hear this about a mainframe Db2 problem!

I hope this was of some interest, and if you have any other Tips & Tricks about getting “standard” data to IBM, I would love to hear from you!

TTFN

Roy Boxwell

2022-04 A brief history of the Universal Tablespace (UTS) Part Two

This month I wish to finish off, what I started last month, my musings about UTS over the releases and years.

Db2 12 Changes to the UTS Picture

For Partitioned By Range (PBR), a brand-new space was created called the UTS PBR Relative Page Number (RPN) which was, in my opinion, the best thing in Db2 12! Quite simply, it allows the dynamic ALTERing of the partition and all of the related partitioned indexes DSSIZE on-the-fly even when a LOAD is running! This was great! Any users out there who have had a nightmare LOAD? Yep, now, as long as you are actively monitoring your data and index partition sizes, you can issue these ALTERs automatically and never hit the buffers like you do today.

DSSIZE gets propagated through

To enable this feature DSSIZE was improved to be settable at the data partition level and also extended to partitioned indexes. The available values were changed to allow any integer from 1 GB to 1024 GB. This allows extreme flexibility for all sizing requirements. Note, however, that NPSIs are still stuck in the middle of nowhere with just PIECESIZE as a helping hand…

Everything groovy???

So, what was wrong with this picture? Well, the change from PBR to PBR RPN was, shall we say, a little bit painful. The RID size got extended and as the RID is stored in *every* header page it required a tablespace reorg with partition-based inline image copies. Now, as you can well imagine, most people’s PBRs are pretty big, and allocating 4096 Virtual Tapes in parallel was just not going to happen! After a while IBM enhanced REORG so that you could put multiple copies on one tape, sort of like STACK, but much better – and not just for Tape but also for DASD. This has really accelerated the acceptance and usage of PBR RPN.

The future is bright!

Check this Blog entry:

https://www.idug.org/blogs/emil-kotrc1/2021/03/12/why-universal-table-spaces-uts

It is revealed that in Apollo (Db2 for z/OS vNext – Which has now been released as Db2 13 for z/OS), the ability to migrate from a PBG to a PBR will become available instead of the UNLOAD, DROP, CREATE, LOAD method which is the only way up until Db2 12. This will be very handy as the PBR RPN is the best way to go forward with large tablespaces (>60 GB) as long as you have *some* sort of available partitioning scheme, of course!

No more worries??

What do you need to worry about now? Well, you remember that huge LOAD I mentioned earlier that comes at the partition level? You must simply monitor the sizes of your partitioned objects and add a few GBs of space when required, on the fly, with no outage.

How?

Well, we have a little product called SAX+ which does exactly that! It starts the required OPx IFCID traces and computes the values against given thresholds, then automatically issues the ALTERs giving you a seamless experience with PBR RPNs. The only “problem” left now is when you are approaching the 1024 GB absolute PBR RPN physical limit. SAX+ warns you about this as well. Then you will have to either schedule a REBALANCE or a new LIMITKEY definition and REORG to spread the load again. However, when you know this well in advance it is no longer a serious problem!

Not yet at PBR RPN? – No problem!

PBRs which are not yet RPNs are monitored to warn when a threshold of usage in a data or index partition is exceeded. This also gives you more than enough lead time to get the REORG ready to switch it to RPN or just resize/rebalance your partitions.

What about PBGs?

PBGs are also fully supported within SAX+ as it can automatically add partitions, if desired, which additionally avoids SQLCODE -904’s. Plus, SAX+ adjusts the way it works depending on the MAXPARTITIONS and the actual number of allocated partitions. For example, MAXPARTITIONS 10 with a 90% warning would alert you when the ninth partition is allocated and in use. When the tenth partition gets allocated the warning switches from an LDS warning (Running out of available partitions) to a “last partition” filling up warning similar to PBRs which are not yet RPNs. This obviously helps a lot when you have MAXPARTITIONS 1 which is the IBM recommendation these days.

Anything else?

Naturally, SAX+ also takes care of the other problems that can catch you unawares:

  • Running out of Linear Datasets (LOB, PBG, XML, non-UTS space and NPSI)
  • Running out of space
  • Running out of extents and also badly fragmented extents
  • Running out of volumes
  • Running out of physical range for SEQUENCES or IDENTITY columns
  • Running out of physical range for Numeric Primary key columns
  • Running out of DBATs
  • Running out of space in SMS Storage Groups

All of the entries in the above list are very annoying when they hit you with no warning, especially running out of Linear Datasets. Think NPSIs here! Every PIECESIZE is an LDS and so you can run out of these quicker than you think.

LOG problems?

One last thing that SAX+ can help you with is detecting Db2 Log problems together with stalled logs… Not something you normally think of with a space management tool! When Db2 starts running out of space in active logs it issues the DSNJ110E message and I know shops who have “missed” this alert. The problem was that the Db2 Log SMS Storage Group was getting full *and* the tape offload had stalled… As you can imagine this scenario is not pretty, so checking you Log storage groups is probably a good idea to guarantee you do not hit this problem!

That’s enough about UTS for this month. If IBM bring out another TLA for a TLA I will scream!

TTFN,

Roy Boxwell

2022-02 ZPARMs never stop changing!

This month I want to go through some of the absolutely most important ZPARMs that control how your Db2 systems behave in a very significant manner. All of the following ZPARMs have a performance impact of some sort and we are always trying to squeeze the last drop of performance out of our Db2 sub-systems, aren’t we?

Starting with the Easy Stuff…

CACHEDYN. YES/NO, default YES. Should always be set to YES unless you do not care about saving dynamic SQL performance. Back a few decades ago, the recommendation was to have this set to NO as default! Hard to believe that these days, where most shops have 80% – 90% dynamic SQL during the day!

Now we Get to the Numerics!

OUTBUFF. 400 – 400,000, default 4,000. This is *extremely* important and you really should set it to the highest possible value you can afford in real memory! As a minimum, it should be 102,400 KB (100MB). This is the buffer that Db2 uses to write log records before they are “really” written to disk. The larger the buffer, the greater the chance that by a ROLLBACK the data required is in the buffer and not on disk. This is a big win and the default of 4,000 KB is crazy low!

Skeletons in the Closet?

EDM_SKELETON_POOL. 5,120 – 4,194,304, default 51,200. This is one of my personal favorites (I wrote a newsletter solely on this a few years ago) The default is way to small these days. I personally recommend at least 150,000 KB and actually even more if you can back it with real memory. Just like OUTBUFF, pour your memory in here but keep an eye on paging! If Db2 starts to page you are in serious trouble! Raising this can really help with keeping your DSC in control.

DBDs are Getting Bigger…

EDMDBDC. 5,000 – 4,194,304, default 23,400. The DBD Cache is getting more and more important as, due to UTS usage, the size of DBDs is increasing all the time. The default just doesn’t cut the mustard anymore so jump up to 40,960 as soon as you can.

DSC is Always too Small!

EDMSTMTC. 5,000 – 4,194,304, default 113,386. The EDM Statement Cache (really the Dynamic Statement Cache) is where Db2 keeps a copy of the prepared statements that have been executed. So when the exact same SQL statement with the exact same set of flags and qualifiers is executed, Db2 can avoid the full prepare and just re-execute the statement. This is basically a no-brainer and should be set to at least 122,880 KB. Even up to 2TB is perfectly ok. Remember: A read from here is *much* faster than a full prepare, so you get a very quick ROI and great value for the memory invested! Keep raising the value until your flushing rates for DSC drop down to just 100’s per hour, if you can! Remember to cross check with the EDM_SKELETON_POOL ZPARM as well. It always takes two to Tango…

How Many SQLs?

MAXKEEPD. 0 – 204,800, default 5,000. The Max Kept Dyn Stmts parameter is how many prepared SQLs to keep past commit or rollback. It should be set to a minimum of 8,000 or so. Raising this might well cause a large memory demand in the ssidDBM1 address space so care must be taken.

RIDs Keep Getting Longer…

MAXRBLK. 0, 128 – 2,000,000, default 1,000,000. RID POOL SIZE is the maximum amount of memory to be available for RID Block entries. It should be at least 1,000,000 and, if you can, push it to the maximum of 2,000,000. Unless you want to switch off all RID Block access plans in which case you set it to zero – Obviously not really recommended!

Sorts Always Need More Space

MAXSORT_IN_MEMORY. 1000 to SRTPOOL. The maximum in-memory sort size is the largest available space to complete ORDER BY, GROUP BY or both SQL Clauses. Remember that this is per thread, so you must have enough memory for lots of these in parallel. The number should be between 1,000 and 2,000, but whatever value you choose, it must be less than or equal to the SRTPOOL size.

Sparse or Pair-wise Access?

MXDTCACH. 0 – 512, default 20. Max data caching is the maximum size of the sparse index or pair-wise join data cache in megabytes. If you do not use sparse index, pair-wise join, or you are not a data warehouse shop, then you can leave this at its default. Otherwise, set it to be 41 MB or higher. If it is a data warehouse subsystem, then you could set this as high as 512 MB. (This ZPARM replaced the short-lived SJMXPOOL, by the way.)

Sort Node Expansion

SRTPOOL. 240 – 128,000, default 10,000. SORT POOL SIZE is the available memory that is needed for the sort pool. The default is 10,000 KB and should really be set to 20,000 KB at least, if not more! IFCID 96 can really help you size this parameter. Remember that the number of sort nodes leapt up from 32,000 in Db2 11 to 512,000 nodes for non-parallelism sorts and 128,000 nodes for a sort within a parallel child task in Db2 12. This means raising this ZPARM can have an even greater positive effect than before.

Your “Top Ten List”

These ten ZPARMs really influence how your Db2 system works and so must always be changed with great care and attention to detail. Always do a before and after appraisal to see whether or not changing them helped or hindered your system!

And Finally…

I have updated our Pocket Tool, Performance Health Check, to check and report all these ZPARMs, as well as all the other checks like the 6 Byte RBA/LRSN or the Mapping table changes or the reason for REORG etc etc. Feel free to download and run it, as it is but a click away!

If you have any comments, or other ZPARMs you think are also important for performance, feel free to drop me a line!

TTFN,

Roy Boxwell

2022-01 Fazed by phases?

This month I wish to do a quick run through, and review, of the effects of REBIND with active packages and how phases, both in and out, have given us some interesting problems to deal with!

Phase-In

As I mentioned in an earlier blog, the phase-in/phase-out was a very good idea indeed! It finally meant that we could do REBINDs whenever we wanted and no-one had to shut down servers to simply action a REBIND. This was always especially galling when it was an “empty” package or just to get an FL upgrade.

Allowed?

Remember, Phase-in is only allowed with PLANMGMT(EXTENDED) style packages and if the package is not a generated package for a trigger or SQL routine, such as a procedure or user-defined function.

Problems ahead?

The problems began soon after it was introduced in June 2019 with PH09191 and FL505. Now, it looked really good, and indeed it was, but it had one little flaw… it still required a SIX lock to do the business.

Problem solved?

This was solved in January 2021 with APAR PH28693 which changed the lock from a SIX to a much better U lock. With this APAR all of our problems were fixed and life would be good!

Nope…

Sadly, we then found out that the inactive/phased-out packages were actually causing rapid space growth in the SYSPACKCOPY (Not in SYSPACKAGE!) Remember that the phased-out packages all get copied across with a new number to the “back-up” SYSPACKCOPY for use until the active thread finally disconnects.

The Problem Gets Worse

Now these backed-up, old packages are sitting around in SYSPACKCOPY waiting until the next REBIND comes along whereupon Db2 will attempt a FREE. Various customers noticed that, for their intensely used packages, this free never really happened and so they “hit the buffers” of 14 copy_ids for packages… very nasty!

Another APAR to the Rescue!

IBM then created a new APAR, PH33295, that enhanced the FREE PACKAGE command to have a new PLANMGMTSCOPE sub-clause PHASEOUT. All this does is delete all of those old, no longer used packages. However, I can testify that doing a:

FREE PACKAGE(*.*.(*)) PLANMGMTSCOPE(PHASEOUT)

in production is one of the scariest commands I have ever issued!!! Personally, I would have preferred a brand new command like FREE PHASEDOUT PACKAGES or so…

Invalid as Well

While you are ridding yourself of 100’s of packages you could also issue the PLANMGMTSCOPE(INVALID) command which removes the phased-out and also gets rid of the dead, invalid packages which cannot ever be used anyway.

Do not Forget to Tidy up!

Once all these commands have been done, a REORG of the tablespace DSNDB06.SYSTSPKC is very highly recommended!

How have you experienced phase-in and phase-out? Do you think it’s a great feature or not?

Whatever you think, I would love to hear from you!

TTFN,

Roy Boxwell

2021-11 When is an NPI not an NPI?

This month, I wish to discuss everything about indexes but focusing on Non-Partitioned Index’s (from now on NPIs). I must confess that the real name of an NPI is a Non-Partitioned Secondary Index but I really dislike that as a term…

In the Beginning was the Index

Well, actually, it was the data, but then we needed a quick way of accessing the data and so the index was born. In Db2, these are b-tree based indexes with a few system pages and then a hierarchy of non-leaf pages starting with the root page, which then branches down in levels until it finally hits the leaf page – and these contain the direct pointers to the data. Very fast and just a few getpage I/Os to read the non-leaf/leaf pages in.

Hope Springs Eternal

The hope is that all the system stuff, the root page, and quite a few other non-leaf pages will be hanging around in the bufferpool and so no real I/O is required, but you all know the real world? It never is like that!

Bigger and Bigger

Tables and tablespaces got bigger and bigger over time, so then the Partitioned Tablespace was born. At first it required a partitioning index (PI), but nowadays you should all be using table based partitioning.
Brief aside: Recently, there was a mega outage caused by a really nasty bug in Db2 that brought everything crashing down (PH41947). Root cause was the use of index-based partitioning – still! Do IBM test all these old things on all levels of code? Nope. Time to get my Migration HealthCheck freeware and migrate away from index-based partitioning as fast as you can!

A Star was Born

Well ok, not really a star but the PI started life a little bit dazed and confused, and the formula for calculating “how big can my PI can get before it goes *boom*” is also impressive! (Check out my Know Your Limits presentation for the formulae!)

And Then Came the NPI

Now this PI was naturally unique, but we all needed a way to check for other data and possibly non-unique column combinations. So the non-partitioned index (NPI) was born. This is an index over the *complete* partition range and was normally heavily used by application SQL.

Pain points began

First major problem was the REORG of a single partition. It did the Tablepart (TP) data and it did the Index Part (IP), but it also had to update all the pointers within the NPI. This used to be terribly slow. At one shop, when reorging 22 partitions of a 254 partition TS it took 26 hours at the TP level and only 18 hours at the complete TS level. This is when we added to our RealTime DBAExpert utility generation software the ability to “roll up” from a TP reorg to a TS reorg based on number of partitions being REORGed, percentage of partitions being REORGed, or whether or not at least one NPI existed.

A Piece of Cake?

The other major problem with NPIs is that they can have a PIECESIZE, which is how large a Linear Page Set (LPS) can get before a new one is created. You can have from 1 to 4096 pieces for a single NPI, and the calculation to work out how many pieces you are allowed to have is another brain-numbingly complex one – which is also in my limits presentation.

Some sample SQLs for you:

-- FIRST YOU MUST CALCULATE THE MAXIMUM NUMBER OF POSSIBLE PARTITIONS
SELECT INTEGER(MIN(4096 , POWER(2E00 , 32)
/ (
(64E00 * 1024E00 * 1024E00 * 1024E00) -- DSSIZE TS
/ (32 * 1024) -- TS PGSIZE
)
)) AS MAX_NBR_PARTS
FROM SYSIBM.SYSDUMMY1 ;
-- THEN YOU PLUG THAT NUMBER INTO HERE
SELECT MIN(
(64E00 * 1024E00 * 1024E00 * 1024E00) -- DSSIZE TS
, (POWER(2E00 , 32) / 2048 -- NBR PARTS
) * 4096 -- IX PGSIZE
) AS MAX_INDEX_SIZE
FROM SYSIBM.SYSDUMMY1 ;
-- FOR AN NPI YOU JUST NEED THE PIECESIZE AND THE INDEX PAGESIZE
SELECT INTEGER(MIN(4096 , POWER(2E00 , 32)
/ (
(8E00 * 1024E00 * 1024E00 * 1024E00) -- PIECESIZE
/ 4096 -- IX PGSIZE
)
)) AS MAX_PIECES
FROM SYSIBM.SYSDUMMY1 ;

Death by Index

Suffice it to say you can be killed by running out of space in your PI, running out of space in your NPI, or even running out of NPI LPS pieces…

A New Type of Index

A brand new index type, data-partitioned secondary index, DPSI was created in Db2 V8 so that you could access partitions without having to scan thousands of datasets. Naturally, DPSIs only work if the application SQL is cognizant of their definition and index. Failure to do so can give terrible performance. These are naturally also bound to their respective partitions. It was also at this time that NPIs got rebranded to be NPSIs…

Separation for Utilities is Good

The major utility benefit that partitioned TSs give, is the ability to run them in parallel, including RECOVER. It was a really great idea so that you would only REORG, COPY or RUNSTATS the partitions that required it, and all the partition level utilities could run in parallel.

Db2 9 was All Change

To make these REORGS faster, IBM changed the way REORG worked with partitioned objects in Db2 9. They did speed it up but it meant that you could no longer work in parallel! You could add ranges to the PARTs being reorged but still not in parallel. This caused much grief and gnashing of teeth amongst loads of customers. Thus, we developed the so-called NPI Job. Anything that has a dependency on an NPI could be squirreled away in the NPI job(s) to then be run sequentially by the job scheduler after all other jobs were finished. Nasty!

Faster, Better, Cheaper

Now, after over 35 years, the b-tree is showing its age. IBM came up with the FTB or FIT which, simply put, copies all the non-leaf pages into a private area of the bufferpool and then, using L2 cache-aware code, removes the need for I/Os, apart from the very last two, leaf and data page, and thus reduces I/O and CPU.

Limits Again

Of course this is not for *every* index. Only unique 64 bytes long or less with no IOE, TIMESTAMP with TIMEZONE, or versioning allowed. But hey, it was a start!

Duplicate Allowed

Now even duplicate indexes are there. Well, to start with, only 56 bytes or less but we are getting there! You will require PH30978 UI75643 and then set the new ZPARM FTB_NON_UNIQUE_INDEX to be YES.

Is an NPI Always an NPI?

Now to the core of the matter. As you should all be aware we can, talking about base tablespaces, only create Universal Table Spaces (UTS) in the three flavors we know and love – Partitioned by Growth, Partitioned by Range, and Partitioned by Range Relative Page Numbering.

And???

Well, imagine you have a TS with MAXPARTITIONS = 1 – by definition it cannot go to multiple partitions but all indexes created on it are also by definition NPIs. This is also sort of true for MAXPARTITIONS > 1 if the current number of defined and existing partitions (NUMPARTS) is equal to one.

“Fake” NPIs

These are therefore, in my humble opinion, fake NPIs – You could argue that the NUMPARTS = 1 is “risky”, as Db2 could add a new partition by the next INSERT/UPDATE, but I think the risk is low. For these cases I think the jobs should not go to special NPI handling.

What are your Db2 maintenance jobs doing with these fake NPIs?

I would be very interested to hear from you all about your thoughts and plans here!

TTFN

Roy Boxwell

2021-08 ICI – Db2 12 Update

This month, I would like to review the ICIs that we have had for a few releases plus those that have recently appeared, and then the trouble with twelve …

It began …

Db2 development realised that something had gone horribly wrong when a bunch of Db2 users suddenly found that the output from their queries was no longer what it should be … After a bit of digging, the CHAR format rewrite was found to be the root cause and the fix was hastily created – BIF_COMPATIBILITY ZPARM.

What’s in a name?

Well, then VARCHAR happened and along came a very unpleasent problem with JAVA timestamps and, as I have documented in earlier BLOGs, it all started getting silly with one ZPARM being used for multiple format problems.

Along came the ICI (Incompatible Change Indicator)

So in Db2 10 we got a new IFCID, the 366, which was spat out at *every* prepare (bind) of any SQL that, possibly, contained an ICI. Now we started off pretty small with just three ICI’s: the first two being the reformatted output of CHAR and VARCHAR and the third being the TIMESTAMP format problem.

Db2 10 updates for Db2 11

Here they brought out numbers four to nine to handle all the little changes in Db2 11 so that you got the alert in Db2 10 before it bit you in Db2 11 – all well and good.

Db2 11 updates

The big change, was the brand new IFCID 376 – which is the evil twin of the 366. The only difference being that Db2 cached the entries, so you basically got a rolled up 366 – apart from one tiny little detail. The Execution count was missing. For the 366 it is 1:1, but for the 376 it is 1:nnnnn which could be any positive integer. They then added the 11nn range, going all the way up to 1111, and then they brought in 1112 for empty XML tags. Now all of these have been discussed in my earlier blogs.

What’s new in the ICI World?

Db2 12 of course! They brought out 1201 very early on due to POWER causing a problem. The output on overflow changed from a negative to a positive SQLCODE, which can of course cause “problems”… Why did this change even happen? IBM rewrote the code from using LE 32 bit assembler math calls to using C, and so the function “knew” if it overflowed and could return a warning saying so, whereas the 31 bit assembler just died a death and you got a negative SQLCODE.

Naming Convention?

Then it went quiet for a while until something weird happened: 1215031 and 1215032 appeared. Now, at first, I liked the idea of putting the FL into the ICI, but then I realised it was actually pretty pointless and just made it more confusing !

1215031 is issued when you could qualify a row with NULL in the DATA CHANGE OPERATION column using the FOR SYSTEM_TIME FROM/BETWEEN predicate on a system period temporal table with AUDITING.

1215032 is issued when you attempt to call stored procedure SYSPROC.SET_MAINT_MODE_RECORD_NO_TEMPORALHISTORY as this is no longer supported for data replication calls.

1204 (note the FL has gone…) is issued when you use CURRENT_SERVER or CURRENT_TIMEZONE as a column or variable name.

In the Docu it says 1202, but you actually get a 1215031 – and there’s no mention of 1215032 and 1204!

More new ones

Meanwhile, not (nothing?) to do with Db2 12, IBM also brought out

11 for using SELECT INTO syntax with a UNION

[12 was thankfully skipped!]

13 for INSERT/UPDATE/DELETE using an attribute WITH UR

Both of these were more parsing bugs than anything evil, but both require code changes if they appear!

Utility Time

Now IBM have enhanced the LOAD utility with LOAD FORMAT DELIMITED for correct packed numeric data support when one or more virtual decimal digits exist. This has caused another problem, as the data loaded could be viewed (as it was by one of our customers), as inconsistent. IBM then created another APAR to roll back the change and introduce a new ZPARM LOAD_DEL_IMPLICIT_SCALE to control how these numbers should be loaded. Default is NO, like it used to be, with an implied decimal point at the far right of the data. At the same time, it will now alert users that they could have an incompatibility with the LOAD by changing IFCID 25 to set a new bit. This warns you that you have done a LOAD into a table where there is packed decimal data with one or more digits after the virtual decimal point. If YES, then load interprets the Scale setting in the LOAD statement. For details please see APARs PH28104 and PH36908.

Pain Point for you?

The above mentioned new notification is a bit strange (pardon the pun), as there is already the IFCID 376 for incompatibilities. Now you must also start the IFCID 25 and go checking bits. So my question to you all is: Have you got this problem and, if so, do you think it is worth it to integrate IFCID 25 bit checking into SOFTWARE ENGINEERINGs/SEGUSs current ICI/BIF Use Case in our WorkLoad Expert and/or our BIF/ICI Freeware software?

The future is bright

As far as ICIs are concerned they just keep on rolling!

As always if you have any comments, especially with regard to IFCID 25, please feel free to e-mail!

TTFN

Roy Boxwell

2021-07 IDUG NA 2021 Virtual – A Roy round-up

This month I would like to “roll through” the recently held IDUG NA Virtual 2021 – Another virtual conference, but still full of tons of good stuff! Read on dear reader to see what piqued my interest!

Do you like being Profiled?

Profiling is getting much better (not easier to use mind you!) The best bit was “Remove ‘ping’ and validation queries” and the two easiest ways to do this are either to enable Sysplex WLB, or to use the isVALID() JAVA method. The typical SELECT * FROM SYSIBM.SYSDUMMY1 is a real pain …

PBG Sad future…

The idea of PBGs has, from my POV, changed quite dramatically. From the “Set it to 100 MAXPARTITIONS and forget it” to “Set MAXPARTITIONS to 1 and DSSIZE to 64GB”. I still think this is more of a design flaw than a good idea. If the spacemap search is such a performance killer – fix it …

Performance tidbits

Just using the “new” feature PCTFREE xx FOR UPDATE yy or the PCTFREE_UPD Zparm is a bit of a no-brainer if you have variable length or compressed records and lots of updates that then cause an Overflow record, (better called an indirect reference) – these bloat your data and cause more getpages over time. A sort of creeping death! The only cure is periodic REORGs to remove all of these indirect references, but it is better to delay this as long as possible – hence the FOR UPDATE percentage is a very nice feature!

From Db2 11, you could make your DGTTs NOT LOGGED which can give dramatic performance boosts! Naturally, it is a code change in the create DDL. You must add NOT LOGGED to the back end of the DECLARE statement but there have been really amazing savings like 60% elapsed time reduction and greater than 20% cpu savings! Of course, your mileage may vary. Always test these things before blindly changing all of your DECLARE statements!

zHyperWrite: If you are running with DS8000 and Metro Mirror (PPRC) then this really is a great idea! Update the IECIOSxx HYPERWRITE=YES, issue the SETIO HYPERWRITE=YES z/OS command and set ZPARM REMOTE_COPY_SW_ACCEL to ENABLE to get large reductions in active log writing times! Note that HyperSwap must be enabled as well.

zHyperLink (Baby brother of zHyperWrite) is also fantastic if you have intensive reads (especially sync reads) and lots of active log writing. (Who does *not* have this combo???) Set the zHyperLink to ENABLE for both read and write, DATABASE for read only or ACTIVELOG for write only. Downside is, you may see cpu increase slightly but the payback of elapsed reduction (over 20%) is well worth it.

Too many statistics can be bad for you

Also heard a few times was the statement “Never collect STATISTICS as part of REORG INDEX” – I have firmly believed in this for years and am very happy that it is now the modern mantra!

Run RUNSTATS rarely, if at all! How often does the statistical data actually change? During a REORG – Always! But “on its own” ? Why? Creation/Change of index – Naturally. Mass delete and LOAD with brand new data – Clearly! Insert another million rows? Really needed?? Go logarithmic like our RTDX RunstatsAvoidance feature.

Utility updates

High speed REORG: SORTDATA NO RECLUSTER NO – Just for actioning DDL as fast as possible. Downside is the CLUSTERRATIOF might be killed and, even worse in my humble opinion, all of the RTS counters are set to zero! Of course, if you are clever, you could simply back up the relevant RTS rows before the REORG and replace them afterwards…

REORG got a parallelism boost when building PIs (PH28092 & PH33859), naturally only when no NPIs exist. If running on the z15, think about setting ZPARM UTILS_USE_ZSORT (PH28183) to improve performance.

LOAD got a bunch of improvements (PI96136, PH11255, PH19073, PH23105) and the capability to override row change timestamps with PH25572. It also got the ability to skip the MAXASSIGNEDVALUE update in Db2 12 with PH28476.

RUNSTATS got the new STATCLGSRT ZPARM (PI74408 and PH03678) which then added the STATCLGMEMSRT parameter. RUNSTATS also got an update to avoid “overdoing things”:

  • If BOTH or HISTOGRAM then the COUNT is limited to be 100
  • If sampling value is less than 50% and FREQVAL COUNT BOTH or LEAST is specified then change it to be MOST

These avoid flooding the SYSCOLDIST with basically useless data that just slows down PREPARE and BIND/REBIND as well as possibly causing the optimizer to make a bad judgment.

TEMPLATE support also finally got Large Block Interface! New BLKSZLIM parameter in PH30093.

Compressed?

Generally speaking, random table access will show better performance than sequential table access for compressed data, as the row is only decompressed if required.

Generally speaking, sequential index processing (index scans) works much better than random index access for compressed indexes. The opposite of tables! Why? The entire index page must be read in and decompressed in one call for indexes.

So, once again, know your data, know your access patterns!

New Compressed?

Huffman Object Compression: Came along in FL509 and is well worth a look, as it can give very good results but, as always, test and check before blindly setting all compression to Huffman!

Index room for growth?

On index design the Holy Grail is to set the PCTFREE equal to the number of new index entries between reorg runs divided by the totalentries. FREEPAGE should be used to avoid index page splits. It may also help to increase the index page size and go to index compression.

Last mix

DataSharing OA59122 which went PE and the OA60394 that fixed it, both correct a problem of “contention management” that can be CPU expensive when contention has gone – but what happens next?

Machine Learning – Unsupervised (sounds like me most of the time) … especially “data munging” …

And finally…

A whole ton of really interesting stuff all about deprecated items… oh wait a second, that was my presentation! Anyway, good for a read and learn all about the nasty things lurking in the catalog! Do not forget we have freeware that shows it all to you! MigrationHealthCheck (MHC) is the name!

As always it is a pleasure to hear any comments or criticisms!

TTFN,

Roy Boxwell

2021-06 How large is my system?

After I held my Deprecated Items IDUG talk at the Virtual 2021 North American IDUG one of the attendees asked me if I also had some SQL for quickly retrieving the current “size of the database” – Normally I use our very own SpaceManager product for this as it uses CSI access calls to get up to the second info about the size of the VSAM LDS, however for a quick and dirty query the Db2 catalog can be partially trusted!

Where to begin?

The easiest places to start are the two SYSIBM.SYSxxxxxPART tables. These contain all the partition details and if the PARTITION is zero then it is a non-partitioned tablespace or it is a NPSI defined index.

What to select?

SELECT COUNT(*)                          AS INDEX_PARTS
      ,SUM(IP.SPACEF)                    AS KB
      ,INTEGER(SUM(IP.SPACEF) / 1024)    AS MB
      ,INTEGER(SUM(IP.SPACEF) / 1048576) AS GB
FROM SYSIBM.SYSINDEXES   IX
    ,SYSIBM.SYSINDEXPART IP
WHERE     IX.NAME    = IP.IXNAME
  AND     IX.CREATOR = IP.IXCREATOR
  AND NOT IP.SPACE   < 0
;

This query just shows you the current size of all of your indexes. Notice the NOT IP.SPACE < 0 – This is to not count any DEFINE NO indexes as these use no space of course! I simply added the two INTEGER functions – naturally you can change these to any format you would like.

Table time!

SELECT COUNT(*)                          AS TABLE_PARTS
      ,SUM(TP.SPACEF)                    AS KB
      ,INTEGER(SUM(TP.SPACEF) / 1024)    AS MB
      ,INTEGER(SUM(TP.SPACEF) / 1048576) AS GB
FROM SYSIBM.SYSTABLESPACE TS
    ,SYSIBM.SYSTABLEPART TP
WHERE     TS.NAME   = TP.TSNAME
  AND     TS.DBNAME = TP.DBNAME
  AND NOT TP.SPACE  < 0
;

This does the same but now for tableparts of course!

More detail?

Now these two SQLs are OK for quick results but it is probably more interesting to see which index types and which tablespace types are taking up how much space on your disks. So here the expanded queries follow:

SELECT COUNT(*)                          AS INDEX_PARTS
      ,IX.INDEXTYPE
      ,IX.IX_EXTENSION_TYPE
      ,IX.PAGENUM
      ,SUM(IP.SPACEF)                    AS KB
      ,INTEGER(SUM(IP.SPACEF) / 1024)    AS MB
      ,INTEGER(SUM(IP.SPACEF) / 1048576) AS GB
FROM SYSIBM.SYSINDEXES   IX
    ,SYSIBM.SYSINDEXPART IP
WHERE      IX.NAME    = IP.IXNAME
   AND     IX.CREATOR = IP.IXCREATOR
   AND NOT IP.SPACE   < 0
GROUP BY IX.INDEXTYPE
        ,IX.IX_EXTENSION_TYPE
        ,IX.PAGENUM
ORDER BY IX.INDEXTYPE
        ,IX.IX_EXTENSION_TYPE
        ,IX.PAGENUM
;

The INDEXTYPE column can be Blank (For ancient Type 1 indexes which I hope no-one has), 2 for “normal” Type 2 Indexes, D for Data-partitioned Secondary Indexes (DPSIs) and P for Indexes which are both Partitioned and Partitioning (which I also hope no-one has anymore!)

What’s in a TYPE?

The IX_EXTENSION_TYPE column can be Blank for “normal” indexes, N for Node ID indexes on XML spaces, S for Scalar Expressions, T for Spatial or V for real XML Indexes.

Using new features yet?

PAGENUM is just telling you how many of your indexes are enjoying the great feature of fully variable DSSIZE, adjustable on-the-fly with no outage (As long as you adjust upwards of course!) If set to “A” for Absolute then sadly not, much better is “R” for Relative!

SELECT COUNT(*)                          AS TABLE_PARTS
      ,TS.TYPE
      ,TS.PAGENUM
      ,SUM(TP.SPACEF)                    AS KB
      ,INTEGER(SUM(TP.SPACEF) / 1024)    AS MB
      ,INTEGER(SUM(TP.SPACEF) / 1048576) AS GB
FROM SYSIBM.SYSTABLESPACE TS
    ,SYSIBM.SYSTABLEPART  TP
WHERE     TS.NAME   = TP.TSNAME
  AND     TS.DBNAME = TP.DBNAME
  AND NOT TP.SPACE  < 0
GROUP BY TS.TYPE
     ,TS.PAGENUM
ORDER BY TS.TYPE
     ,TS.PAGENUM
;

What’s in a tablespace TYPE?

Now at the tablespace level we have, again, the TYPE column which can be Blank for a “normal” table, G for UTS PBG, L for Large, O for LOB, P for XML or R for UTS PBR. Going forward you will have to migrate any of the Blank tablespaces to PBG/PBR and L should be gone anyway!

My Output

When I run these here in Düsseldorf in my little Db2 12 FL508 test system I get this output:

---------+---------+---------+---------+---------+---------+--
INDEX_PARTS                       KB           MB           GB
---------+---------+---------+---------+---------+---------+--
       4453  +0.5951116000000000E+07         5811            5
---------+---------+---------+---------+---------+---------+--
TABLE_PARTS                       KB           MB           GB
---------+---------+---------+---------+---------+---------+--
       3733  +0.3255953100000000E+08        31796           31
---------+---------+---------+---------+---------+---------+-
INDEX_PARTS INDEX IX_EXT PAGE                     KB    MB GB     
            TYPE  TYPE   NUM
---------+---------+---------+---------+---------+---------+-
       3280 2            A   +0.5100220000000000E+07  4980  4
         25 2     N      A   +0.1800000000000000E+05    17  0
          3 2     S      A   +0.1224000000000000E+05    11  0
          8 2     V      A   +0.3744000000000000E+04     3  0
         15 D            A   +0.8112000000000000E+04     7  0
         24 P            A   +0.1392000000000000E+05    13  0
       1098 P            R   +0.7948800000000000E+06   776  0
---------+---------+---------+---------+---------+--------
TABLE_PARTS TYPE PAGENUM                     KB     MB  GB
---------+---------+---------+---------+---------+--------
        363      A      +0.5242134000000000E+07   5119   4 
       2094 G    A      +0.2109695400000000E+08  20602  20
          4 L    A      +0.1312000000000000E+04      1   0
        116 O    A      +0.1186235000000000E+07   1158   1
         25 P    A      +0.2952000000000000E+05     28   0
         33 R    A      +0.6932800000000000E+05     67   0
       1098 R    R      +0.4934048000000000E+07   4818   4

So you can see I already have over 1000 UTS PBR RPN table parts in use! Not bad! As a vendor I cannot just migrate everything of course so those 24 Index Controlled Partitioning indexes and the 33 Absolute Partitions aren’t going anywhere soon!

How does your site look? How are your plans for migrating to UTS PBR RPN going?

As always, I would love to hear the results!

TTFN,

Roy Boxwell