DB2 log saturation during IBM Connections database transfer

April 8, 2016, 8:21 am

≫ Next: IBM Sametime Video Manager start up scripts

≪ Previous: IBM Sametime unsigned WebPlayer plugin in Firefox

During a 4.5 –> 5.5 migration I got the following when running the transfer scripts for METRICS and PEOPLEDB.

[02/03/16 16:33:26.659 CET] com.ibm.db2.jcc.am.SqlTransactionRollbackException: Error for batch element #1: DB2 SQL Error: SQLCODE=-1476, SQLSTATE=40506, SQLERRMC=-964, DRIVER=3.69.49
[02/03/16 16:33:26.659 CET] com.ibm.db2.jcc.am.SqlException: [jcc][103][10843][3.69.49] Non-recoverable chain-breaking exception occurred during batch processing. The batch is terminated non-atomically. ERRORCODE=-4225, SQLSTATE=null
[02/03/16 16:33:26.659 CET] error.executing.transfer
err.dbtransfer.exception.labelclass com.ibm.db2.jcc.am.BatchUpdateException: [jcc][t4][102][10040][3.69.49] Batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch.
Use getNextException() to retrieve the exceptions for specific batched elements. ERRORCODE=-4229, SQLSTATE=null
com.ibm.db2.jcc.am.BatchUpdateException: [jcc][t4][102][10040][3.69.49] Batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch.
Use getNextException() to retrieve the exceptions for specific batched elements. ERRORCODE=-4229, SQLSTATE=null

Looking in the db2diag.log I saw the following

2016-02-03-18.49.00.983000+060 E44991171F646        LEVEL: Error
PID     : 2348                 TID : 1580           PROC : db2syscs.exe
INSTANCE: DB2                  NODE : 000           DB   : METRICS
APPHDL : 0-809                APPID: 15.91.29.211.49843.160203173533
AUTHID : DB2ADMIN             HOSTNAME:
EDUID   : 1580                 EDUNAME: db2agent (METRICS) 0
FUNCTION: DB2 UDB, data protection services, sqlpgResSpace, probe:2860
MESSAGE : ADM1823E The active log is full and is held by application handle
          “0-809”. Terminate this application by COMMIT, ROLLBACK or FORCE
          APPLICATION.

2016-02-03-18.49.00.983000+060 E44991819F610        LEVEL: Error
PID     : 2348                 TID : 1580           PROC : db2syscs.exe
INSTANCE: DB2                  NODE : 000           DB   : METRICS
APPHDL : 0-809                APPID: 15.91.29.211.49843.160203173533
AUTHID : DB2ADMIN             HOSTNAME:
EDUID   : 1580                 EDUNAME: db2agent (METRICS) 0
FUNCTION: DB2 UDB, data protection services, sqlpgResSpace, probe:6666
MESSAGE : ZRC=0x85100009=-2062548983=SQLP_NOSPACE
          “Log File has reached its saturation point”
          DIA8309C Log file was full.

It means that the DB2 transaction log has become full which you can get information of from the following URLs

http://www-01.ibm.com/support/docview.wss?uid=swg21623212

http://www-01.ibm.com/support/docview.wss?uid=swg21617184

http://bpmadmin.blogspot.com/2014/04/db2-sql-error-sqlcode-1476.html

To get the data transferred I used the following values (the values you need may differ) and commands

db2 update db cfg for metrics using LOGFILSIZ 10000
db2 update db cfg for metrics using LOGPRIMARY 80
db2 update db cfg for metrics using LOGSECOND 40

db2stop
db2start

db2 get db cfg for metrics
Log file size (4KB)                         (LOGFILSIZ) = 10000
Number of primary log files                (LOGPRIMARY) = 80
Number of secondary log files               (LOGSECOND) = 40

I was then able to run the transfer for both these databases.

You may want to change the values back to the default values as it will have an impact on disk space and possibly performance.

↧

IBM Sametime Video Manager start up scripts

April 22, 2016, 1:56 am

≫ Next: CCM file downloads via IHS randomly does not work

≪ Previous: DB2 log saturation during IBM Connections database transfer

I managed to get my hands on a restart script from IBM PMR L3 to start up SolidD and the Video Manager at OS start up and thought that I should share it since it can be a little daunting trying to put together a script on an OS that for some may be quite new to them.

The Video Manager uses SolidDB which needs to be be started first before WAS starts. This involves creating start up scripts, registering them with chkconfig and then changing the start up order.

These scripts are designed for Linux so RHEL (or CentOS). I don’t believe they will work for SUSE Linux Enterprise Server (SLES).

The script for WAS will allow you to stop the application server but it will not allow me to stop SolidDB that needs to be done manually. I’m sure it can be tweaked to work but these are for OS start up and they work for that use case.

standalone_eval_server_start_init.sh

# vi /opt/solidDB/soliddb-7.0/standalone_eval_server_start_init.sh

###################

#!/bin/sh
# *********************************************************************************************************
# ** Description : Shell script to start solidDB evaluation process after machine reboot
# ** Launches solidDB server process with default network listen name: tcp 2315
# ** creates error file boot_error.log in the /opt/solidDB/soliddb-7.0 in case of error
# ** Assumption : 1. Directory /opt/solidDB/soliddb-7.0/eval_kit/standalone is present
# ** : 2. In Directory /opt/solidDB/soliddb-7.0/eval_kit/standalone ,solid.db file is present
# **********************************************************************************************************
SOLID_DIR=/opt/solidDB/soliddb-7.0
today=`date +”%m-%d-%y”`
boot_error_file=$SOLID_DIR/boot_error.log
err_msg_no_dbfile_exist=”No database files solid.db exists in eval_kit/standalone exists , could not start solid db.”
err_msg_dir_path=”Directory structure is not correct . Please check if eval_kit/standalone are present. could not start solid db.”

# Check if the script is started in the right place
if [ -d $SOLID_DIR/eval_kit/standalone ]; then
# locate the executables directory
cd $SOLID_DIR/bin
binpath=`pwd`
cd ..
rootbytes=`pwd | wc -c`
bindir=`echo $binpath | cut -c $rootbytes- | cut -c 2-`

# check if the database exists already
if [ -f $SOLID_DIR/eval_kit/standalone/solid.db ]; then
$bindir/solid -c eval_kit/standalone &

else # default database file did not exist
echo “$today : $err_msg_no_dbfile_exist” >> “$boot_error_file”
exit 1
fi
else # directory structure is not correct
echo “$today : $err_msg_dir_path” >> “$boot_error_file”
exit 1
fi

# End of script.

###################

# chmod +x /opt/solidDB/soliddb-7.0/standalone_eval_server_start_init.sh

SolidDB.init

# vi /etc/init.d/SolidDB.init

###################

#!/bin/sh
#

# IBM Confidential OCO Source Material

# The next lines are for chkconfig on RedHat systems.
# chkconfig: 2345 97 03
# description: Starts and stops Solid db instance \
# instances.
# The next lines are for chkconfig on RHEL systems.
### BEGIN INIT INFO
# Provides: standalone_eval_server_start_init.sh
# Required-Start:
# Required-Stop: $STMediaServer_was.init
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Starts and stops Solid db instance
### END INIT INFO

# START BLOCK
SOLID_DIR=”/opt/solidDB/soliddb-7.0″
solid_init=”standalone_eval_server_start_init.sh”
solid_stop=”standalone_eval_server_stop”
log_file=”/opt/solidDB/soliddb-7.0/boot_log”
today=`date +%Y_%m_%d`
# END BLOCK

RETVAL=0

start_solid()
{
echo “$today” >> $log_file
startCmd=”${SOLID_DIR}/${solid_init}”
if [ -f “${startCmd}” -a -x “${startCmd}” ] ; then
echo “Starting Solid db instance …” >> $log_file
“${startCmd}”
else
echo “Failure starting Solid db instance…” >> $log_file
echo “The service definition may be invalid – script ${startCmd}” >> $log_file
echo “could not be found or was not executable.” >> $log_file
fi
}

stop_solid()
{
echo “$today” >> $log_file
stopCmd=”${SOLID_DIR}/${solid_stop}”
if [ -f “${stopCmd}” -a -x “${stopCmd}” ] ; then
echo “Stopping Solid db instance …” >> $log_file
“${stopCmd}”
else
echo “Failure starting Solid db instance…” >> $log_file
echo “The service definition may be invalid – script ${startCmd}” >> $log_file
echo “could not be found or was not executable.” >> $log_file
fi
}

case “$1” in
start)
shift
start_solid
;;

stop)
shift
stop_solid
;;

restart)
stop_solid
start_solid
;;

*)
echo “Usage: $0 {start|stop|restart}”
exit 1
;;
esac

if [ $RETVAL -ne 0 ]; then
echo exit code: $RETVAL >> $log_file
fi

exit $RETVAL

###################

# chmod 755 /etc/init.d/SolidDB.init
# chkconfig –add SolidDB.init
# chkconfig –level 35 SolidDB.init on

# chkconfig –list | grep -i solid
SolidDB.init 0:off 1:off 2:off 3:on 4:off 5:on 6:off

Video Manager

Change WAS_HOME to match your server.

# vi /etc/init.d/VMgr

###################

#!/bin/bash
#
# apache
#
# chkconfig: 5 90 10
# description: Start up the WebSphere Application Server.
RETVAL=$?
WAS_HOME=”/opt/IBM/WebSphere/AppServer/profiles/HOSTSTMSPNProfile1″
# added line to ensure that environment variables are set correctly
. /etc/profile
case “$1″ in
start)
if [ -f $WAS_HOME/bin/startServer.sh ]; then
echo $”Starting IBM WebSphere STMediaServer”
$WAS_HOME/bin/startServer.sh STMediaServer
fi
;;
stop)
if [ -f $WAS_HOME/bin/stopServer.sh ]; then
echo $”Stop IBM WebSphere STMediaServer”
$WAS_HOME/bin/stopServer.sh STMediaServer -username wasadmin -password *************
fi
;;
status)
if [ -f $WAS_HOME/bin/serverStatus.sh ]; then
echo $”Show status of IBM WebSphere STMediaServer”
$WAS_HOME/bin/serverStatus.sh -all -username wasadmin -password ********
fi
;;
*)
echo $”Usage: $0 {start|stop|status}”
exit 1
;;
esac
exit $RETVAL

###################

# chmod 755 /etc/init.d/VMgr
# chkconfig –add VMgr
# chkconfig –level 35 VMgr on

Start up order

The numbers shown after the slash indicate the start up order. The nearer to zero the sooner it starts up. In the following examples S90VMgr starts up before S97SolidDB.init which is not what is wanted. We want SolidDB to start first so by renaming the files we can manipulate the start up order.

# cd /etc/rc.d
# find . -iname “*solid*”
./rc1.d/K03SolidDB.init
./init.d/SolidDB.init
./rc0.d/K03SolidDB.init
./rc4.d/K03SolidDB.init
./rc6.d/K03SolidDB.init
./rc5.d/S97SolidDB.init
./rc3.d/S97SolidDB.init
./rc2.d/K03SolidDB.init

# find . -iname “*VMgr*”
./rc0.d/K10VMgr
./rc2.d/K10VMgr
./rc6.d/K10VMgr
./rc5.d/S90VMgr
./rc1.d/K10VMgr
./rc3.d/S90VMgr
./init.d/VMgr
./rc4.d/K10VMgr

Change start up order

These steps change the start up order so that SolidDB starts before WAS.

# cd /etc/rc.d/rc3.d/
# mv ./S97SolidDB.init ./S90SolidDB.init
# mv ./S90VMgr ./S97VMgr

# cd /etc/rc.d/rc5.d/
# mv ./S97SolidDB.init ./S90SolidDB.init
# mv ./S90VMgr ./S97VMgr

↧

CCM file downloads via IHS randomly does not work

April 29, 2016, 8:34 am

≫ Next: IBM Connections Metrics graphs stop working after 365 days with CAM-CRP-1098 error about CSK

≪ Previous: IBM Sametime Video Manager start up scripts

I wrote an entry previously about how to configure this and nuances around the syntax but this post is to put out there some very odd behaviour.

I have a working 5.5 environment migrated from 4.5 on Windows 2012 R2. Downloads via IHS work for files over 1MB. On restart of the CCM application server some times downloading via IHS does not work.

When the CCM application servers starts up it write out something like the following. This tells you everything is OK and you can expect CCM files of over 1MB to download OK.

[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC enabled
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cacheId = fncs-cdhc-*******-9445
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_rootPath = D:/IBM/Connections/data/shared/ccmcache
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_maxCachedFiles = 10000
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_maxCacheSize_kb = 10485760
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_maxCleanupTime_ms = 5000
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_cleanupInterval_s = 3600
[4/28/16 16:10:07:655 CEST] 00000147 LCDAUtil      I com.ibm.content.fn.app.struts.LCDAUtil getLCDA CDHC cdhc_minContentLifetime_s = 3600

On a couple of occasions the application server has been restarted and the above text has not been written to the SystemOut.log. I get the usual “open for e-business” but unless you look for the above lines you won’t know there’s a problem until you get on to the system and more importantly your users try to down load files.

If it doesn’t work you will see something like

[4/27/16 19:09:58:679 CEST] 0000019b LCDAUtil E com.ibm.content.fn.app.struts.LCDAUtil dumpErrorInfo CDHC enabled but configuration error found

You will see that for every time someone tries to down load a plus 1MB CCM file. The user will get a nasty error in the UI. Oddly, if you navigate to the VERSIONS tab of the file you can download it that way. That makes me believe that approach uses WAS to down load the file as opposed to IHS but only from the VERSIONS tab…. Odd.

The way I resolved it was as follows, mind I had to restart the application server twice.

Stopped the CCMCluster server
Renamed D:\IBM\Connections\data\shared\ccmcache\fncs-cdhc-******-9445
Deleted D:\IBM\WebSphere\AppServer\profiles\AppSrv01\temp\Node01\CCMCluster_server1
Started the server
I could not download a previus file and nor could I down load a plus 1MB file after uploading it
The directory in D:\IBM\Connections\data\shared\ccmcache\ was not created
Stopped the CCMCluster server
Deleted D:\IBM\WebSphere\AppServer\profiles\AppSrv01\temp\Node01\CCMCluster_server1
Started the server
Tried uploading various files and then downloading them and then I saw the text written to the systemOut.log and a file downloaded.
All other files worked

It baffled me since I didn’t make any configuration changes to get this working. This has happened twice now and I envisage it will happen again. I have opened a PMR to see whether IBM can shed any light on this.

↧

IBM Connections Metrics graphs stop working after 365 days with CAM-CRP-1098 error about CSK

May 6, 2016, 8:11 am

≫ Next: IBM DB2 NUMDB value changes when migrating IBM Connections databases causing application problems

≪ Previous: CCM file downloads via IHS randomly does not work

A Connections 5.0 CR03 customer logged a ticket because graphs stopped working in Metrics, no data was shown in any community or global Metrics.

In pogo_*.log I found the following errors.

2016-05-06 11:55:59.536 ERROR [.handlers.dispatchercache.CacheFileHandler] WebContainer : 8: Failed during cache file read CAM-CRP-1098 Unable to find the common symmetric key with alias ‘************’ in the keystore ‘/opt/IBM/Cognos/CognosBI/configuration/csk/jCSKKeystore’.

2016-05-06 11:56:15.839 ERROR [.handlers.dispatchercache.CacheFileHandler] WebContainer : 5: Failed to get URL parameters for dispatcher cache file serve CAM-CRP-1098 Unable to find the common symmetric key with alias ‘****************=’ in the keystore ‘/opt/IBM/Cognos/CognosBI/configuration/csk/jCSKKeystore’.

To restore service I restarted Cognos and all was well.

After a little bit of digging it looks like there is a common symmetric key (CSK) and the default for these are 365 days. I looked at the date some of the files and directories were created and they were around a year ago, give or take a couple of days.

Looks like what has happened is that the configuration client has not been opened for a year. On opening and saving the client saves the configuration and also refreshes the CSK key.

After a year the CSK rolls and is recreated causing the errors and the problems I observed. It’s not until a restart that it starts working again having read in the new keys.

To make sure this doesn’t happen again you can open the client using cogconfig.sh (BI and Transformer) and X11 or cogconfig.bat and increase the lifetime value. On save and close the keys will be updated. You should restart Cognos afterwards.

If you’d prefer to avoid the client or can’t X11 then update the following files.

/opt/IBM/Cognos/Transformer/configuration/cogstartup.xml

/opt/IBM/Cognos/CognosBI/configuration/cogstartup.xml

Changing the following value

<crn:parameter name=”CSKLifetime”>
<crn:value xsi:type=”xsd:long”>365</crn:value>
</crn:parameter>

Save and close the file and then run cogconfig.sh -config which will then create cogconfig_response.csv.*date*_*time*.log and in there you can see the keys have been refreshed.

INFO, “[main]”, “Silent Execution Mode (start)”
EXEC, “[main]”, “Loading configuration file”
SUCCESS, “[main]”, “Completed successfully.”
EXEC, “[Cryptography]”, “Generating cryptographic information”
SUCCESS, “[Cryptography]”, “Completed successfully.”
EXEC, “[Save Configuration]”, “Saving configuration parameters”
SUCCESS, “[Save Configuration]”, “Completed successfully.”
INFO, “[main]”, “Silent Execution Mode (end)”

↧

IBM DB2 NUMDB value changes when migrating IBM Connections databases causing application problems

May 10, 2016, 12:54 pm

≫ Next: Sametime 9.0.1 Meeting server update fails due to DEPSTATUS of partial

≪ Previous: IBM Connections Metrics graphs stop working after 365 days with CAM-CRP-1098 error about CSK

A very recent go live of IBM Connections 5.5 from 4.5 resulted in an error affecting Metrics. Metrics was not working what so ever.

A look at the cogserver.logs showed DB2 exceptions. I checked the DB2 client on the Cognos node, it could connect. I noticed that all the daily refreshes failed so it must have been database related.

I checked the value of the numdb having set the value to 25 after the databases were created prior to transfer of the 4.5 data. Running db2 get dbm cfg | find “NUMDB” gave me the value of 15 which is not what I set. I checked my notes and I did set it to 25.

I looked at db2diag.log and could see that the value changed at the time of go live, in fact it had changed after the databases were created and after I had changed the value to 25.

PID     : 7376                 TID : 3468           PROC : db2bp.exe
INSTANCE: DB2                  NODE : 000           DB   : HOMEPAGE
APPID   : *LOCAL.DB2.
HOSTNAME:
EDUID   : 3468
FUNCTION: DB2 UDB, config/install, sqlfLogUpdateCfgParam, probe:30
CHANGE : CFG DBM: “Numdb” From: “25” To: “15”

The above entry was at the same time that activity was happening with HOMEPAGE database.

I checked the scripts I had written and all seemed well. I then checked the lcwizard logs directory and found in C:\Users\db2admin\lcWizard\log\dbWizard\homepage_upgrade-50CR3-55.log the following

UPDATE DBM CFG USING NUMDB 15
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.

In ..\Wizards\connections.sql\homepage\db2\upgrade-50CR3-55.sql I found the culprit.

— —————————————————————–
— Defect 164873:
— —————————————————————–

UPDATE DBM CFG USING NUMDB 15@

I really have no idea why IBM would put that in there. There are 16 databases if you include FEBDB. Why have it for HOMEPAGE and not the other database scripts?

I updated the number of databases and then stopped all the application servers and restarted DB2 for good measure and all is well now.

↧

Sametime 9.0.1 Meeting server update fails due to DEPSTATUS of partial

May 19, 2016, 9:25 am

≫ Next: Whiteboard in Sametime 9.0.1

≪ Previous: IBM DB2 NUMDB value changes when migrating IBM Connections databases causing application problems

I updated the SSC, Community server and Sametime Proxy fine but the Meeting server (on Windows) was failing.

In the SSCLogs directory I got the following:

2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.SCLogger init ******************************************************
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.api.Deployment setURLTimeout Cannot run program “env”: CreateProcess error=2, The system cannot find the file specified
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getBaseURL 172.xx.xx.xxx
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getOutput Is SSLEnabled = true
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL registerMyTrustManager Entered registerMyTrustManager()
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL registerMyTrustManager Exit registerMyTrustManager()
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getOutput AIDSC0877I: The complete URL is :https://ssc..acme.com:9443/console/deployment/login
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getOutput Timeout Set is :60000
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL registerMyTrustManager Server Certificate for: CN=ssc.acme.com, OU=STAppCell, OU=sscSSCNode, O=IBM, C=US
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.ConnectionThread run AIDSC1482I: Timer is cancelled since URL hit received response.
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL constructData AIDSC0903I: The data passed to server is :ProductType=com.ibm.lotus.sametime.meetingserver&Hostname=meetings.acme.com&InstallType=PN&ComponentName=
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getBaseURL 172.xx.xx.xxx
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL getBaseURL 172.xx.xx.xxx
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.RestURL sendData AIDSC0877I: The complete URL is :https://ssc.acme.com:9443/console/deployment/getInstallDepId
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.util.ClientUtility getDepId AIDSC0910I: Response from server :<?xml version=”1.0″ encoding=”UTF-8″?><GetDepID><Id></Id></GetDepID>
2016:5:17 22:14:29 com.ibm.sametime.console.deployment.client.api.Deployment getDepId AIDSC0870I: The Deployment ID is :null

I started looking in to “Cannot run program “env”: CreateProcess error=2, The system cannot find the file specified” and “AIDSC0870I: The Deployment ID is :null” and formulated all kinds of theories such as a bug that was trying to run the “env” command which is a Linux command which has a Windows counterpart of “set.” I exhausted my investigation and enabled trace on STConsoleServer (*=info: com.ibm.sametime.*=all) as well as adding log.properties to the logs directory of IBM Installation Manager

I raised a PMR which was quickly turned around.

In the STConsoleServer trace.log the following was found.

[5/19/16 13:21:29:190 BST] 0000033d DeploymentImp > DeploymentImpl getStatus ENTRY
[5/19/16 13:21:29:193 BST] 0000033d DeploymentImp I DeploymentImpl getStatus AIDSC1486I: Size of object 1
[5/19/16 13:21:29:193 BST] 0000033d DeploymentImp I DeploymentImpl getStatus AIDSC1488I: Query Result = 8
[5/19/16 13:21:29:193 BST] 0000033d DeploymentImp < DeploymentImpl getStatus RETURN

In the SSC I saw that the deployment plan had a status of “partial.” I haven’t a clue when this changed.

$ db2 connect to STSC

$ db2 “select * from SSC.DEPLOYMENT where PRODUCTTYPE = ‘com.ibm.lotus.sametime.meetingserver'”

$ db2 “select DEPSTATUS from SSC.DEPLOYMENT where PRODUCTTYPE = ‘com.ibm.lotus.sametime.meetingserver'”

DEPSTATUS
—————————————————————————————————-
8

$ db2 “select DepID from ssc.deployment where DEPSTATUS = ‘8’”

DEPID
————————————————————————————————————————————————————————————————————————————–
14f4cd91acd-00000000000a-MTGSDep

The fix was to change the DEPSTATUS to 1542. This is done by backing up the database first of all and then using the earlier commands to obtain the DepID change the DEPSTATUS

$ db2 “UPDATE ssc.deployment SET DEPSTATUS = ‘1542’ WHERE DepID = ’14f4cd91acd-00000000000a-MTGSDep'”

In the SSC the deployment status is correct.

Now IBM IM passes the validation stage and I can update the Meeting server.

↧

Whiteboard in Sametime 9.0.1

May 20, 2016, 1:24 am

≫ Next: Sametime Proxy web client to web client audio and video

≪ Previous: Sametime 9.0.1 Meeting server update fails due to DEPSTATUS of partial

Having just got over a problem with the Meeting server detailed in Sametime 9.0.1 Meeting server update fails due to DEPSTATUS of partial I wanted to look at the whiteboard feature that is in 9.0.1.

On a recent TechTalk whiteboard was mentioned but it was made clear that it is not supported and is not enabled by default.

The way to enable it is pretty easy.

Enterprise Applications -> Sametime Meeting Server -> Manage Modules

Map Core Whiteboard Services to STMeetingServer

Enterprise Applications -> Sametime Meeting Server -> Virtual Hosts

Map Core Whiteboard Services to the relevant VH

SSC -> Sametime System Console -> Sametime Servers -> Sametime Meeting Servers

Change whiteboard.enabled to be true and I also updated whiteboard.fileio.codebase to a different path.

Sync the node and restart the Meeting server.

When you log in next you’ll see the following.

It’s rather nice. If you draw a shape it will try to auto correct it smoothing out any of the shaky cursor lines to create a circle or square etc

It will be interesting to see why it’s not currently supported or what the implications are of enabling it. It may be that it was added late in the development process and hasn’t been tested thoroughly enough. Nevertheless it will be a useful tool.

I haven’t played much with it but now you can do it for yourselves!

↧

Sametime Proxy web client to web client audio and video

May 27, 2016, 4:59 am

≫ Next: Rich Content widget widget stops working due to mix matched SSL protocols

≪ Previous: Whiteboard in Sametime 9.0.1

In a recent New Way To Learn session hosted by Frank Altenburg he gave us the changes necessary to enable this feature but my brief testing has been mixed.

To enable it you change stproxyconfig.xml in /opt/IBM/WebSphere/AppServer/profiles/STSCDMgrProfile/config/cells/SametimeSSCCell/nodes/*******/servers/STProxyServer/ for Linux adding “<onetoonefeature>true</onetoonefeature>”

Sync the nodes and then restart the Sametime Proxy server.

When you log into Sametime Proxy you’ll see that you need to install the WebPlayer plugin if you haven’t already as shown by the stars next to the two new icons.

You’ll need to accept some pop ups allowing the plugin to run.

My brief testing was mixed. I was testing on two Windows laptops and hadn’t restarted them after the plugin was installed not that it stopped me from using AV in a meeting. In most cases I saw “Call unavailable for Selected Contact” even though they were both web clients with the plugin installed.

I’ll test some more over the weekend. Let me know if anyone gets better results.

Remember this is a technology preview and may not be ready for production use!

↧

Rich Content widget widget stops working due to mix matched SSL protocols

June 21, 2016, 1:41 am

≫ Next: Restart script for IBM Connections Apache Solr aka Type Ahead search

≪ Previous: Sametime Proxy web client to web client audio and video

For an IBM Connections 5.5 customer I found that periodically the Rich Content widget (RTE) stops working showing “The page could not be created due to an error” like the image below.

The following exceptions are seen in the SystemOut.log and access.log.

[6/21/16 10:10:34:889 CEST] 00000282 ConnectContro W org.springframework.social.connect.web.ConnectController oauth2Callback Exception while handling OAuth2 callback (I/O error on POST request for “https://connections.acme.com/oauth2/endpoint/connectionsProvider/token”:Server returned wrong cipher suite for session; nested exception is javax.net.ssl.SSLProtocolException: Server returned wrong cipher suite for session). Redirecting to IBMConnections connection status page.
[6/21/16 10:10:34:904 CEST] 00001811 ServletWrappe E com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0014E: Uncaught service() exception root cause rteServlet: org.springframework.web.util.NestedServletException: Request processing failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name ‘scopedTarget.ibmConnections’ defined in com.ibm.lconn.rte.config.SocialConfig: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springframework.social.connections.api.IBMConnections]: Factory method ‘ibmConnections’ threw exception; nested exception is java.lang.IndexOutOfBoundsException
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:982)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:575)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)

46.58.58.130 – – [21/Jun/2016:10:10:34 +0200] “GET /oauth2/endpoint/connectionsProvider/authorize?client_id=conn-rte&response_type=code&acls=true&callback_uri=https%253A%252F%252Fconnections.acme.com%252Fconnections%252Frte%252Fcommunity%252F8e3190fe-fb3e-4ebd-83d4-1f319e7a5522%252Fentry&_oauth_client_auto_authorize=true&scope=Connections HTTP/1.1” 302 –
46.58.104.37 – – [21/Jun/2016:10:10:34 +0200] “GET /dm/atom/people/feed?self=true&format=json&anonymousCheck=true&membershipTimestamp=1461057779000&dojo.preventCache=1466496635731 HTTP/1.1” 200 404
46.58.58.130 – – [21/Jun/2016:10:10:34 +0200] “GET /connections/rte/connect?code=iK97VSiDgGJzorfylyBpEVt3OJ800r HTTP/1.1” 202 104
46.58.104.37 – – [21/Jun/2016:10:10:34 +0200] “GET /connections/rte/community/8e3190fe-fb3e-4ebd-83d4-1f319e7a5522/entry?acls=true HTTP/1.1” 500 615

There was an interesting thread in the Socially Integrated blog which was along the same lines but not a match so I raised a PMR. The ever dependable Dave McCarthy and Kevin Holohan shed some light on what was happening.

“According to a chat I had with L3, this is a bug in the JVM. If a connection is established with TLS 1.0 to a specific host (like say host1.ibm.com) then it caches the TLS cipher that was exchanged with the host so that it doesn’t have to do negotiate again. Now if another app in JVM tries to establish connections with TLS 1.2 to the same host (host1.ibm.com) the cached info is reused. This causes the problem. Host1.ibm.com is requesting TLS 1.2 but cached connection in JVM is at TLS 1.0”

It seems that the RTE application is not honouring the protocol as detailed in the Quality of protection (QoP) settings in the ISC and is trying to connect using TLS 1.2. This has triggered Technote How to Force IBM Connections 5.5 CR1 to Use TLSv1.2.

Furthermore I was made aware of LO89164 which is an update to RTE so that it utilises the intraservice URLs for each of the connections components and is be able to direct the HTTP requests issued by RTE component to the appropriate internal endpoint URLs, as referenced in the LCC.xml.

LO89164 can only be applied to CR01 which the customer is not running yet. LO89164 will be included in CR02. This ifix is only available from IBM by logging a PMR, it’s not on Fix Central.

When I am given a maintenance window I need to do the following:

WAS FP8
CR01
Apply LO89164 (included in 5.5 CR2)
Change QoP settings

This should cover off this problem. CR01 also includes other identified bugs in RTE which I won’t go into here but it’s worth applying….

↧

Restart script for IBM Connections Apache Solr aka Type Ahead search

June 22, 2016, 3:08 am

≫ Next: Which JRE to use with IBM Connections Solr?

≪ Previous: Rich Content widget widget stops working due to mix matched SSL protocols

I wanted to put together a script to allow me to start, stop and restart Solr so I managed to cobble together the below version which seems to work pretty well. I am sure there are others who will read this that have better *nix scripting skills who may want to improve it.

In this case I am using JRE that comes with WebSphere but you could use Oracle Java.

# vim /etc/init.d/solr

#!/bin/sh
#
# apache
#
# chkconfig: 5 90 10
# description: Restart script for Apache Solr created by Ben.Williams@chooseportal.com
SOLR_DIR=”/opt/IBM/solr/solr-4.7.2/node1″
JAVA_STOP_OPTIONS=”-DSTOP.PORT=8079 -DSTOP.KEY=mysecret -jar start.jar”
JAVA_START_OPTIONS=”-DSTOP.PORT=8079 -DSTOP.KEY=mysecret -Djetty.ssl.clientAuth=true -Dhost=$(hostname –fqdn) -Dcollection.configName=myConf -DzkRun -DnumShards=1 -Dbootstrap_confdir=/opt/IBM/solr/solr-4.7.2/node1/solr/quick-results-collection/conf -Dsolr.solr.home=/opt/IBM/solr/solr-4.7.2/node1/solr -jar start.jar”
#LOG_FILE=”/opt/IBM/solr/solr-4.7.2/node1/logs/solr.log”
JAVA_HOME=”/opt/IBM/WebSphere/AppServer/java/bin/java”
RETVAL=$?
case $1 in
    start)
        echo “Starting Solr”
        cd $SOLR_DIR
        $JAVA_HOME $JAVA_START_OPTIONS >/dev/null 2>&1 &
        ;;
    stop)
        echo “Stopping Solr”
        cd $SOLR_DIR
        $JAVA_HOME $JAVA_STOP_OPTIONS –stop >/dev/null 2>&1 &
        ;;
    restart)
        $0 stop
        sleep 10
        $0 start
        ;;
    *)
        echo “Usage: $0 {start|stop|restart}” >&2
        exit 1
        ;;
esac
exit $RETVAL

# chmod 755 /etc/init.d/solr
# chkconfig –add solr
# chkconfig –level 35 solr on

Here is an Oracle JRE suitable version.

#!/bin/sh
#
# apache
#
# chkconfig: 5 90 10
# description: Restart script for Apache Solr created by Ben.Williams@chooseportal.com
SOLR_DIR=”/opt/IBM/solr/solr-4.7.2/node1″
JAVA_STOP_OPTIONS=”-DSTOP.PORT=8079 -DSTOP.KEY=mysecret -jar start.jar”
JAVA_START_OPTIONS=”-DSTOP.PORT=8079 -DSTOP.KEY=mysecret -Djetty.ssl.clientAuth=true -Dhost=$(hostname –fqdn) -Dcollection.configName=myConf -DzkRun -DnumShards=1 -Dbootstrap_confdir=/opt/IBM/solr/solr-4.7.2/node1/solr/quick-results-collection/conf -Dsolr.solr.home=/opt/IBM/solr/solr-4.7.2/node1/solr -jar start.jar”
JAVA_HOME=”/usr/lib/jvm/jre/bin/java”
RETVAL=$?
case $1 in
    start)
        echo “Starting Solr”
        cd $SOLR_DIR
        $JAVA_HOME $JAVA_START_OPTIONS >/dev/null 2>&1 &
        ;;
    stop)
        echo “Stopping Solr”
        cd $SOLR_DIR
        $JAVA_HOME $JAVA_STOP_OPTIONS –stop >/dev/null 2>&1 &
        ;;
    restart)
        $0 stop
        sleep 10
        $0 start
        ;;
    *)
        echo “Usage: $0 {start|stop|restart}” >&2
        exit 1
        ;;
esac
exit $RETVAL

↧

Which JRE to use with IBM Connections Solr?

June 22, 2016, 3:32 am

≫ Next: IBM Connections Mail and Ephemeral Diffie-Hellman key size error

≪ Previous: Restart script for IBM Connections Apache Solr aka Type Ahead search

I noticed on another deployment of IBM Connections 5.5 which includes Solr (Type Ahead Search) the Admin dashboard reported less information as before. The difference is that I used the JRE with WebSphere.

It seems that the output provided in the dashboard is less when using the JRE with WebSphere. There is no stipulation of which JRE should be used but it seems sensible to install Oracle Java since some of the data is missing. Take a look at the screen shots below.

Oracle JRE

IBM JRE

↧

IBM Connections Mail and Ephemeral Diffie-Hellman key size error

July 12, 2016, 7:19 am

≫ Next: Forcing TLSv1.2 breaks IBM Connections Surveys and Textbox.io

≪ Previous: Which JRE to use with IBM Connections Solr?

I’m building an IBM Connections 5.5 server to replace our internal Connections server and when configuring the Mail plug-in I came up against problems with the error “Mail server cannot be reached.”

The Domino iNotes server is configured to accept SSL and have SSLv3 disabled via DISABLE_SSLV3=1. SSO works in both directions between the two application servers.

I checked the discoveryservlet URL (https://connections.acme.com/connections/resources/discovery/DiscoveryServlet?email=ben.williams@chooseportal.com) which returned valid data so I know the configuration in socialmail-discovery-config.xml was good but there was very little to go on. Even after I enabled *=info:com.ibm.social.pim.discovery.*=all there was nothing much to go on.

I reached out and Michele Buccarello responded and pointed me towards one of his documents http://www.slideshare.net/michelebuccarello/connections-mail-with-exchange-backend. The document is written primarily for an Exchange server but it describes brilliantly what is happening and a bit of trace that came to my rescue.

I enabled *=info:com.ibm.social.pim.discovery.*=all:com.ibm.cre.*=all and all of a sudden I saw what was happening.

[7/12/16 13:49:33:787 BST] 00000220 CREURLConnect 2 An unhandled exception occured connecting to the target host
javax.net.ssl.SSLException: java.lang.RuntimeException: Could not generate DH keypair

Caused by: java.lang.RuntimeException: Could not generate DH keypair

Caused by: java.security.InvalidAlgorithmParameterException: Prime size must be multiple of 64, and can only range from 256 to 2048 (inclusive)

I read around the various ciphers and I must admit I was a little lost and it’s been a while since I’ve delved deeply into Domino but some googling got me to some Daniel Nashed blogs.

http://blog.nashcom.de/nashcomblog.nsf/dx/first-perfect-forward-secrecy-ciphers-shipped-with-9.0.1-fp3-if2.htm?opendocument&comments#anc1

http://blog.nashcom.de/nashcomblog.nsf/dx/dha-with-more-than-1024-key-size-and-java-still-works.htm?opendocument&comments#anc1

The second had a comment about the Mail plug-in not working so I knew I was getting closer. This put various stackoverflow posts into perspective such as

http://stackoverflow.com/questions/6851461/java-why-does-ssl-handshake-give-could-not-generate-dh-keypair-exception

I stopped Domino and added the following before starting it again and the plug-in started working and I could access my mail and calendar.

SSL_DH_KEYSIZE=1024

I upped the value to 2048 since a previous error said “Prime size must be multiple of 64, and can only range from 256 to 2048 (inclusive).”

On restart of Domino it continued to work. I tried increasing the value to 3072 but this broke the plug-in.

The certificate I was provided was a 4096 bit certificate and not 2048 like I handle more often.

In https://www-10.lotus.com/ldd/dominowiki.nsf/dx/TLS_Cipher_Configuration it states, “By default, these ciphers will use a DH key with a size equivalent to the RSA keysize, so a server running with a 2048 bit SSL certificate would use a 2048 bit DH group.” This means that the DH key being used is 4096 which IBM’s implementation of Java doesn’t support, hence the need to add SSL_DH_KEYSIZE=2048.

I then found the following Domino trace.

DEBUG_SSL_CIPHERS=2
DEBUG_SSL_DHE=2
DEBUG_SSL_HANDSHAKE=2
DEBUG_SSL_IO=0

When I recreate the problem I see in the console.log the following which shows the DH key size.

[11856:00011-1753671424] 07/12/2016 01:50:03.16 PM SSLEncodeDHKeyParams> Server RSA key size 4096 bits
[11856:00011-1753671424] 07/12/2016 01:50:03.16 PM SSLEncodeDHKeyParams> Using a DH key size of 4096 bits
[11856:00011-1753671424] 07/12/2016 01:50:03.26 PM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerHelloDone
[11856:00011-1753671424] 07/12/2016 01:50:03.26 PM SSLAdvanceHandshake Exit> State HandshakeClientKeyExchange (11)
[11856:00011-1753671424] 07/12/2016 01:50:03.26 PM SSL_Handshake> After handshake state = HandshakeClientKeyExchange (11); Status = -5000
[11856:00011-1753671424] 07/12/2016 01:50:03.26 PM int_MapSSLError> Mapping SSL error -5000 to 4176 [SSLHandshakeNoDone]
[11856:00011-1753671424] 07/12/2016 01:50:03.27 PM SSLProcessProtocolMessage> Record Content: Alert (21)
[11856:00011-1753671424] 07/12/2016 01:50:03.27 PM SSLProcessAlert> Got an alert of 0x50 (internal_error) level 0x2 (fatal)
[11856:00011-1753671424] 07/12/2016 01:50:03.27 PM SSL_Handshake> After handshake2 state HandshakeClientKeyExchange (11)
[11856:00011-1753671424] 07/12/2016 01:50:03.27 PM SSL_Handshake> SSL Error: -6994
[11856:00011-1753671424] 07/12/2016 01:50:03.27 PM int_MapSSLError> Mapping SSL error -6994 to 4171 [SSLFatalAlert]

In https://www-10.lotus.com/ldd/dominowiki.nsf/dx/TLS_Cipher_Configuration it also states, “When using Domino 9.0.1 FP3 IF2 one can and should disable DHE_RSA_WITH_AES_128_CBC_SHA (33) which should make those old clients fall back to using RSA_WITH_AES_128_CBC_SHA (2F) instead.”

I tried the below setting which removes “33” to see whether it worked but it did not. I would like to fiddle more with this to try and find a cipher that WAS and Domino can use in common that avoids setting the DH key too low but I suspect I will run out of time.

SSLCIPHERSPEC=9D9C3D3C352F0A39676B9E9F

BTW – I did all this after I had forced TLS1.2 via How to Force IBM Connections 5.5 CR1 to Use TLSv1.2 which is nice to know that Mail is not broken after enforcing TLS1.2 unlike Textbox.io and Surveys…..

Oh, in Domino when it is successful it will look something like this.

[17825:00011-575325952] 07/12/2016 03:12:00.31 PM SSLEncodeDHKeyParams> Server RSA key size 4096 bits
[17825:00011-575325952] 07/12/2016 03:12:00.31 PM SSLEncodeDHKeyParams> Using a DH key size of 2048 bits
[17825:00011-575325952] 07/12/2016 03:12:00.32 PM SSLEncodeRSAServerKeyExchange> Signing ServerKeyExchange using RSAWithSHA256
[17825:00011-575325952] 07/12/2016 03:12:00.36 PM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerHelloDone

[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> After handshake2 state HandshakeServerIdle (3)
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> Protocol Version = TLS1.2 (0x303)
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> Cipher = DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> KeySize = 256 bits
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> Ephemeral Diffie-Hellman key size = 2048 bits
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> Server RSA key size = 4096 bits
[17825:00011-575325952] 07/12/2016 03:12:00.41 PM SSL_Handshake> TLS/SSL Handshake completed successfully

You won’t see much difference in trace.log.

If anyone has a better way to get around this without changing the value of the DH key size then please shout.

Additional information

I should mention that a useful tip Michele Buccarello pointed me towards taking a Fiddler trace.

You’ll see a call to /connections/opensocial/gadgets/makeRequest. Within that entry in Fiddler I saw 502 Bad Gateway

throw 1; < ‘invalid javascript’ >{“https://webmail.acme.com/mail/bwilliam.nsf/iNotes/Proxy/?OpenDocument&Form=f_SessionInfo_Data&_icmb=20160425-0501”:{“rc”:502,”body”:”&amp;lt;HTML&amp;gt;&amp;lt;TITLE&amp;gt;502&amp;nbsp;-&amp;nbsp;Bad&amp;nbsp;Gateway&amp;lt;/TITLE&amp;gt;&amp;lt;BODY&amp;gt;&amp;lt;h1&amp;gt;502&amp;nbsp;An&amp;nbsp;unhandled&amp;nbsp;exception&amp;nbsp;occured&amp;nbsp;connecting&amp;nbsp;to&amp;nbsp;the&amp;nbsp;target&amp;nbsp;host&amp;lt;/h1&amp;gt;&amp;lt;/BODY&amp;gt;&amp;lt;/HTML&amp;gt;”,”headers”:{“date”:[“Mon, 11 Jul 2016 21:30:46 GMT”],”content-type”:[“text/html; charset=UTF-8″]},”DataHash”:”jslu7s57e7d899jbtr7p1d033g”}}

You can also look at the JSON section to see it in a different format.

The above is also seen in the trace.log with *=info:com.ibm.social.pim.discovery.*=all:com.ibm.cre.*=all

[7/12/16 8:49:30:670 BST] 000001bb CREURLConnect 2 IOException caught, response code is 502, Exception was java.io.IOException: Server returned HTTP response code: 502 for URL: https://webmail.acme.com/mail/bwilliam.nsf/iNotes/Proxy/?OpenDocument&Form=s_ReadViewEntries_JSON&PresetFields=FolderName;($Inbox),UnreadOnly;1,UnreadCountInfo;1,hc;$98&Count=1&resortdescendingpn=$70&TZType=UTC&KeyType=time&_icmb=20160425-0501
[7/12/16 8:49:30:671 BST] 000001bb CREURLConnect 2 Retry error while in streaming mode: 502, java.io.IOException: Server returned HTTP response code: 502 for URL: https://webmail.acme.com/mail/bwilliam.nsf/iNotes/Proxy/?OpenDocument&Form=s_ReadViewEntries_JSON&PresetFields=FolderName;($Inbox),UnreadOnly;1,UnreadCountInfo;1,hc;$98&Count=1&resortdescendingpn=$70&TZType=UTC&KeyType=time&_icmb=20160425-0501

↧

Forcing TLSv1.2 breaks IBM Connections Surveys and Textbox.io

July 12, 2016, 1:38 pm

≫ Next: IBM Connections Mail and Ephemeral Diffie-Hellman key size error – part 2

≪ Previous: IBM Connections Mail and Ephemeral Diffie-Hellman key size error

I had to force TLSv1.2 across all of Connections to fix a problem with RTE as I detailed in Rich Content widget widget stops working due to mix matched SSL protocols but after testing I’ve found that this breaks Textbox.io in Chrome and Surveys.

The process is well documented in How to Force IBM Connections 5.5 CR1 to Use TLSv1.2 but after making the changes the following happens.

Textbox.io

In IE and FF Textbox.io works fine but in Chrome the spell check service fails.

In Fiddler trace is saw Spelling server error: Could not load url “https://connections.acme.com/ephox-spelling/1/correction”: 500 Internal Server Error

In the SystemOut.log I saw

[6/29/16 10:00:38:507 BST] 00000200 SystemOut O ironbark-akka.actor.default-dispatcher-17, RECV TLSv1 ALERT: fatal, handshake_failure
[6/29/16 10:00:38:507 BST] 00000200 SystemOut O ironbark-akka.actor.default-dispatcher-17, fatal: engine already closed. Rethrowing javax.net.ssl.SSLException: Received fatal alert: handshake_failure

spray.can.Http$ConnectionException: Aborted
   at spray.can.client.HttpHostConnectionSlot.reportDisconnection(HttpHostConnectionSlot.scala:228) ~[spray-can_2.11-1.3.3.jar:na]
   at spray.can.client.HttpHostConnectionSlot$$anonfun$connected$1.applyOrElse(HttpHostConnectionSlot.scala:161) ~[spray-can_2.11-1.3.3.jar:na]
   at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.9.jar:na]

IBM asked me to put on SSL trace, *=info:SSL=all. It seems that the client is sending TLSv1.0 which of course is not allowed now TLSv1.2 has been forced.

[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, READ: TLSv1.2 Alert, length = 2
[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, RECV TLSv1 ALERT: fatal, handshake_failure
[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, fatal: engine already closed. Rethrowing javax.net.ssl.SSLException: Received fatal alert: handshake_failure

IBM have logged a ticket with Ephox as well as investigating it from there end.

Surveys

When in a community with previous surveys I can not see any of the historical surveys nor could I create new ones.

In the SystemOut.log I saw the following

[6/29/16 10:01:56:542 BST] 0000033b StandardExcep E com.ibm.form.nitro.platform.StandardExceptionMapper toResponse eaa8e54e-7c38-4edb-a5ca-bcbd6d7f6c64
com.ibm.form.platform.service.framework.exception.ServicesPlatformException: com.ibm.connections.directory.services.exception.DSException: com.ibm.connections.directory.services.exception.DSOutOfServiceException: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Caused by: com.ibm.connections.directory.services.exception.DSException: com.ibm.connections.directory.services.exception.DSOutOfServiceException: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Again, it seems like it’s sending TLSv1.0.

There’s at least one other person I know of who’s logged a PMR for these problems. It’s fairly urgent due to a problem with the RTE application which is only fixed when TLS1.2 is enforced. I’m hoping that these problems can be resolved sharpish so I can resolve the RTE problem for a customer.

↧

IBM Connections Mail and Ephemeral Diffie-Hellman key size error – part 2

July 18, 2016, 12:30 am

≫ Next: End to Surveys problems in IBM Connections 5.0?

≪ Previous: Forcing TLSv1.2 breaks IBM Connections Surveys and Textbox.io

I wrote about the effects using DHE ciphers can have depending on the size of the SSL certificate used by iNotes when IBM Connections Mail is in play in IBM Connections Mail and Ephemeral Diffie-Hellman key size error

In this blog I suggested the work around was to use the following notes.ini setting.

SSL_DH_KEYSIZE=2048

Our Domino admins weren’t too keen on lowering the key size so I had to look into a way of forcing the server to use a different cipher instead of one of the DHE ciphers.

This is the output from Domino when the DHE cipher is in play.

[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested RSA_WITH_AES_128_CBC_SHA (0x002F)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common cipherspec 0x002F (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common non-EC cipherspec 0x002F (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested RSA_WITH_AES_256_CBC_SHA (0x0035)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common cipherspec 0x0035 (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common non-EC cipherspec 0x0035 (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested DHE_RSA_WITH_AES_128_CBC_SHA (0x0033)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common cipherspec 0x0039 (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Best common non-EC cipherspec 0x0039 (so far)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested DHE_DSS_WITH_AES_128_CBC_SHA (0x0032)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested DHE_DSS_WITH_AES_256_CBC_SHA (0x0038)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested RSA_WITH_3DES_EDE_CBC_SHA (0x000A)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested DHE_RSA_WITH_3DES_EDE_CBC_SHA (0x0016)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested Unknown Cipher (0x0013)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client requested TLS_EMPTY_RENEGOTIATION_INFO_SCSV (0x00FF)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> TLS_EMPTY_RENEGOTIATION_INFO_SCSV found
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Extensions found in this message
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Processing TLS signature algorithms extension
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> Client supports hash mask 0x007E; server cert chain has mask 0x0030
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> hash/alg in certchain fSupHasAlg:0000
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessClientHello> We selected cipher DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLProcessHandshakeMessage Exit> Message: ClientHello (1) State: HandshakeServerIdle (3) Key Exchange: 9 Cipher: DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLAdvanceHandshake Enter> Processed: ClientHello (1) State: HandshakeServerIdle (3)
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLAdvanceHandshake client_hello> SGC FLAG: 0 Count = 2
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerHello
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLEncodeServerHello> Sending empty renegotiation_info (0xff01) extension
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeCertificate
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLEncodeCertificate> Generating a certificate message with 3 certs
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerKeyExchange
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLEncodeDHKeyParams> Server RSA key size 4096 bits
[00403:00011-2285692672] 07/15/2016 11:07:54.94 AM SSLEncodeDHKeyParams> Using a DH key size of 4096 bits
[00403:00011-2285692672] 07/15/2016 11:07:55.01 AM SSLEncodeRSAServerKeyExchange> Signing ServerKeyExchange using RSAWithSHA256
[00403:00011-2285692672] 07/15/2016 11:07:55.04 AM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerHelloDone
[00403:00011-2285692672] 07/15/2016 11:07:55.04 AM SSLAdvanceHandshake Exit> State HandshakeClientKeyExchange (11)
[00403:00011-2285692672] 07/15/2016 11:07:55.04 AM SSL_Handshake> After handshake state = HandshakeClientKeyExchange (11); Status = -5000
[00403:00011-2285692672] 07/15/2016 11:07:55.04 AM int_MapSSLError> Mapping SSL error -5000 to 4176 [SSLHandshakeNoDone]
[00403:00011-2285692672] 07/15/2016 11:07:55.06 AM SSLProcessProtocolMessage> Record Content: Alert (21)
[00403:00011-2285692672] 07/15/2016 11:07:55.06 AM SSLProcessAlert> Got an alert of 0x50 (internal_error) level 0x2 (fatal)
[00403:00011-2285692672] 07/15/2016 11:07:55.06 AM SSL_Handshake> After handshake2 state HandshakeClientKeyExchange (11)
[00403:00011-2285692672] 07/15/2016 11:07:55.06 AM SSL_Handshake> SSL Error: -6994
[00403:00011-2285692672] 07/15/2016 11:07:55.06 AM int_MapSSLError> Mapping SSL error -6994 to 4171 [SSLFatalAlert]

The idea was to remove the DHE_RSA_WITH_AES_256_CBC_SHA (0x0039) from the list of supported ciphers.

You can do this by dictating all the ciphers Domino uses using the SSLCipherSpec notes.ini setting.

I stopped Domino and added to the notes.ini the following and then started Domino.

SSLCipherSpec=C030009FC02F009EC028006BC014C0270067C013009D009C003D0035003C02F000A

You can see in the string 0039 is not listed. This means that Domino will not use DHE_RSA_WITH_AES_256_CBC_SHA and another cipher will be negotiated.

On restart you can see that the cipher RSA_WITH_AES_256_CBC_SHA is now selected and that is being used which works.

[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLInitContext> Ignoring invalid SSLCipherSpec value F0
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLInitContext> User is forcing 0xFFF3800 cipher spec bitmask for 15 ciphers
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_TRUSTPOLICY> bits for signature hashes: 0030
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM int_MapSSLError> Mapping SSL error 0 to 0 [SSLNoErr]
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> outgoing ->protocolVersion: 0303
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessProtocolMessage> Record Content: Handshake (22)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessHandshakeMessage Enter> Message: ClientHello (1) State: HandshakeServerIdle (3) Key Exchange: 0 Cipher: Unknown Cipher (0x0000)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessHandshakeMessage client_hello> SGC FLAG: 0 CTX state = 3 SGCCount = 0
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> clientVersion: 0303
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> SSL/TLS protocol clientVersion 0x0303, serverVersion 0x0303
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> 10 ciphers requested by client
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested RSA_WITH_AES_128_CBC_SHA (0x002F)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Best common cipherspec 0x0035 (so far)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Best common non-EC cipherspec 0x0035 (so far)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested DHE_RSA_WITH_AES_128_CBC_SHA (0x0033)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested DHE_DSS_WITH_AES_128_CBC_SHA (0x0032)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested DHE_DSS_WITH_AES_256_CBC_SHA (0x0038)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested RSA_WITH_3DES_EDE_CBC_SHA (0x000A)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested DHE_RSA_WITH_3DES_EDE_CBC_SHA (0x0016)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested Unknown Cipher (0x0013)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client requested TLS_EMPTY_RENEGOTIATION_INFO_SCSV (0x00FF)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> TLS_EMPTY_RENEGOTIATION_INFO_SCSV found
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Extensions found in this message
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Processing TLS signature algorithms extension
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> Client supports hash mask 0x007E; server cert chain has mask 0x0030
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> hash/alg in certchain fSupHasAlg:0000
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessClientHello> We selected cipher RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessHandshakeMessage Exit> Message: ClientHello (1) State: HandshakeServerIdle (3) Key Exchange: 1 Cipher: RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake Enter> Processed: ClientHello (1) State: HandshakeServerIdle (3)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake client_hello> SGC FLAG: 0 Count = 2
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake client_hello> Using resumed SSL/TLS Session
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeServerHello
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLEncodeServerHello> Sending empty renegotiation_info (0xff01) extension
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeChangeCipherSpec
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake calling SSLPrepareAndQueueMessage> SSLEncodeFinishedMessage
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLCalculateTLS12FinishedMessage Enter> senderID: server finished, PRF using SHA256
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake Exit> State HandshakeChangeCipherSpec (13)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> After handshake state = HandshakeChangeCipherSpec (13); Status = -5000
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM int_MapSSLError> Mapping SSL error -5000 to 4176 [SSLHandshakeNoDone]
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessProtocolMessage> Record Content: Change cipher spec (20)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> After handshake2 state HandshakeFinished (14)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM int_MapSSLError> Mapping SSL error -5000 to 4176 [SSLHandshakeNoDone]
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessProtocolMessage> Record Content: Handshake (22)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessHandshakeMessage Enter> Message: Finished (20) State: HandshakeFinished (14) Key Exchange: 1 Cipher: RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLCalculateTLS12FinishedMessage Enter> senderID: client finished, PRF using SHA256
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLProcessHandshakeMessage Exit> Message: Finished (20) State: HandshakeFinished (14) Key Exchange: 1 Cipher: RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake Enter> Processed: Finished (20) State: HandshakeFinished (14)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSLAdvanceHandshake Exit> State HandshakeServerIdle (3)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> After handshake2 state HandshakeServerIdle (3)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> Using resumed SSL/TLS session
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> Protocol Version = TLS1.2 (0x303)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> Cipher = RSA_WITH_AES_256_CBC_SHA (0x0035)
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> KeySize = 256 bits
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> Server RSA key size = 4096 bits
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM SSL_Handshake> TLS/SSL Handshake completed successfully
[06035:00011-2986616576] 07/15/2016 12:30:35.85 PM int_MapSSLError> Mapping SSL error 0 to 0 [SSLNoErr]

The string below includes all the ECDHE ciphers which is detailed in https://www-10.lotus.com/ldd/dominowiki.nsf/dx/TLS_Cipher_Configuration but not the DHE cipher that was tripping me up.

SSLCipherSpec=C030009FC02F009EC028006BC014C0270067C013009D009C003D0035003C02F000A

It work’s now and I have tested it with all major browsers. I’m happy and so are the Domino guys too

↧

End to Surveys problems in IBM Connections 5.0?

August 15, 2016, 1:42 am

≫ Next: Sametime and NETWORK_SPRAYER_ADDRESS

≪ Previous: IBM Connections Mail and Ephemeral Diffie-Hellman key size error – part 2

I wrote a blog post Ongoing issues with Surveys (FEB) and IBM Connections which detailed some problems a customer was having with Surveys. This dragged on and resulted in a couple of PMR’s being raised with IBM but I am hopefully at the end of it now.

Recently IBM provided me with a modified .jar to provide additional output when the problem occurred. I needed to add to the ear file. I did this as follows

# cd /opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/
# ./wsadmin.sh -lang jython
wsadmin>AdminApp.export(‘Forms Experience Builder’, ‘/tmp/Forms Experience Builder.ear’)
# cp /tmp/Forms\ Experience\ Builder.ear /tmp/Forms\ Experience\ Builder.ear.orig
# mkdir /tmp/feb_expanded
# mkdir /tmp/feb_collapsed
# /opt/IBM/WebSphere/AppServer/bin/EARExpander.sh -ear /tmp/Forms\ Experience\ Builder.ear -operationDir /tmp/feb_expanded/ -operation expand
ADMA4006I: Expanding enterprise archive (EAR) file /tmp/Forms Experience Builder.ear to directory /tmp/feb_expanded/.
# mkdir /tmp/feb_backup
# mv /tmp/feb_expanded/builder.war/WEB-INF/lib/ibm.fsp.core.service.startup-8.0.1.35.jar/ /tmp/feb_backup/
# cp -R /home/ldap/BenW/17891.033.866.ibm.fsp.core.service.startup-8.0.1.81/ /tmp/feb_expanded/builder.war/WEB-INF/lib/ibm.fsp.core.service.startup-8.0.1.35.jar
# /opt/IBM/WebSphere/AppServer/bin/EARExpander.sh -ear ‘/tmp/feb_collapsed/Forms Experience Builder.ear’ -operationDir /tmp/feb_expanded/ -operation collapse
ADMA4007I: Collapsing the contents of directory /tmp/feb_expanded/ to enterprise archive (EAR) file /tmp/feb_collapsed/Forms Experience Builder.ear.
Update the current application using the ISC pointing to /tmp/feb_collapsed/Forms Experience Builder.ear and selecting the default values.

What I found in the SystemOut.log after a period of time was a different error which in the UI was not allowing me to create new surveys but I could complete existing ones which was slightly different to what I was seeing when I raised the PMR. The exception was

[7/11/16 10:34:12:359 BST] 00001b1e StandardExcep E com.ibm.form.nitro.platform.StandardExceptionMapper toResponse ac7d3dec-57f7-482f-83e5-9eaf77c82cbb
java.lang.RuntimeException: Error reading from /tmp/ibm.fsp.temp.1466513524000/fspjars, isDirectory = false, exists = false, canRead = false
at com.ibm.form.platform.service.startup.IsolatingClassLoader.getFileList(IsolatingClassLoader.java:1577)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.access$100(IsolatingClassLoader.java:47)…………..

I created /tmp/ibm.fsp.temp.1466513524000/fspjars and some functionality returned but it wasn’t until I restarted the JVM that it started to work properly.

IBM told me that the problem here is that the /tmp/ directory is getting cleared out and removing the aforementioned directory causing a problem for FEB.

After a bit of Googling I found that tmpwatch was clearing out files/directories that haven’t been edited for 10 days. To stop this I added the bold text.

# vi /etc/cron.daily/tmpwatch
#! /bin/sh
flags=-umc
/usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.XIM-unix \
        -x /tmp/.font-unix -x /tmp/.ICE-unix -x /tmp/.Test-unix \
        -X '/tmp/hsperfdata_*' -X '/tmp/.hdb*lock' -X '/tmp/.sapstartsrv*.log' \
        -X '/tmp/ibm.fsp.*' -X '/tmp/pymp-*' 10d /tmp
/usr/sbin/tmpwatch "$flags" 30d /var/tmp
for d in /var/{cache/man,catman}/{cat?,X11R6/cat?,local/cat?}; do
    if [ -d "$d" ]; then
        /usr/sbin/tmpwatch "$flags" -f 30d "$d"
    fi
done

After a few weeks the problem hadn’t manifested again and IBM told me that the cause of the initial PMR was the /tmp directory being emptied. I was dubious at first but then found https://developer.ibm.com/answers/questions/219765/periodically-my-feb-server-stops-working-properly.html which describes problems due the /tmp directory being cleaned out.

As other stuff gets written to the /tmp directory which is what WAS will use by default I decided to use the java.io.tmpdir custom property to instruct WAS to use a directory under /opt/ where it won’t be cleaned by tmpwatch.

Fingers crossed this is the end of it.

↧

Sametime and NETWORK_SPRAYER_ADDRESS

October 4, 2016, 1:09 pm

≫ Next: SSL certificates and TLSv1.2 for Sametime (but also valid for WebSphere)

≪ Previous: End to Surveys problems in IBM Connections 5.0?

I am planning to move our internal servers to the latest Sametime servers which are using a different LDAP. I have constructed managed-community-configs.xml to redirect the clients to the new Community server sitting behind a DNS alias of sametime.acme.com whilst resetting the client and automatically signing them in to the new servers using Notes client single sign on but I kept having a problem when the client tried to log in to the new Community server.

I found that the client wouldn’t log in to the new server using the Notes single sign on method. This process sends the Notes ID to the Community server (or another Domino server). Domino then checks the Notes ID and then sends back an LtpaToken to the Notes client. The Notes client sends the same LtpaToken to the Community server which Sametime uses to authenticate the user.

This really frustrated me. I couldn’t work out why this was happening. I enabled Wireshark trace and compared a successful connection to another server with the failing one. I found that the packets were similar up until a point and then there was a FIN, ACK. This normally means one of the host terminates the connection which seemed odd.

I spoke with a colleague who is a goldmine of Domino knowledge and after a bit of experimentation we found that when I try opening sametime.acme.com in the Notes client (Ctrl + O) it failed whilst it worked when opening a database for the actual Domino server ie sametime004.acme.com worked.

When putting a trace on the client I got the following:

Using address 'x.x.x.x' for sametime.acme.com on TCPIP
Connected to the wrong server Sametime/Servers/ACME using address x.x.x.x
Using address 'sametime.acme.com' for sametime.acme.com on TCPIP
Unable to connect to sametime.acme.com on TCPIP (The server is not
responding. The server may be down or you may be experiencing network
problems. Contact your system administrator if this problem persists.)

The server document does not have sametime.acme.com listed but sametime004.acme.com since that is the name of the Domino server. I changed the name of the fqhn on the basics tab to sametime.acme.com and restarted the server. Now, I could access a database using sametime.acme.com.

Going back to the Wireshark trace it looks like the FIN, ACK was because the Notes client was stopped from connecting to the Domino server due to the different names.

My colleague then came up with NETWORK_SPRAYER_ADDRESS. This notes.ini value is described here.

When a notes client connects to a Domino server part of the protocol
exchange includes the notes client telling the server what it thinks
the server's name is.If the names do not match, the connection is
terminated. This mechanism is part of the code which supports partitioned
servers running on the same IP address. However, because of this
algorithm, we cannot use network sprayers in front of Domino servers.
When a Notes client uses a Network Sprayer address as a Domino server
address, the network sprayer may make the final connection to any of
the Domino servers behind it. If the name supplied by the client is not
the Domino server name of the selected server, the connection will be
broken. This fix provides a mechanism to skip the server name checking
to allow this configuration to work.

I stopped Domino, added NETWORK_SPRAYER_ADDRESS=* and then started Domino. On testing I could open a database using sametime.acme.com and sametime004.acme.com.

When testing managed-community-configs.xml my Notes client was signed in fine to the new Community server!

The crux is that the problem was because I was using a DNS alias to connect to Domino which didn’t match the actual Domino server name. Sametime doesn’t care normally but the Notes client obviously does. Using NETWORK_SPRAYER_ADDRESS tells Domino not to care and to allow the client to connect.

↧

SSL certificates and TLSv1.2 for Sametime (but also valid for WebSphere)

October 13, 2016, 7:24 am

≫ Next: Sametime photos served up by IHS

≪ Previous: Sametime and NETWORK_SPRAYER_ADDRESS

I thought I’d write this entry after assisting a peer and struggling myself to work out why TLSv1.2 was not working for a given node.

I will detail how to add a wildcard certificate to a Sametime 9.0.1 cell and then how to enforce TLSv1.2 for Sametime Proxy and Meeting server nodes.

Import the SSL certificate

There are various ways to go about this but I will detail using a .p12 file (pcks#12 format). The nice thing about getting a .p12 file is that all the certificates should be in there, all intermediary and the root protected by a password.

There are ways to create .p12 files using openSSL and Google is awash with posts so I won’t go into any more detail.

You will want to export the intermediary and root certificates. You can view the contents of the .p12 using openSSL. I am running Cygwin on a Windows laptop hence the .exe.

openssl.exe pkcs12 -in ./wild_acme_com.p12 -info

This will allow you to copy and paste the intermediary and root certificates which are needed. Again there are commands to export the certificates are available from Google or you could down load them from the Certificate Authority (CA).

Once you have your .p12 and intermediary and root certificates log into the ISC and go to SSL certificate and key management > Key stores and certificates > CellDefaultKeyStore > Personal certificates.

Click Add and add the intermediary and root certificates.

Now go to SSL certificate and key management > Key stores and certificates > CellDefaultKeyStore > Personal certificates > Import and click on key store file.

Point it to your .p12 and enter the password. It will then read the contents and give you a ridiculous name for an alias. I suggest you enter something meaningful. Then press apply.

At which point you will see the chain in SSL certificate and key management > Key stores and certificates > CellDefaultKeyStore > Personal certificates which should look something like this

You can see the chain is complete. This is important otherwise web browsers will show various types of untrusted errors.

If you haven’t done this already you will need to apply the certificate to the nodes that need it.

Go to SSL certificate and key management > Manage endpoint security configurations.

From here you will need to expand the Inbound and Outbound sections for the STProxy and Meeting nodes. If you have a WebSphere proxy in front you will need to apply the certificate to that server. You can also add the certificate to the STProxy or Meeting application server too in case you have users connecting directly.

You need to tick Override inherited values and then press Update certificate alias list at which point in the Certificate alias in key store you should see the alias for the imported .p12. Remember to repeat for both Inbound and Outbound.

Now normally you would stop all application servers, WAS proxies, node agents and then the deployment manager and start them back up but because we are enabling TLSv1.2 we need to do a little more…..

TLSv1.2

If you try to enforce TLSv1.2 on a SIP Proxy Registrar then it will not work properly and you’ll get messages like the following when clients try to connect.

[10/12/16 10:37:24:483 BST] 0000008e TelephonyServ I   UserName in Message
is null
[10/12/16 10:37:31:278 BST] 000000ba SSLHandshakeE E   SSLC0008E: Unable
to initialize SSL connection.  Unauthorized access was denied or security
settings have expired.  Exception is javax.net.ssl.SSLHandshakeException:
Client requested protocol TLSv1 not enabled or not supported

This means that using SSL certificate and key management > Key stores and certificates > CellDefaultKeyStore to control the protocol will not work because it will apply to all application servers in the cell including the SIP Proxy Registrar.

If you have awareness and meetings only then you can get away with it, although you need to take special care with recording of meetings because that will not work if you enforce TLSv1.2. In this case you may need to run the following to add the TLS configuration for recording.

"INSERT INTO mtg.configuration (server_id, CONFIGURATION_KEY,
CONFIGURATION_VALUE) values ('<substitue your server id here>',
'meeting.recording.tlsVersion','TLSv1.2')"

Limitations

Before I go on I will explain what limitation I found. If I enforce TLSv1.2 on the Meeting server I cannot connect to it using a Sametime (thick) client. Web browser and mobile apps work fine. In the thick client it will not connect and I get errors in the client logs.

The default in QoP is SSL_TLS which enables all SSL V3.0 and TLS 1.0 protocols. This is not terribly useful considering I want to use TLSv1.2 but cannot enforce it across all the cell. You can use SSL_TLSv2 which enables all SSL V3.0 and TLS 1.0, 1.1 and 1.2 protocols so at least I have the option of using TLSv1.2 if the client uses that protocol.

So my steps involve some application servers using SSL_TLS, most using SSL_TLSv2 and the Sametime Proxy using TLSv1.2.

Remember I have WebSphere proxies fronting STProxy and Meeting servers to host HTTP -> HTTPS redirection and I will use them as the TLSv1.2 point.

Import p.12 to NodeDefaultKeyStore

So the steps are threefold, 1) add the .p12 certificate to the STProxy server node, 2) set the node to use the NodeDefaultKeyStore and 3) enforce TLSv1.2.

As I have run through the steps to import the certificate to the cell I do not need to run through that again. You need to go to SSL certificate and key management > Key stores and certificates > NodeDefaultKeyStore > Personal certificates/Signer certificates (choosing the node for STProxy) and repeat the steps above.

Now go back to SSL certificate and key management > Manage endpoint security configurations and go to the Inbound and Outbound sections. I made the change on the WebSphere proxy that fronts STProxy.

Change SSL configuration NodeDefaultSSLSettings click update certificate alias list at which point in the Certificate alias in key store you can select the alias you set. Repeat as required.

It will then look something like this. Only was_stpProxy is using the NodeDefaultSSLSettings, all others are using the default, CellDefaultSSLSettings.

The reason why you have done this is important in the next section.

Enforce TLSv1.2

I suggest you stop all the application servers, WebSphere proxies, node agents at this point.

Now you need to enforce TLSv1.2 at the node level. Go to SSL certificate and key management > SSL configurations > NodeDefaultSSLSettings (for STProxy) > Quality of protection (QoP) settings and change Protocol from SSL_TLS to TLSv1.2.

Go to SSL certificate and key management > SSL configurations and for all the other nodes including CellDefaultSSLSettings and XDADefaultSSLSettings set the Protocol to be SSL_TLSv2 including the SIP Proxy Registrar.

On all the nodes find the ssl.client.props file which is somewhere like /opt/IBM/WebSphere/AppServer/profiles/hostSTPPNProfile1/properties/ssl.client.props on Linux.

Ensure this is set as the following default value

com.ibm.ssl.protocol=SSL_TLSv2

This file instructs the client (the node agent) what protocol to communicate with the deployment manager using. As you have set this protocol in QoP for the cell, all nodes (apart from STProxy) and XDADefaultSSLSettings then all node agents can talk freely to the deployment manager.

If you miss a step here you’ll see from the deployment manager’s SystemOut.log that the node agent seems to stop and then start repeatedly. This is because the node agent cannot communicate properly, mainly because you have not changed XDADefaultSSLSettings appropriately.

Stop and start the deployment manager, run syncNode on all nodes and start the node agents, application servers and proxies and test. Check the SystemOut.log for any exceptions and if you see them check your configuration.

Ciphers

If you run a test against your STProxy or Meeting servers you’ll get marked down for the weak ciphers.

You can remove these from SSL certificate and key management > SSL configurations > NodeDefaultSSLSettings > Quality of protection (QoP) settings > Cipher suite settings. You will need to change from Strong to custom and then remove the ciphers listed above, if you so wish.

If you plan to do this for the Meeting server as well as STProxy then you will need to change the Inbound and Outbound options for the WebSphere proxy in front of Meetings so that it uses the NodedefaultSSLSettings which allows you to then use a default set of ciphers.

Finally

I have created a PMR to ask IBM about their support for TLSv1.2 in Sametime. I’ll update things once I get a response.

↧

Sametime photos served up by IHS

October 21, 2016, 3:31 pm

≫ Next: HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections

≪ Previous: SSL certificates and TLSv1.2 for Sametime (but also valid for WebSphere)

Between customer work I have been working on replacing our internal Sametime servers with shiny new 9.0.1 servers using AD instead of Domino LDAP.

The final piece of the puzzle is photos. Anyone who knows Sametime knows that something as simple as a photo is not made simple by the applications. The Sametime Proxy requires an LDAP attribute (PhotoURL) to be used which points STProxy to the image retrieving it for the client. Meetings doesn’t use the same approach, grr. It can use a binary object saved in LDAP or offload the retrieval to a web server like PhotoURL for STProxy but uses a “string” where all photos must be named joe.bloggs@acme.com.jpg. Confusing? Yep.

I was about to roll over and say it’s not possible but it seems that it is possible to cover all use cases.

Notes/Sametime clients using ImagePath URL
STProxy web client using PhotoURL
Meetings off loading to a web server
Stop external access to photos

The nice thing STProxy does is that it will “proxy” the photos so the web browser doesn’t need direct connectivity to the jpgs. That is great because I can put the photos on an internal facing web server. The STProxy then calls the URL specified in the user’s LDAP entry (PhotoURL), caches it locally and then serves it up. Brilliant, I can lock the photos away so that no one can browse them from the internet if they know our email addresses.

You’ll need to update stproxyconfig.xml adding proxyServerURL otherwise it will not work. Don’t forget to sync and restart STProxy.

<photoCache>
<enabled>true</enabled>
<cacheExpiry>60</cacheExpiry>
<storageLocation>/opt/IBM/phototemp</storageLocation>
<proxyServerURL>https://chat.acme.com</proxyServerURL>
</photoCache>

Ah, the Meeting server doesn’t follow the same logic. Clients (thick or web browser or mobile) need direct access to the photo to render it in the client. This means I’m back to square one….

Let’s jump back a step. How do we get the photos up to a web server?

Photos from Connections

At present our Sametime and Connections servers are using different LDAPs so SSO is not possible and even if it was retrieving photos from Connections via photo.do is not possible for guests because the photos require authentication so using the Connections business card for STProxy and Meetings is a show stopper.

Luckily in the Connections TDISOL there is an AL we can use called dump_photos_to_files. I won’t go into too much details about this but you can copy and paste the AL and then alter it. I altered it to return all user’s email addresses as well as UID and then dump the photos in the format of emailaddress.jpg which is the format needed by the Meeting server.

You may find the email addresses are capitalised. If so you will need to add some JavaScript to the lookup_user process to get it all in lower case

ret.value=conn.getstring(“email”).toLowerCase();

Once you have the photos in the correct format you need to get them from the server running TDI to a web server.

Web server

The logical way to serve the photos is using IHS in front of Connections. To get the files there I needed to scp them from the TDI server to IHS. I had to create ssh-keygens detailed in http://www.linuxproblem.org/art_9.html so I could scp the files wrapped in a shell script. Incidentally , the shell script called the AL and then scp’d the photos to the IHS server. Then add the shell script to cron so it is called on a schedule.

I wanted to lock down access to the photos so that people couldn’t browse to them. This is a little difficult to do but you can use IP ranges for all your internal offices and/or VPNs so that they are allowed to access the photos. The problem is guests who are truly external.

I created a new virtual host in httpd.conf with the following details.

# Sametime photos
<VirtualHost IP_of_IHS:80>
Include /opt/IBM/HTTPServer/conf/rewrite.conf
ServerName photos.acme.com:80
DocumentRoot “/opt/IBM/HTTPServer/photos”
<Location “/photos/”>
Order deny,allow
Deny from all
# Old subnets and staff VPN
Allow from x.x.x.x/x
# Office 1
Allow from x.x.x.x/x
# India
Allow from x.x.x.x/x
# Servers
Allow from x.x.x.x
Allow from x.x.x.x
Allow from x.x.x.x
</Location>
RewriteEngine On
RewriteCond %{HTTP_HOST} !^photos.acme.com$ [NC]
RewriteCond %{HTTP_COOKIE} !LtpaToken2
RewriteCond %{HTTP_COOKIE} !LtpaToken
#RewriteCond %{HTTP_COOKIE} !STPluginActivePage
RewriteCond %{REMOTE_ADDR} !^x\.x\.x\.x$
RewriteCond %{REMOTE_ADDR} !^x\.x\.x\.x$
RewriteCond %{REMOTE_ADDR} !^x\.x\.x\.x$
RewriteCond expr “-R ‘x.x.x.x/x'”
RewriteCond expr “-R ‘x.x.x.x/x'”
RewriteCond expr “-R ‘x.x.x.x/x'”
RewriteRule ^(.*)$ http://www.acme.com [R,L]
RewriteLog logs/rewrite_log
#RewriteLogLevel 9
</VirtualHost>

In a nutshell this allows all clients on certain IP range s to access photos. It also allows any web browser whether it is internal or on the internet to access photos IF it has either one of two cookies, LtpaToken/LtpaToken2 which is provided to the browser when someone authenticates ie an internal user or the cookie STPluginActivePage which the browser stores when you log into a meeting room. STPluginActivePage is in the browser whether you are a guest or an authenticated user, you just need to enter a meeting room.

You need to include LtpaToken here otherwise the Sametime client will not be able to access photos. It seems that the client doesn’t get an LtpaToken2 when it authenticates with the Community server. This may be due to the configuration of the web SSO configuration document which allows both versions. You can configure it to just allow the version two token in which case I’d assume that you’d get LtpaToken2 returned to the client.

If you are a web browser outside of the IP ranges and you do not have either of the two cookies then you will be redirected to http://www.acme.com. You could change this to a static html page of your liking.

I’m no whiz when it comes to Apache but I have tested this quite a bit and it seems pretty secure and should cover most bases. Of course it doesn’t stop a guest from guessing email addresses and browsing other people’s photos but since you have invited them to a meeting, provided them with the meeting room password there is an element of familiarity that should stop them from being malicious in this way. If you back this up with changing the meeting room passwords often you should be in a strong position to keep these photos relatively secure.

If anyone has any thoughts on the httpd.conf I am all ears as I would like to tie it down further if it needs it.

↧

HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections

November 18, 2016, 11:19 pm

≫ Next: HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections – proper fix

≪ Previous: Sametime photos served up by IHS

I was checking things after migrating IBM Connections from version 4.0 to 5.5 and found the following error in the application server hosting Search. It didn’t stop the search index and returning results.

[11/18/16 18:46:00:604 GMT] 000001ba XmlBeanDefini I org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions Loading XML bean definitions from class path resource [org/springframework/jdbc/support/sql-error-codes.xml]
[11/18/16 18:46:00:627 GMT] 000001ba SQLErrorCodes I org.springframework.jdbc.support.SQLErrorCodesFactory <init> SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase]
[11/18/16 18:46:00:645 GMT] 000001ba IndexingTaskB W com.ibm.connections.search.ejbs.indexing.IndexingTaskBean processTask CLFRW0395E: An error occurred while running the scheduled indexing task named 15min-search-indexing-task.
com.ibm.connections.search.admin.index.exception.IndexingTaskException: org.springframework.jdbc.UncategorizedSQLException: SqlMapClient operation; uncategorized SQLException for SQL []; SQL state [null]; error code [0]; Error: executeQueryForObject returned too many results.; nested exception is java.sql.SQLException: Error: executeQueryForObject returned too many results.

I Googled “returned too many results” and it hinted at duplicate data in databases for different IBM products. Hmmm.

I enabled the following trace and ran a one of indexing task, SearchService.indexNow(“all_configured”)

com.ibm.connections.search.index.indexing.*=all: com.ibm.connections.search.seedlist.*=all: com.ibm.connections.httpClient.*=all

In trace.log I saw more information and just prior to the database exception I saw resume token messages

[11/18/16 18:46:00:580 GMT] 000001ba ResumeTokenIn > com.ibm.connections.search.seedlist.crawler.util.ResumeTokenInterpreter getInitialResumeToken ENTRY wikis
[11/18/16 18:46:00:580 GMT] 000001ba ResumeTokenIn > com.ibm.connections.search.seedlist.crawler.util.ResumeTokenInterpreter resumeTokenFromDate ENTRY Thu Jan 01 01:00:00 GMT 1970 wikis
[11/18/16 18:46:00:580 GMT] 000001ba ResumeTokenIn < com.ibm.connections.search.seedlist.crawler.util.ResumeTokenInterpreter resumeTokenFromDate RETURN AAAAAAAAAAA=
[11/18/16 18:46:00:580 GMT] 000001ba ResumeTokenIn < com.ibm.connections.search.seedlist.crawler.util.ResumeTokenInterpreter getInitialResumeToken RETURN AAAAAAAAAAA=

Resume tokens and references to duplicate data in the database, hmmm. Well HOMEPAGE has the SR_RESUME_TOKENS table. I opened it in dbVisualizer and saw this.

It didn’t look right and compared it with other deployments and found that others only have the one row per application. The knowledge center details how to manipulate them but not clear them.

I shut down all application servers and backed up HOMEPAGE database. I then cleared the table

# su – db2inst1
$ cd /opt2/db2backups/55_homepage_resumetokens/homepage/
$ db2 backup db homepage to ‘/opt2/db2backups/55_homepage_resumetokens/homepage/’
$ db2 connect to homepage
$ db2 “DELETE FROM HOMEPAGE.SR_RESUME_TOKENS WHERE NODE_ID = ‘*****Node01:InfraCluster_server1′”
$ db2 connect reset

On startup the errors have gone and there is only one row per application.

↧

HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections – proper fix

December 6, 2016, 7:12 am

≫ Next: IBM Connections Mail not working due to Domino view oddness

≪ Previous: HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections

I wrote a post, HOMEPAGE.SR_RESUME_TOKENS duplicate data in IBM Connections, where I work around the problem by clearing the contents of SR_RESUME_TOKENS. I found that every restart of the JVM hosting Search caused more rows to be added to the table. I raised a PMR and IBM came back and told me that others have raised the same problem and it is due to the fact that constraints are missing. The missing constraints should have been added during the “post” migration process to reapply the constraints after using dbt.jar.

My constraints looked like this:

Whilst they should have looked like this:

I stopped the JVM hosting Search and ran the following DB2 queries

db2 “DELETE FROM HOMEPAGE.SR_RESUME_TOKENS WHERE NODE_ID = ‘xxxxxNode01:InfraCluster_server1′”
db2 “ALTER TABLE HOMEPAGE.SR_RESUME_TOKENS ADD CONSTRAINT “PK_TOKEN_ID” PRIMARY KEY (“TOKEN_ID”)”
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
db2 “ALTER TABLE HOMEPAGE.SR_RESUME_TOKENS ADD CONSTRAINT “FK_RT_IDX_MGMT_ID” FOREIGN KEY (“NODE_ID”) REFERENCES HOMEPAGE.SR_INDEX_MANAGEMENT(“NODE_ID”) ON DELETE CASCADE”
DB20000I The SQL command completed successfully.
db2 “RUNSTATS ON TABLE HOMEPAGE.SR_RESUME_TOKENS”
DB20000I The RUNSTATS command completed successfully.
db2 “RUNSTATS ON TABLE HOMEPAGE.SR_RESUME_TOKENS FOR INDEXES ALL”
DB20000I The RUNSTATS command completed successfully.

On restarting the Search JVM a number of times I found that only one row was created for each application and not multiple as I found previously.

Thanks IBM

↧