Issues starting ZSS in Zowe V2 #600
Comments
Is this a small test system? Does this system have much other USS/OMVS workload on it? Sorry that there aren't a lot of specifics to 'guess' on here. But maybe there is a memory resource limit issue? Can we get a dump of environment variables? |
Hello! Yep, no problem - it is a small system for sure, though there should be room to adjust if there is a resource shortage of some kind. The workload under USS is certainly increasing, though the OMVS system wide limits are still holding up I'd say. If anything our LE heap/stack values might be set a bit low (I believe lower than default values at least). I've attached our environment vars. Many thanks! |
Hello @Dingmans , could you set You'd need to add the following under
I suspect that the heap is running out and this option will force the launcher to start every component in a separate address space making more room for the heap of zss. If this helps, the HEAP LE options should be adjusted, so that all the components start properly in a single address space. |
Hello @ifakhrutdinov! I tried as you suggested, and observed the same issue with ZSS as described initially. Our LE HEAP is defaulting on 64-bit, and is increased somewhat for 31-bit. HEAP64=((1M,1M,KEEP,32K,32K,KEEP,4K,4K,FREE),OVR) Br, |
In the STC joblog I see a few debug statements, for example: ZWESVUSR DEBUG (zwe-internal-start-prepare,configure_components) export _CEE_RUNOPTS="XPLINK(ON),HEAPPOOLS(ON)" I think it is weird that we get HEAPPOOLS(ON), as I read somewhere it should be HEAPPOOLS(OFF), which is what is specified in the proc from zowe samplib, and I haven't changed that. I can't find anywhere where I have specified HEAPPOOLS(ON): not in my own env profile, not in the system-wide profile for all users under USS, and not in any configuration file for Zowe, as far as I can find/remember. In CEEPRM HEAPPOOL defaults to OFF. I don't know if HEAPPOOLS is related to this issue, but the problem is that any attempt from my side to add stack/heap adjustments to _CEE_RUNOPTS via the STDENV DD does not seem to be picked up. Any idea where those export statements are coming from? //David |
@Dingmans could you attach the latest log? |
Here it is: If you notice the JVMJ9VM015W messages, we probably triggered those by adding the MEMLIM parameter to the PROC in this particular startup. To see if it would pick that up at least, we set it to 500M, which was too small. Otherwise it looks much the same as any other initialization. |
@Dingmans , thanks, it still has |
Yes, sorry - I disabled shareAs=no after the initial test as it didn't help with zss, and it seemed to bring some noticeable overhead. Here is another joblog with shareAs=no |
In the V2 log I can see multiple out-of-memory exceptions in Java, and they're absent in the V3 log. So some memory related issues are definitely present. I don't see why zss terminated in the V3 log. Could you attach it here too? I'm wondering if it's the same ABEND this time. I think it should be |
I might be causing some confusion here. (I added that at the suggestion of a colleague, but I never had any faith in it as a solution, so to speak; I shouldn't have sent that specific log.) ZWE2SZ abends more or less the same second it is started, so I suspect that it never reaches a point where it can write anything. A colleague of mine who can read dumps checked the CEE dump produced by these abends, and he points to the stack being the issue. So I think a good next step is to try to increase the stack. Zowe documentation also recommends a HEAP64 size of at least (4M,4M.... and I think we currently default to (1M,1M.. Which is why I need to find what is overriding any changes made to _CEE_RUNOPTS that are specified in ZWESLSTC on the STDENV DD. I think it should be fine to specify new stack & heap parameters there? |
I don't think the values from STDENV will be inherited by the components started by the launcher process. STDENV will be used by the launcher only. You can try adding custom runtime options to the Regarding HEAP64, as far as I know zss is still a 31-bit application, so it shouldn't affect it. I'm going to have a closer look at the CEE dump... |
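One way to check which run-time options a process actually inherits is to print `_CEE_RUNOPTS` from inside that process. The following is a minimal sketch, not part of zss or the launcher: the helper names `runoptsContain` and `printRunopts` are invented for illustration, and it assumes the variable is still visible through the standard `getenv` call after LE has started.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the run-time options string contains
 * the given option text (plain substring match), otherwise 0.
 * A NULL argument is treated as "not present". */
int runoptsContain(const char *runopts, const char *option) {
    if (runopts == NULL || option == NULL) {
        return 0;
    }
    return strstr(runopts, option) != NULL;
}

/* Hypothetical helper: prints the _CEE_RUNOPTS value seen by the
 * current process, or a placeholder when the variable is not set. */
void printRunopts(void) {
    const char *runopts = getenv("_CEE_RUNOPTS");
    printf("_CEE_RUNOPTS=%s\n", runopts != NULL ? runopts : "(not set)");
}
```

Calling `printRunopts()` only reports the environment of the process that runs it, so it would have to run in the same way as the component being investigated to say anything about that component.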
Ok, looks like zss reaches some code:
Just to make sure we have all the pieces, could you attach the zss log? I don't think it's been attached in this ticket before. |
It was requested in the call where this was discussed that perhaps debugging should be turned on. ZLDEBUG=ON |
I would add the zss log, but it's just empty, so there hasn't been any point so far. I started zowe with ZLDEBUG=ON. Attaching the new joblog: |
@Dingmans Thank you. @JoeNemo the latest log has some JSON&configmgr-related debug messages. Can you please have a look? |
I see some bad characters in the log. So, I am still a little suspicious of character set issues, but not sure if that is causing the bug. The messages are "normal" about JSON Schema Validation , that is the validation goes through all of the JSON without finding anything bad. It's going to be a very long diagnosis without reproducing this bug. Or seeing the memory for the embedded expression which fails to evaluate. I think we should follow what we were discussing about getting the trace on the code in embeddedjs.c |
Greetings, Finally made some progress by setting components.zss.agent.64bit to true. Now zss is started and the zss log is written. It looks like it came up fine. ZWES1014I ZIS status - 'Ok' (name='ZWESIS_LAB ', cmsRC='0', description='Ok', clientVersion='2') I guess this would only be a problem if we want to run some plugin that requires zss to run in 31-bit amode. Not sure what to make of the root cause still; somehow amode 31 is an issue? |
@Dingmans great news! This is either related to below-the-bar storage, which, if constrained, affects 31-bit zss since it uses that storage for stack/heap, or there is just a bug, i.e. something in the init code doesn't play nice.
Yes, if a plug-in isn't built for 64-bit, it won't work. Please keep this issue open for now if possible; we're looking at this. |
Yep, I'm interested in any findings also, so no problem keeping this open. Just ping if you need any more info from my side. Regards, |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but can be reopened if needed. Thank you for your contributions. |
This issue has been automatically closed due to lack of activity. If this issue is still valid and important to you, it can be reopened. |
When an ABEND occurs and there is a user-defined ESTAEX in an LE application, the language environment must be notified via a call to CEE3ERP; that way LE has a chance to handle things like hitting a stack guard page. If we don't call CEE3ERP, things can go terribly wrong.

At some point, the ZSS 31-bit build was changed to use XPLINK and the CEE3ERP call in the recovery facility was erroneously limited to non-XPLINK 31-bit LE environments. This commit changes the code to call the CEE3ERP routine in XPLINK 31-bit LE applications.

Fixes:
* zowe/zss#600
* zowe/zss#736

Signed-off-by: Irek Fakhrutdinov <[email protected]>
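The change in condition described in this commit message can be sketched roughly as below. This is a hypothetical model and not the actual zss recovery code: `Env31`, `notifiesLEBeforeFix`, `notifiesLEAfterFix`, and `checkXplinkCase` are names invented here, no real CEE3ERP call is made, and only 31-bit environments are modelled because the commit message does not describe the 64-bit path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a 31-bit build; not a zss type. */
typedef struct {
    bool isLE;      /* application runs under Language Environment */
    bool isXplink;  /* application was built with XPLINK */
} Env31;

/* Behaviour described as erroneous: LE was notified after an ABEND
 * only in non-XPLINK 31-bit LE environments. */
bool notifiesLEBeforeFix(Env31 env) {
    return env.isLE && !env.isXplink;
}

/* Behaviour described by the commit: XPLINK 31-bit LE applications
 * notify LE as well. */
bool notifiesLEAfterFix(Env31 env) {
    return env.isLE;
}

/* Usage: the XPLINK 31-bit case is the one whose outcome changes. */
void checkXplinkCase(void) {
    Env31 xplink31 = { .isLE = true, .isXplink = true };
    assert(!notifiesLEBeforeFix(xplink31));
    assert(notifiesLEAfterFix(xplink31));
}
```

In this model a non-XPLINK 31-bit LE build is unaffected by the fix, while an XPLINK 31-bit build goes from skipping the notification to performing it, which matches the case the commit message singles out.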
Greetings,
I've installed zowe V2.8 and most components come up fine. But ZSS terminates more or less instantly and goes into a restart loop. It seems like ZSS ends before anything is written to the zssServer log, as it has been empty so far.
What I can see in joblog:
What I can see on the syslog:
I'm not sure what U4088 REASON=00000075 means, but it seems storage/pointer related.
I will provide the CEE dump, as well as our zowe.yaml and a joblog.
For the CEE dump I used binary FTP to fetch it from z; I will provide DCB information at the bottom in case you need to pre-allocate a target dataset.
joblog-ceedmp-yaml.zip
Any help would be appreciated.
CEEDUMP DCB:
Many thanks,
David