Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] UpdateNOC followed by MRP timeout leads to VerifyOrDie #23093

Closed
msandstedt opened this issue Oct 10, 2022 · 6 comments · Fixed by #23097
Closed

[BUG] UpdateNOC followed by MRP timeout leads to VerifyOrDie #23093

msandstedt opened this issue Oct 10, 2022 · 6 comments · Fixed by #23097

Comments

@msandstedt
Copy link
Contributor

msandstedt commented Oct 10, 2022

Reproduction steps

From the log:

[1665412413.509331][7:7] CHIP:DMG: <RE> Sending report (payload has 234 bytes)...
[1665412413.509335][7:7] CHIP:EM: Piggybacking Ack for MessageCounter:11273567 on exchange: 27813r
[1665412413.509340][7:7] CHIP:IN: Prepared secure message 0x5600075952b8 to 0x61EB9EBEBDC5D118 (1)  of type 0x5 and protocolId (0, 1) on exchange 27813r with MessageCounter:57784982.
[1665412413.509342][7:7] CHIP:IN: Sending encrypted msg 0x5600075952b8 with MessageCounter:57784982 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FA6CA msec
[1665412413.509360][7:7] CHIP:DMG: <RE> OnReportConfirm: NumReports = 1
[1665412413.509361][7:7] CHIP:DMG: <RE> ReportsInFlight = 1 with readHandler 1, RE has no more messages
[1665412413.509362][7:7] CHIP:DMG: IM RH moving to [AwaitingDestruction]
[1665412413.509364][7:7] CHIP:DMG: All ReadHandler-s are clean, clear GlobalDirtySet
[1665412413.509566][7:7] CHIP:EM: Received message of type 0x10 with protocolId (0, 0) and MessageCounter:11273568 on exchange 27813r
[1665412413.509567][7:7] CHIP:EM: Found matching exchange: 27813r, Delegate: (nil)
[1665412413.509569][7:7] CHIP:EM: Rxd Ack; Removing MessageCounter:57784982 from Retrans Table on exchange 27813r
[1665412413.509570][7:7] CHIP:EM: Removed CHIP MessageCounter:57784982 from RetransTable on exchange 27813r
[1665412413.514097][7:7] CHIP:EM: Received message of type 0x8 with protocolId (0, 1) and MessageCounter:11273569 on exchange 27824r
[1665412413.514102][7:7] CHIP:EM: Handling via exchange: 27824r, Delegate: 0x560006f95a08
[1665412413.514108][7:7] CHIP:DMG: InvokeRequestMessage =
[1665412413.514109][7:7] CHIP:DMG: {
[1665412413.514110][7:7] CHIP:DMG: 	suppressResponse = false, 
[1665412413.514112][7:7] CHIP:DMG: 	timedRequest = false, 
[1665412413.514112][7:7] CHIP:DMG: 	InvokeRequests =
[1665412413.514115][7:7] CHIP:DMG: 	[
[1665412413.514115][7:7] CHIP:DMG: 		CommandDataIB =
[1665412413.514117][7:7] CHIP:DMG: 		{
[1665412413.514118][7:7] CHIP:DMG: 			CommandPathIB =
[1665412413.514119][7:7] CHIP:DMG: 			{
[1665412413.514120][7:7] CHIP:DMG: 				EndpointId = 0x0,
[1665412413.514121][7:7] CHIP:DMG: 				ClusterId = 0x30,
[1665412413.514122][7:7] CHIP:DMG: 				CommandId = 0x0,
[1665412413.514123][7:7] CHIP:DMG: 			},
[1665412413.514124][7:7] CHIP:DMG: 			
[1665412413.514125][7:7] CHIP:DMG: 			CommandFields = 
[1665412413.514126][7:7] CHIP:DMG: 			{
[1665412413.514127][7:7] CHIP:DMG: 				0x0 = 900, 
[1665412413.514128][7:7] CHIP:DMG: 				0x1 = 0, 
[1665412413.514129][7:7] CHIP:DMG: 			},
[1665412413.514130][7:7] CHIP:DMG: 		},
[1665412413.514132][7:7] CHIP:DMG: 		
[1665412413.514133][7:7] CHIP:DMG: 	],
[1665412413.514135][7:7] CHIP:DMG: 	
[1665412413.514136][7:7] CHIP:DMG: 	InteractionModelRevision = 1
[1665412413.514137][7:7] CHIP:DMG: },
[1665412413.514144][7:7] CHIP:DMG: AccessControl: checking f=1 a=c s=0x61EB9EBEBDC5D118 t=0x00000001 c=0x0000_0030 e=0 p=a
[1665412413.514146][7:7] CHIP:DMG: AccessControl: allowed
[1665412413.514148][7:7] CHIP:DMG: Received command for Endpoint=0 Cluster=0x0000_0030 Command=0x0000_0000
[1665412413.514150][7:7] CHIP:FS: GeneralCommissioning: Received ArmFailSafe (900s)
[1665412413.514153][7:7] CHIP:DMG: Endpoint 0, Cluster 0x0000_0030 update version to 154c35a5
[1665412413.514156][7:7] CHIP:DMG: Command handler moving to [ Preparing]
[1665412413.514158][7:7] CHIP:DMG: Command handler moving to [AddingComm]
[1665412413.514160][7:7] CHIP:DMG: Command handler moving to [AddedComma]
[1665412413.514164][7:7] CHIP:DMG: Decreasing reference count for CommandHandler, remaining 0
[1665412413.514166][7:7] CHIP:EM: Piggybacking Ack for MessageCounter:11273569 on exchange: 27824r
[1665412413.514170][7:7] CHIP:IN: Prepared secure message 0x56000758d178 to 0x61EB9EBEBDC5D118 (1)  of type 0x9 and protocolId (0, 1) on exchange 27824r with MessageCounter:57784983.
[1665412413.514172][7:7] CHIP:IN: Sending encrypted msg 0x56000758d178 with MessageCounter:57784983 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FA6CF msec
[1665412413.514187][7:7] CHIP:DMG: Command handler moving to [CommandSen]
[1665412413.514188][7:7] CHIP:DMG: Command handler moving to [AwaitingDe]
[1665412413.514294][7:7] CHIP:EM: Received message of type 0x10 with protocolId (0, 0) and MessageCounter:11273570 on exchange 27824r
[1665412413.514308][7:7] CHIP:EM: Found matching exchange: 27824r, Delegate: (nil)
[1665412413.514313][7:7] CHIP:EM: Rxd Ack; Removing MessageCounter:57784983 from Retrans Table on exchange 27824r
[1665412413.514314][7:7] CHIP:EM: Removed CHIP MessageCounter:57784983 from RetransTable on exchange 27824r
[1665412413.517064][7:7] CHIP:EM: Received message of type 0x8 with protocolId (0, 1) and MessageCounter:11273571 on exchange 27833r
[1665412413.517071][7:7] CHIP:EM: Handling via exchange: 27833r, Delegate: 0x560006f95a08
[1665412413.517080][7:7] CHIP:DMG: InvokeRequestMessage =
[1665412413.517082][7:7] CHIP:DMG: {
[1665412413.517084][7:7] CHIP:DMG: 	suppressResponse = false, 
[1665412413.517085][7:7] CHIP:DMG: 	timedRequest = false, 
[1665412413.517086][7:7] CHIP:DMG: 	InvokeRequests =
[1665412413.517089][7:7] CHIP:DMG: 	[
[1665412413.517090][7:7] CHIP:DMG: 		CommandDataIB =
[1665412413.517092][7:7] CHIP:DMG: 		{
[1665412413.517093][7:7] CHIP:DMG: 			CommandPathIB =
[1665412413.517095][7:7] CHIP:DMG: 			{
[1665412413.517096][7:7] CHIP:DMG: 				EndpointId = 0x0,
[1665412413.517098][7:7] CHIP:DMG: 				ClusterId = 0x3e,
[1665412413.517099][7:7] CHIP:DMG: 				CommandId = 0x4,
[1665412413.517100][7:7] CHIP:DMG: 			},
[1665412413.517102][7:7] CHIP:DMG: 			
[1665412413.517104][7:7] CHIP:DMG: 			CommandFields = 
[1665412413.517105][7:7] CHIP:DMG: 			{
[1665412413.517107][7:7] CHIP:DMG: 				0x0 = [
[1665412413.517111][7:7] CHIP:DMG: 						0xdd, 0x57, 0x2f, 0xb9, 0x6c, 0xa8, 0xc1, 0xc6, 0xb9, 0x58, 0x71, 0xce, 0xce, 0xc2, 0x5b, 0x1, 0xf0, 0x32, 0xee, 0x4b, 0xe3, 0xc9, 0x4b, 0x2d, 0x43, 0xd5, 0xb2, 0xe6, 0xde, 0xa6, 0x62, 0x88, 
[1665412413.517112][7:7] CHIP:DMG: 				] (32 bytes)
[1665412413.517114][7:7] CHIP:DMG: 				0x1 = true, 
[1665412413.517115][7:7] CHIP:DMG: 			},
[1665412413.517116][7:7] CHIP:DMG: 		},
[1665412413.517119][7:7] CHIP:DMG: 		
[1665412413.517120][7:7] CHIP:DMG: 	],
[1665412413.517122][7:7] CHIP:DMG: 	
[1665412413.517124][7:7] CHIP:DMG: 	InteractionModelRevision = 1
[1665412413.517125][7:7] CHIP:DMG: },
[1665412413.517133][7:7] CHIP:DMG: AccessControl: checking f=1 a=c s=0x61EB9EBEBDC5D118 t=0x00000001 c=0x0000_003E e=0 p=a
[1665412413.517136][7:7] CHIP:DMG: AccessControl: allowed
[1665412413.517138][7:7] CHIP:DMG: Received command for Endpoint=0 Cluster=0x0000_003E Command=0x0000_0004
[1665412413.517141][7:7] CHIP:ZCL: OpCreds: Received a CSRRequest command
[1665412413.517142][7:7] CHIP:ZCL: OpCreds: Finding fabric with fabricIndex 0x1
[1665412413.517144][7:7] CHIP:EM: Sending Standalone Ack for MessageCounter:11273571 on exchange 27833r
[1665412413.517150][7:7] CHIP:IN: Prepared secure message 0x7ffe1520d0e0 to 0x61EB9EBEBDC5D118 (1)  of type 0x10 and protocolId (0, 0) on exchange 27833r with MessageCounter:57784984.
[1665412413.517153][7:7] CHIP:IN: Sending encrypted msg 0x7ffe1520d0e0 with MessageCounter:57784984 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FA6D2 msec
[1665412413.517162][7:7] CHIP:EM: Flushed pending ack for MessageCounter:11273571 on exchange 27833r
[1665412413.517398][7:7] CHIP:ZCL: OpCreds: AllocatePendingOperationalKey succeeded
[1665412413.517460][7:7] CHIP:ZCL: OpCreds: CSRRequest successful.
[1665412413.517463][7:7] CHIP:DMG: Command handler moving to [ Preparing]
[1665412413.517466][7:7] CHIP:DMG: Command handler moving to [AddingComm]
[1665412413.517472][7:7] CHIP:DMG: Command handler moving to [AddedComma]
[1665412413.517477][7:7] CHIP:DMG: Decreasing reference count for CommandHandler, remaining 0
[1665412413.517480][7:7] CHIP:EM: Piggybacking Ack for MessageCounter:11273571 on exchange: 27833r
[1665412413.517486][7:7] CHIP:IN: Prepared secure message 0x56000758ddb8 to 0x61EB9EBEBDC5D118 (1)  of type 0x9 and protocolId (0, 1) on exchange 27833r with MessageCounter:57784985.
[1665412413.517490][7:7] CHIP:IN: Sending encrypted msg 0x56000758ddb8 with MessageCounter:57784985 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FA6D2 msec
[1665412413.517502][7:7] CHIP:DMG: Command handler moving to [CommandSen]
[1665412413.517506][7:7] CHIP:DMG: Command handler moving to [AwaitingDe]
[1665412413.518575][7:7] CHIP:EM: Received message of type 0x10 with protocolId (0, 0) and MessageCounter:11273572 on exchange 27833r
[1665412413.518577][7:7] CHIP:EM: Found matching exchange: 27833r, Delegate: (nil)
[1665412413.518580][7:7] CHIP:EM: Rxd Ack; Removing MessageCounter:57784985 from Retrans Table on exchange 27833r
[1665412413.518581][7:7] CHIP:EM: Removed CHIP MessageCounter:57784985 from RetransTable on exchange 27833r
[1665412413.883383][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412413.883491][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412413.883587][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412414.486442][7:7] CHIP:DIS: Broadcasting mDns reply for query from fda6:b25b:1fe5:c58f:81d:2116:1d69:147f
[1665412414.489430][7:7] CHIP:EM: Retransmitting MessageCounter:39876077 on exchange 2397i Send Cnt 1
[1665412414.489455][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:39876077 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FAA9E msec
[1665412414.489492][7:7] CHIP:CSM: UpdatePeerAddress: No existing OperationalSessionSetup instance found
[1665412414.489496][7:7] CHIP:CTL: OperationalSessionSetup[1:61EB9EBEBDC5D118]: State change 1 --> 2
[1665412414.740173][7:7] CHIP:DIS: Checking node lookup status after 251 ms
[1665412414.801299][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412414.801405][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412414.801495][7:7] CHIP:DIS: Failed to parse mDNS query
[1665412420.867187][7:7] CHIP:EM: Retransmitting MessageCounter:39876077 on exchange 2397i Send Cnt 2
[1665412420.867252][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:39876077 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FC388 msec
[1665412421.350916][7:7] CHIP:EM: Received message of type 0x8 with protocolId (0, 1) and MessageCounter:11273573 on exchange 28700r
[1665412421.350931][7:7] CHIP:EM: Handling via exchange: 28700r, Delegate: 0x560006f95a08
[1665412421.350945][7:7] CHIP:DMG: InvokeRequestMessage =
[1665412421.350947][7:7] CHIP:DMG: {
[1665412421.350948][7:7] CHIP:DMG: 	suppressResponse = false, 
[1665412421.350950][7:7] CHIP:DMG: 	timedRequest = false, 
[1665412421.350951][7:7] CHIP:DMG: 	InvokeRequests =
[1665412421.350954][7:7] CHIP:DMG: 	[
[1665412421.350955][7:7] CHIP:DMG: 		CommandDataIB =
[1665412421.350957][7:7] CHIP:DMG: 		{
[1665412421.350958][7:7] CHIP:DMG: 			CommandPathIB =
[1665412421.350960][7:7] CHIP:DMG: 			{
[1665412421.350962][7:7] CHIP:DMG: 				EndpointId = 0x0,
[1665412421.350963][7:7] CHIP:DMG: 				ClusterId = 0x3e,
[1665412421.350965][7:7] CHIP:DMG: 				CommandId = 0x7,
[1665412421.350966][7:7] CHIP:DMG: 			},
[1665412421.350968][7:7] CHIP:DMG: 			
[1665412421.350970][7:7] CHIP:DMG: 			CommandFields = 
[1665412421.350971][7:7] CHIP:DMG: 			{
[1665412421.350973][7:7] CHIP:DMG: 				0x0 = [
[1665412421.350993][7:7] CHIP:DMG: 						0x15, 0x30, 0x1, 0x11, 0x0, 0x92, 0x7e, 0x97, 0x3b, 0x8a, 0x98, 0x41, 0xae, 0x28, 0x3a, 0x42, 0x57, 0x36, 0xfb, 0x8f, 0x5b, 0x24, 0x2, 0x1, 0x37, 0x3, 0x27, 0x13, 0xfe, 0x6f, 0x1b, 0x35, 0xbd, 0x86, 0x67, 0xc8, 0x18, 0x26, 0x4, 0xb2, 0xdb, 0xd6, 0x2a, 0x26, 0x5, 0x42, 0xd3, 0x3e, 0x2c, 0x37, 0x6, 0x27, 0x11, 0x65, 0x9c, 0xda, 0x9, 0x30, 0x3e, 0x3d, 0x4c, 0x27, 0x15, 0x98, 0xd3, 0xfd, 0x2, 0xf5, 0x2, 0x2, 0x91, 0x18, 0x24, 0x7, 0x1, 0x24, 0x8, 0x1, 0x30, 0x9, 0x41, 0x4, 0x5b, 0xb7, 0xf8, 0xe7, 0x8a, 0x52, 0x80, 0xb4, 0x1f, 0xc1, 0x47, 0xaa, 0xcf, 0x8a, 0xe9, 0x4, 0xa6, 0xc8, 0x42, 0x7b, 0x76, 0x3a, 0xcc, 0xbe, 0x9d, 0xf3, 0x12, 0xf8, 0xd1, 0x41, 0xc2, 0xee, 0x30, 0x26, 0x0, 0xb7, 0x90, 0xf, 0xbb, 0x9, 0x7a, 0x8f, 0x9e, 0x2, 0xde, 0xde, 0x7f, 0xca, 0x32, 0x16, 0x21, 0xc9, 0xd0, 0x82, 0x5c, 0x6f, 0x70, 0x1, 0x8c, 0x51, 0x37, 0x1b, 0x5e, 0x1f, 0x37, 0xa, 0x35, 0x1, 0x28, 0x1, 0x18, 0x30, 0x5, 0x14, 0x58, 0xf0, 0xac, 0x9f, 0x2, 0x52, 0xc1, 0x47, 0x53, 0x43, 0xb8, 0x61, 0x63, 0x3d, 0x7d, 0x43, 0xe8, 0xca, 0xab, 0x6b, 0x30, 0x4, 0x14, 0x1a, 0x59, 0xb7, 0x9e, 0x5a, 0xfa, 0x87, 0x72, 0x6a, 0xf, 0xce, 0xce, 0x26, 0x55, 0xf7, 0xcc, 0x7c, 0x90, 0xd3, 0xb0, 0x24, 0x2, 0x5, 0x36, 0x3, 0x4, 0x2, 0x4, 0x1, 0x18, 0x18, 0x30, 0xb, 0x40, 0x79, 0xa8, 0xf6, 0x22, 0x7b, 0xba, 0x8b, 0x29, 0xf2, 0xad, 0x0, 0xaf, 0xbd, 0x54, 0xd, 0xde, 0xb5, 0xb6, 0x90, 0x4d, 0x1c, 0x47, 0x99, 0x8b, 0xcd, 0xe7, 0xd0, 0xf4, 0xb1, 0xe3, 0x52, 0x39, 0x1e, 0x25, 0x44, 0x54, 0x1c, 0xd6, 0x98, 0xc5, 0x62, 0x1e, 0x86, 0xdd, 0x27, 0x46, 0x2e, 0x3e, 0x4c, 0x62, 0x8c, 0xcf, 0xb9, 0xc5, 0xf1, 0xd1, 0x8b, 0x19, 0xd8, 0x29, 0x9d, 0x5d, 0x2, 0x18, 0x18, 
[1665412421.351000][7:7] CHIP:DMG: 				] (278 bytes)
[1665412421.351002][7:7] CHIP:DMG: 				0x1 = [
[1665412421.351018][7:7] CHIP:DMG: 						0x15, 0x30, 0x1, 0x10, 0x43, 0x26, 0x49, 0xc6, 0x1a, 0x1f, 0x14, 0x65, 0x39, 0x2e, 0x10, 0x8f, 0x4d, 0xa0, 0x80, 0xa1, 0x24, 0x2, 0x1, 0x37, 0x3, 0x27, 0x14, 0xff, 0x5a, 0xc8, 0x11, 0x91, 0x69, 0x47, 0xd7, 0x18, 0x26, 0x4, 0x7b, 0x3b, 0xd3, 0x2a, 0x26, 0x5, 0x23, 0xb, 0x1b, 0x34, 0x37, 0x6, 0x27, 0x13, 0xfe, 0x6f, 0x1b, 0x35, 0xbd, 0x86, 0x67, 0xc8, 0x18, 0x24, 0x7, 0x1, 0x24, 0x8, 0x1, 0x30, 0x9, 0x41, 0x4, 0x58, 0xbc, 0xd, 0x57, 0x32, 0x3, 0xd5, 0xf8, 0xb6, 0xc, 0xf0, 0xa4, 0xdc, 0x7f, 0x96, 0x41, 0x51, 0xf4, 0x7d, 0x18, 0x30, 0xcb, 0x53, 0x6f, 0x85, 0xaf, 0xb6, 0xa, 0x28, 0x50, 0x93, 0x1c, 0x27, 0x79, 0xb7, 0x3d, 0x9f, 0xb2, 0xe7, 0x85, 0x4b, 0xbd, 0x8f, 0x88, 0xbf, 0xfe, 0x73, 0xe4, 0xba, 0x81, 0x38, 0x89, 0xd5, 0xb1, 0xd, 0x2e, 0x61, 0xca, 0x5f, 0x29, 0x5, 0x10, 0x18, 0xe4, 0x37, 0xa, 0x35, 0x1, 0x29, 0x1, 0x24, 0x2, 0x0, 0x18, 0x30, 0x5, 0x14, 0xf3, 0x77, 0x6b, 0x98, 0x3, 0xd4, 0xcd, 0x4c, 0x55, 0x26, 0x9e, 0xf0, 0x1b, 0xd5, 0xb, 0xeb, 0x21, 0x15, 0x26, 0x5, 0x30, 0x4, 0x14, 0x58, 0xf0, 0xac, 0x9f, 0x2, 0x52, 0xc1, 0x47, 0x53, 0x43, 0xb8, 0x61, 0x63, 0x3d, 0x7d, 0x43, 0xe8, 0xca, 0xab, 0x6b, 0x24, 0x2, 0x60, 0x18, 0x30, 0xb, 0x40, 0x46, 0x2b, 0x96, 0xc3, 0xc2, 0xaa, 0x2b, 0x7d, 0x5b, 0xd5, 0xd2, 0xdd, 0xaf, 0x83, 0x83, 0x55, 0x16, 0xf7, 0xd5, 0x12, 0x65, 0xbd, 0x1e, 0x86, 0x14, 0xe2, 0xd9, 0x91, 0x29, 0xe1, 0xb5, 0x96, 0x1c, 0xc8, 0x34, 0xed, 0xda, 0xc3, 0x90, 0xd1, 0xcd, 0x49, 0x58, 0x72, 0x8b, 0xd8, 0x55, 0xaa, 0x3f, 0xee, 0xa4, 0x45, 0x23, 0x45, 0x27, 0x57, 0xd3, 0xea, 0x39, 0x62, 0x13, 0x2b, 0xd, 0x0, 0x18, 
[1665412421.351020][7:7] CHIP:DMG: 				] (263 bytes)
[1665412421.351021][7:7] CHIP:DMG: 			},
[1665412421.351022][7:7] CHIP:DMG: 		},
[1665412421.351025][7:7] CHIP:DMG: 		
[1665412421.351026][7:7] CHIP:DMG: 	],
[1665412421.351028][7:7] CHIP:DMG: 	
[1665412421.351029][7:7] CHIP:DMG: 	InteractionModelRevision = 1
[1665412421.351030][7:7] CHIP:DMG: },
[1665412421.351040][7:7] CHIP:DMG: AccessControl: checking f=1 a=c s=0x61EB9EBEBDC5D118 t=0x00000001 c=0x0000_003E e=0 p=a
[1665412421.351044][7:7] CHIP:DMG: AccessControl: allowed
[1665412421.351046][7:7] CHIP:DMG: Received command for Endpoint=0 Cluster=0x0000_003E Command=0x0000_0007
[1665412421.351050][7:7] CHIP:ZCL: OpCreds: Received an UpdateNOC command
[1665412421.351052][7:7] CHIP:ZCL: OpCreds: Finding fabric with fabricIndex 0x1
[1665412421.351054][7:7] CHIP:EM: Sending Standalone Ack for MessageCounter:11273573 on exchange 28700r
[1665412421.351061][7:7] CHIP:IN: Prepared secure message 0x7ffe1520d1c0 to 0x61EB9EBEBDC5D118 (1)  of type 0x10 and protocolId (0, 0) on exchange 28700r with MessageCounter:57784986.
[1665412421.351065][7:7] CHIP:IN: Sending encrypted msg 0x7ffe1520d1c0 with MessageCounter:57784986 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FC56C msec
[1665412421.351092][7:7] CHIP:EM: Flushed pending ack for MessageCounter:11273573 on exchange 28700r
[1665412421.351224][7:7] CHIP:FP: Validating NOC chain
[1665412421.351598][7:7] CHIP:FP: NOC chain validation successful
[1665412421.351604][7:7] CHIP:FP: Updated fabric at index: 0x1, Node ID: 0x4C3D3E3009DA9C65
[1665412421.351607][7:7] CHIP:TS: Last Known Good Time: 2022-10-10T13:30:24
[1665412421.351608][7:7] CHIP:TS: New proposed Last Known Good Time: 2022-10-10T13:33:38
[1665412421.351610][7:7] CHIP:TS: Updating pending Last Known Good Time to 2022-10-10T13:33:38
[1665412421.351614][7:7] CHIP:DMG: Endpoint 0, Cluster 0x0000_003E update version to a0ab1856
[1665412421.351621][7:7] CHIP:DMG: Endpoint 0, Cluster 0x0000_003E update version to a0ab1857
[1665412421.351816][7:7] CHIP:DL: writing settings to file (/tmp/chip_kvs-WKhpQK)
[1665412421.351970][7:7] CHIP:DL: renamed tmp file to file (/tmp/chip_kvs)
[1665412421.351992][7:7] CHIP:DL: writing settings to file (/tmp/chip_kvs-4aYi6K)
[1665412421.359507][7:7] CHIP:DL: renamed tmp file to file (/tmp/chip_kvs)
[1665412421.359570][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359584][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359595][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359605][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359614][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359624][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359633][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359643][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359653][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359662][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359672][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359682][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359696][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359706][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359716][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359725][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359735][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359745][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359754][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359763][7:7] CHIP:SC: Session resumption cache deletion partially failed for fabric index 1, unable to load node state: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.359823][7:7] CHIP:DL: writing settings to file (/tmp/chip_kvs-A70foJ)
[1665412421.359996][7:7] CHIP:DL: renamed tmp file to file (/tmp/chip_kvs)
[1665412421.360002][7:7] CHIP:SVR: Warning, failed to delete session resumption state for fabric index 0x1: ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/platform/Linux/KeyValueStoreManagerImpl.cpp:53: CHIP Error 0x000000A0: Value not found in the persisted storage
[1665412421.360005][7:7] CHIP:DIS: Updating services using commissioning mode 0
[1665412421.361763][7:7] CHIP:DIS: CHIP minimal mDNS started advertising.
[1665412421.362334][7:7] CHIP:DIS: Broadcasting mDns reply for query from 172.17.0.1
[1665412421.386028][7:7] CHIP:DIS: Broadcasting mDns reply for query from 100.73.208.71
[1665412421.388597][7:7] CHIP:DIS: Broadcasting mDns reply for query from 192.168.0.12
[1665412421.390160][7:7] CHIP:DIS: Broadcasting mDns reply for query from fd00::1
[1665412421.423044][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::42:b1ff:fe15:f653
[1665412421.506480][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::1
[1665412421.511568][7:7] CHIP:DIS: Broadcasting mDns reply for query from fd7a:115c:a1e0:ab12:4843:cd96:6249:d047
[1665412421.513593][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::82de:b23d:7bb3:9fbe
[1665412421.515442][7:7] CHIP:DIS: Broadcasting mDns reply for query from fda6:b25b:1fe5:c58f:81d:2116:1d69:147f
[1665412421.517617][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::7bf4:29ac:6c11:229b
[1665412421.547517][7:7] CHIP:DL: Using wifi MAC for hostname
[1665412421.547531][7:7] CHIP:DIS: Advertise operational node DBB41BC6CF2C22A9-4C3D3E3009DA9C65
[1665412421.547695][7:7] CHIP:DIS: Responding with _matter._tcp.local
[1665412421.547698][7:7] CHIP:DIS: Responding with DBB41BC6CF2C22A9-4C3D3E3009DA9C65._matter._tcp.local
[1665412421.547702][7:7] CHIP:DIS: Responding with DBB41BC6CF2C22A9-4C3D3E3009DA9C65._matter._tcp.local
[1665412421.547704][7:7] CHIP:DIS: Responding with 0242B115F653.local
[1665412421.547705][7:7] CHIP:DIS: Responding with 0242B115F653.local
[1665412421.547708][7:7] CHIP:DIS: Responding with _IDBB41BC6CF2C22A9._sub._matter._tcp.local
[1665412421.547709][7:7] CHIP:DIS: CHIP minimal mDNS configured as 'Operational device'.
[1665412421.548401][7:7] CHIP:DIS: Broadcasting mDns reply for query from 172.17.0.1
[1665412421.742397][7:7] CHIP:DIS: Broadcasting mDns reply for query from 100.73.208.71
[1665412421.743889][7:7] CHIP:DIS: Broadcasting mDns reply for query from 192.168.0.12
[1665412421.745161][7:7] CHIP:DIS: Broadcasting mDns reply for query from fd00::1
[1665412421.746687][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::42:b1ff:fe15:f653
[1665412421.747751][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::1
[1665412421.913650][7:7] CHIP:DIS: Broadcasting mDns reply for query from fd7a:115c:a1e0:ab12:4843:cd96:6249:d047
[1665412421.923632][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::82de:b23d:7bb3:9fbe
[1665412421.925315][7:7] CHIP:DIS: Broadcasting mDns reply for query from fda6:b25b:1fe5:c58f:81d:2116:1d69:147f
[1665412421.926690][7:7] CHIP:DIS: Broadcasting mDns reply for query from fe80::7bf4:29ac:6c11:229b
[1665412421.927497][7:7] CHIP:DIS: mDNS service published: _matter._tcp
[1665412421.927511][7:7] CHIP:DMG: Command handler moving to [ Preparing]
[1665412421.927516][7:7] CHIP:DMG: Command handler moving to [AddingComm]
[1665412421.927520][7:7] CHIP:DMG: Command handler moving to [AddedComma]
[1665412421.927522][7:7] CHIP:ZCL: OpCreds: UpdateNOC successful.
[1665412421.927526][7:7] CHIP:IN: Expiring all sessions for fabric 0x1!!
[1665412421.927528][7:7] CHIP:IN: SecureSession[0x56000755cd10]: MarkForEviction Type:2 LSID:26510
[1665412421.927530][7:7] CHIP:SC: SecureSession[0x56000755cd10]: Moving from state 'kActive' --> 'kPendingEviction'
[1665412421.927536][7:7] CHIP:DMG: Time out! failed to receive status response from Exchange: 2397i
[1665412421.927538][7:7] CHIP:DMG: <RE> OnReportConfirm: NumReports = 0
[1665412421.927540][7:7] CHIP:DMG: IM RH moving to [AwaitingDestruction]
[1665412421.927548][7:7] CHIP:IN: SecureSession[0x56000755cd10]: Released - Type:2 LSID:26510
[1665412421.927551][7:7] CHIP:IN: SecureSession[0x560007572d60]: MarkForEviction Type:2 LSID:26511
[1665412421.927552][7:7] CHIP:SC: SecureSession[0x560007572d60]: Moving from state 'kActive' --> 'kPendingEviction'
[1665412421.927564][7:7] CHIP:DMG: Decreasing reference count for CommandHandler, remaining 0
[1665412421.927568][7:7] CHIP:EM: Piggybacking Ack for MessageCounter:11273573 on exchange: 28700r
[1665412421.927587][7:7] CHIP:IN: Prepared secure message 0x56000758da58 to 0x61EB9EBEBDC5D118 (1)  of type 0x9 and protocolId (0, 1) on exchange 28700r with MessageCounter:57784987.
[1665412421.927593][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:57784987 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FC7AC msec
[1665412421.927626][7:7] CHIP:DMG: Command handler moving to [CommandSen]
[1665412421.927627][7:7] CHIP:DMG: Command handler moving to [AwaitingDe]
[1665412422.265512][7:7] CHIP:EM: Retransmitting MessageCounter:57784987 on exchange 28700r Send Cnt 1
[1665412422.265538][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:57784987 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FC8FE msec
[1665412422.662481][7:7] CHIP:EM: Retransmitting MessageCounter:57784987 on exchange 28700r Send Cnt 2
[1665412422.662510][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:57784987 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FCA8B msec
[1665412423.212252][7:7] CHIP:EM: Retransmitting MessageCounter:57784987 on exchange 28700r Send Cnt 3
[1665412423.212281][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:57784987 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FCCB1 msec
[1665412424.234462][7:7] CHIP:EM: Retransmitting MessageCounter:57784987 on exchange 28700r Send Cnt 4
[1665412424.234487][7:7] CHIP:IN: Sending encrypted msg 0x56000758da58 with MessageCounter:57784987 to 0x61EB9EBEBDC5D118 (1) at monotonic time: 00000000402FD0AF msec
[1665412425.713124][7:7] CHIP:EM: Failed to Send CHIP MessageCounter:57784987 on exchange 28700r sendCount: 4 max retries: 4
[1665412425.713139][7:7] CHIP:IN: SecureSession[0x560007572d60]: MarkAsDefunct Type:2 LSID:26511
[1665412425.713140][7:7] CHIP:SPT: VerifyOrDie failure at ../../examples/all-clusters-app/linux/third_party/connectedhomeip/src/transport/SecureSession.cpp:125: false

Bug prevalence

Will likely always occur with an MRP timeout closely following after UpdateNOC

GitHub hash of the SDK that was being used

fa8db56

Platform

other

Platform Version(s)

No response

Anything else?

No response

@msandstedt
Copy link
Contributor Author

msandstedt commented Oct 10, 2022

The feature that appears to be breaking is #19780:

commit 03de70064324757db568f352471b214a173cc6ad
Author: Boris Zbarsky <[email protected]>
Date:   Wed Jun 22 19:19:30 2022 -0400

    Fix crashes when closing all exchanges for fabric. (#19780)
    
    * Fix crashes when closing all exchanges for fabric.
    
    Our "close all exchanges except this special one for this fabric" API
    messes up exchange refcounting, leading to use-after-free.
    
    The fix is to reuse, as much as possible, the normal "session is going
    away" flow to notify exchanges, and other session consumers, that the
    sessions are in fact going away.
    
    Fixes https://github.com/project-chip/connectedhomeip/issues/19747
    
    * Address review comments.
    
    * Updates to fix fallout from #19502.
    
    We need to allow messages on inactive sessions to reach the exchange manager,
    because such sessions need to be able to deliver an MRP ack to an exchange
    waiting for one.
    
    We also don't want to crash on an attempt to transition from Inactive to Defunct
    state; the transition should just be ignored.  This way if we start trying to
    transitionin to Defunct on MRP delivery failures we will not start crashing if
    such a failure happens on an Inactive session.
    
    * Address review comment
    
    * Address review comments
    
    * Address Jerry's review comments.

Somehow, it appears there is a bad interaction with #19502, even though we intended for the two features to work correctly together.

@bzbarsky-apple
Copy link
Contributor

So in this case we have session 26511 which being used by exchange "28700r", which is the exchange the UpdateNOC is delivered on.

The sequence of events seems to be the following:

  1. UpdateNOC command is received.
  2. The session is marked as expired, but the exchange continues to use it until it sends back the response. This is the one situation in which an exchange can be using an expired session.
  3. When the response send fails in MRP, we try to mark the session as defunct, and hit the VerifyOrDie in the "this session is actually evicted, you are trying to revive it" case.

The right answer is either to remove the VerifyOrDie (and just ignore MarkAsDefunct in the evicted state), or to add checks at the MarkAsDefunct callsites in exchange manager and MRP that the session is not in the evicted state.

@mrjerryjohns

@bzbarsky-apple
Copy link
Contributor

Honestly, I would prefer the "remove the VerifyOrDie" approach.

@mrjerryjohns
Copy link
Contributor

I'm fine with the ignoring the VerifyOrDie call.

@msandstedt
Copy link
Contributor Author

Honestly, I would prefer the "remove the VerifyOrDie" approach.

Makes sense to me as well.

@msandstedt
Copy link
Contributor Author

CC @gharveymn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants