DEATH LOOP: when using mesh with isbase:true and multiple pins! #97

Open
jeromevalentin opened this issue Aug 24, 2017 · 17 comments

@jeromevalentin

With versions:

    "seneca": "3.4.2",
    "seneca-balance-client": "0.6.1",
    "seneca-mesh": "0.11.0"

The following code

// seneca standard init
// declaring the mesh plugin to join the party!
seneca.use('mesh', {
  isbase: true,
  listen: [
    { pin: 'role:A' },
    { pin: 'role:B' }
  ]
})

leads to the following error:

Error: [DEATH LOOP]
at Seneca.die
at Object.intern.handle_reply 
at Object.act_tm [as ontm] 
at Timeout.timeout_check [as _onTimeout] 
at ontimeout (timers.js:488:11)
at tryOnTimeout (timers.js:323:5)
at Timer.listOnTimeout (timers.js:283:5)
@pola88

pola88 commented Aug 30, 2017

Any luck with this?? I have the same problem

@blanchma

@rjrodger we have the same problem but we don't have isBase and multiple pins in the same place. Our base is simpler:

seneca.use('mesh', {  isbase: true,
                      auto: true,
                      port: process.env.SENECA_PORT || 7000,
                      bases: seneca_bases
                   });

Our current packages are seneca: 3.4.2, seneca-balance-client: 0.6.1 and seneca-mesh: 0.11.0

@rodmaz

rodmaz commented Sep 9, 2017

Are you using Docker containers? This seems like a networking issue we are having here when using Docker containers.

@jeromevalentin
Author

I'm not using Docker. Simply starting Seneca with isbase: true and multiple pins in the listen option leads to the DEATH LOOP.

@rodmaz

rodmaz commented Sep 10, 2017

@jeromevalentin Do you have other bases listed in seneca_bases?
Does the error occur when you remove bases: seneca_bases?
Usually these DEATH_LOOPs occur when your client cannot communicate with your base due to networking limitations (no multicast support, ports blocked etc).

@blanchma

blanchma commented Sep 10, 2017 via email

@jeromevalentin
Author

jeromevalentin commented Sep 10, 2017

@rodmaz No other bases, no other mesh process is running, just trying to start a 'base' mesh process supporting multiple microservices.

In testmesh.es6, write:

import Seneca from 'seneca'

let seneca = Seneca({ tag: 'mesh' })

seneca.add({ role: 'A' }, (args, done) => done(null, { A: args }))
seneca.add({ role: 'B' }, (args, done) => done(null, { B: args }))

seneca.use('mesh', {
  isbase: true,
  listen: [
    { pin: 'role:A' },
    { pin: 'role:B' }
  ]
})
seneca.ready(() => {
  console.log('Ready')
  seneca.act({ role: 'A', param: 'foobar' }, (err, resp) => console.log(err, resp))
})

then transpile it with Babel and execute:

$> node testmesh.js
{"code":"EADDRINUSE","errno":"EADDRINUSE","syscall":"bind","address":"0.0.0.0","port":39999,"level":"warn","actid":"qoem7dimej4c/0bsvrvuozh8q","plugin_name":"mesh","pattern":"init:mesh","seneca":"l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","when":1505052018525}
{"notice":"seneca: Action name:mesh,plugin:define,role:seneca,seq:2,tag:undefined failed: [TIMEOUT].","code":"act_execute","err":{"eraro":true,"orig":{},"code":"act_execute","seneca":true,"package":"seneca","msg":"seneca: Action name:mesh,plugin:define,role:seneca,seq:2,tag:undefined failed: [TIMEOUT].","details":{"message":"[TIMEOUT]","pattern":"name:mesh,plugin:define,role:seneca,seq:2,tag:undefined","instance":"Seneca/l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","orig$":{},"message$":"[TIMEOUT]","plugin":{}},"callpoint":"at Object.act_tm [as ontm] (/home/jvalentin/project/node_modules/seneca/seneca.js:917:52)"},"actid":"qoem7dimej4c/0bsvrvuozh8q","msg":{"role":"seneca","plugin":"define","name":"mesh","seq":2,"default$":{},"fatal$":true,"local$":true,"plugin$":{"name":"transport","tag":"-","fullname":"transport"}},"meta":{"start":1505052009975,"end":1505052032227,"pattern":"name:mesh,plugin:define,role:seneca,seq:2,tag:undefined","action":"plugin_definition_10","mi":"qoem7dimej4c","tx":"0bsvrvuozh8q","id":"qoem7dimej4c/0bsvrvuozh8q","instance":"l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","tag":"mesh","seneca":"3.4.2","version":"0.1.0","gate":false,"fatal":true,"local":true,"timeout":22222,"dflt":{},"plugin":{"name":"transport","tag":"-","fullname":"transport"},"parents":[],"sync":false,"trace":[{"desc":["name:balance_client,plugin:define,role:seneca,seq:3,tag:mesh~mluctf","ejmlatdu9ll4/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010001,1505052010005,false,"plugin_definition_23"],"trace":[{"desc":[null,"lyxzq82zjc5r/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010004,1505052010005,true,null],"trace":[]}]}],"sub":null,"data":null,"err":null,"err_trace":null,"error":true,"empty":null},"actdef":{"plugin_name":"mesh","plugin_tag":"-","plugin_fullname":"mesh","raw":{"role":"seneca","plugin":"define","name":"mesh","seq":2},"plugin":{"name":"mesh","tag":"-","fullname":"mesh"},"sub":false,"client":false,"rules":{},"id":"plugin_definition_10","name":"plugin_definition","pattern":"name:mesh,plugin:define,role:seneca,seq:2,tag:undefined","msgcanon":{"name":"mesh","plugin":"define","role":"seneca","seq":2,"tag":"undefined"},"priorpath":""},"client":false,"listen":false,"transport":{},"kind":"act","case":"ERR","duration":22252,"level":"error","plugin_name":"mesh","pattern":"name:mesh,plugin:define,role:seneca,seq:2,tag:undefined","seneca":"l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","when":1505052032230}
[]
{"notice":"seneca: Action init:mesh failed: [TIMEOUT].","code":"act_execute","err":{"eraro":true,"orig":{},"code":"act_execute","seneca":true,"package":"seneca","msg":"seneca: Action init:mesh failed: [TIMEOUT].","details":{"message":"[TIMEOUT]","pattern":"init:mesh","instance":"Seneca/l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","orig$":{},"message$":"[TIMEOUT]","plugin":{}},"callpoint":"at Object.act_tm [as ontm] (/home/jvalentin/project/node_modules/seneca/seneca.js:917:52)"},"actid":"qoem7dimej4c/0bsvrvuozh8q","msg":{"init":"mesh","default$":{},"fatal$":true,"local$":true,"plugin$":{"name":"transport","tag":"-","fullname":"transport"},"tx$":"0bsvrvuozh8q"},"meta":{"start":1505052010002,"end":1505052032351,"pattern":"init:mesh","action":"init_24","mi":"8qivehq1pvt2","tx":"0bsvrvuozh8q","id":"8qivehq1pvt2/0bsvrvuozh8q","instance":"l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","tag":"mesh","seneca":"3.4.2","version":"0.1.0","gate":false,"fatal":true,"local":true,"timeout":22222,"dflt":{},"plugin":{"name":"transport","tag":"-","fullname":"transport"},"parents":[["name:mesh,plugin:define,role:seneca,seq:2,tag:undefined","qoem7dimej4c/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052009975,null,false,"plugin_definition_10"]],"sync":true,"trace":[{"desc":["cmd:listen,role:transport","tnuj8c3xueul/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010453,1505052010462,true,"_29"],"trace":[{"desc":["cmd:listen,role:transport","mjmlnzz4kxgd/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010454,1505052010457,true,"_12"],"trace":[{"desc":["hook:listen,role:transport,type:web","5eptqk276xj0/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010455,1505052010457,true,"_16"],"trace":[]},{"desc":["hook:listen,role:transport,type:web","5eptqk276xj0/0bsvrvuozh8q","l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","mesh","0.1.0",1505052010455,1505052010457,true,"_16"],"trace":[]}]}]}],"sub":null,"data":null,"err":null,"err_trace":null,"error":true,"empty":null},"actdef":{"plugin_name":"mesh","plugin_tag":"-","plugin_fullname":"mesh","raw":{"init":"mesh"},"plugin":{"name":"mesh","tag":"-","fullname":"mesh"},"sub":false,"client":false,"rules":{},"id":"init_24","name":"init","pattern":"init:mesh","msgcanon":{"init":"mesh"},"priorpath":""},"client":false,"listen":false,"transport":{},"kind":"act","case":"ERR","duration":22349,"level":"error","plugin_name":"mesh","pattern":"init:mesh","seneca":"l1c2jbmuhpch/1505052009412/3380/3.4.2/mesh","when":1505052032352}
...

The error suggests that mesh is trying to start the base listener on port 39999 multiple times.

@rodmaz

rodmaz commented Sep 10, 2017

@jeromevalentin This error (EADDRINUSE) is clear: you have another process running on port 39999. Try rebooting your machine, or investigate which process is using that port. A quick test is to use another port number.

@jeromevalentin
Author

jeromevalentin commented Sep 10, 2017

@rodmaz The error is clear, but no other process is using that port on my computer:

$> netstat -a | grep 39999
$>

@karn09

karn09 commented Sep 19, 2017

I was able to get this working after seeing similar issues.

The code below throws the death loop error:

    .use("mesh", {
        isbase: true,
        listen: [{ pin: 'users:list' }, { pin: 'users:create' }]
    });

Instead, I used the following, which worked:

    .use("mesh", {
        isbase: true,
        pin: "users:*"
    });

Also, I found myself camelCasing the option as isBase: true; it should be isbase: true. Using isBase will cause the death loop error as well.
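
A minimal sketch of a guard against this kind of casing mistake. The helper name `findMiscasedKeys` and the `KNOWN_KEYS` list are hypothetical: the list only collects option names that appear in this thread, not the plugin's full option set, and nothing here is part of seneca-mesh itself.

```javascript
'use strict'

// Option names seen in this thread (an assumption, not an official list).
const KNOWN_KEYS = [
  'isbase', 'auto', 'port', 'host', 'bases', 'listen',
  'pin', 'pins', 'model', 'monitor', 'discover'
]

// Report keys that match a known option only when case is ignored,
// for example `isBase` instead of `isbase`.
function findMiscasedKeys(options, knownKeys = KNOWN_KEYS) {
  const byLower = new Map(knownKeys.map((key) => [key.toLowerCase(), key]))
  return Object.keys(options)
    .filter((key) => !knownKeys.includes(key) && byLower.has(key.toLowerCase()))
    .map((key) => ({ given: key, expected: byLower.get(key.toLowerCase()) }))
}

// Usage: check the options object before handing it to seneca.use('mesh', ...).
const problems = findMiscasedKeys({ isBase: true, pin: 'users:*' })
problems.forEach((p) => {
  console.warn(`mesh option "${p.given}" should probably be "${p.expected}"`)
})
```

Running such a check before `seneca.use` would turn a silent misconfiguration into an explicit warning.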

@jeromevalentin
Author

jeromevalentin commented Sep 20, 2017

I agree that it works like this, but it is just a workaround, and it is not always applicable.
In my previous example, it would require using:

{
  isbase: true,
  pin: 'role:*'
}

and with such a configuration, that process will receive all role requests, while it is only able to handle A and B. So this is not a solution.
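
To illustrate the over-matching, here is a simplified sketch of wildcard pin matching. The `pinMatches` function is hypothetical and only models the `*` wildcard on flat key/value pins; it is not Seneca's actual pattern matcher.

```javascript
'use strict'

// A pin matches a message when every pin key is present in the message and
// the pin value is either the wildcard '*' or equal to the message value.
function pinMatches(pin, msg) {
  return Object.keys(pin).every((key) =>
    key in msg && (pin[key] === '*' || pin[key] === String(msg[key]))
  )
}

const broadPin = { role: '*' }                    // the workaround
const narrowPins = [{ role: 'A' }, { role: 'B' }] // what the service handles

// A message for some other service, role C.
const msg = { role: 'C', param: 'foobar' }

console.log('broad pin matches:', pinMatches(broadPin, msg))
console.log('narrow pins match:', narrowPins.some((pin) => pinMatches(pin, msg)))
```

Under this model the broad pin accepts the `role: 'C'` message, while the two narrow pins reject it.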

@jeromevalentin jeromevalentin changed the title DEATH LOOP: when using mesh with isBase:true and multiple pins! DEATH LOOP: when using mesh with isbase:true and multiple pins! Sep 20, 2017
@karn09

karn09 commented Sep 20, 2017

Good point. It appears the documentation may be wrong.

After reading over the code, I found this piece which iterates over the listen array, but does not appear to follow the shape of the object within the docs:

starting at line 171:

      function init() {
...
...
          _.each(listen, function(listen_opts) {
            if (options.host && null == listen_opts.host) {
              listen_opts.host = options.host
            }

            if ('@' === (listen_opts.host && listen_opts.host[0])) {
              listen_opts.host = rif(listen_opts.host.substring(1))
            }

            listen_opts.port = null != listen_opts.port
              ? listen_opts.port
              : function() {
                  return 50000 + Math.floor(10000 * Math.random())
                }

            listen_opts.model = listen_opts.model || 'consume'

            listen_opts.ismesh = true

            seneca.listen(listen_opts)
          })
...

But listen is defined within the outer scope on line 124 - which matches the docs:

    var listen = options.listen || [
      { pin: pin, model: options.model || 'consume' }
    ]
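
As a reading aid, the two excerpts can be combined into one standalone function. This is a simplified restatement, not the plugin source: the name `resolveListen` is hypothetical, `pin` is assumed to come from `options.pin`, and the `'@'` host lookup via `rif` is left out.

```javascript
'use strict'

// Default port generator from the excerpt: a random port in 50000..59999.
function randomPort() {
  return 50000 + Math.floor(10000 * Math.random())
}

// Build the effective listen entries for a mesh options object.
function resolveListen(options) {
  const listen = options.listen || [
    { pin: options.pin, model: options.model || 'consume' }
  ]
  return listen.map((entry) => {
    const resolved = Object.assign({}, entry) // copy instead of mutating
    if (options.host && resolved.host == null) {
      resolved.host = options.host
    }
    if (resolved.port == null) {
      resolved.port = randomPort // a function, as in the excerpt
    }
    resolved.model = resolved.model || 'consume'
    resolved.ismesh = true
    return resolved
  })
}

// Usage with the configuration from the original report.
const entries = resolveListen({
  isbase: true,
  listen: [{ pin: 'role:A' }, { pin: 'role:B' }]
})
entries.forEach((e) => console.log(e.pin, e.model, e.port()))
```

Under this reading, each listen entry gets its own port generator, so the per-pin listeners would not be expected to share a fixed port. That inference is not confirmed anywhere in the thread.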

Fortunately, there is an undocumented pins option, which appears to work for the most part.

    .use("mesh", {
        isbase: true,
        pins: [{
          users: 'list'
        }, {
          users: 'create'
        }],
        monitor: true
    });

However, if I use this configuration and try to call a service that does not exist, the mesh client crashes. For example:

// client.js
const seneca = Seneca().use('mesh');
seneca.act({ users: 'nada' }, (err, result) => {
       // do stuff
})

// server.js
const seneca = Seneca().add('usersPlugin').use('mesh', { isbase: true })

results in:

[1] {"err":{},"level":"warn","seneca":"gqzge20hstfa/1505946828319/2923/3.4.2/-","when":1505946845630}
[1] Debug: internal, implementation, error
[1]     TypeError: Uncaught error: Cannot convert undefined or null to object
[1]     at toString (<anonymous>)
[1]     at objectToString (internal/util.js:18:36)
[1]     at Object.isError (internal/util.js:14:10)
[1]     at Object.act_fn [as fn] (/Users/johnnieves/workspace/winwin/foundation-node-postgres/node_modules/seneca/seneca.js:910:25)
[1]     at Immediate.processor [as _onImmediate] (/Users/johnnieves/workspace/winwin/foundation-node-postgres/node_modules/gate-executor/gate-executor.js:136:14)
[1]     at runCallback (timers.js:781:20)
[1]     at tryOnImmediate (timers.js:743:5)
[1]     at processImmediate [as _immediateCallback] (timers.js:714:5)
[1] 170920/223405.617, [response,api,users] http://jniev.home:5000: post /users {} 500 (6259ms)
[1] Error: async hook stack has become corrupted (actual: 1367, expected: 351)
[1]  1:
[1] node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [/usr/local/bin/node]
[1]  2:
[1] node::(anonymous namespace)::TimerWrap::OnTimeout(uv_timer_s*) [/usr/local/bin/node]
[1]  3:
[1] uv__run_timers [/usr/local/bin/node]
[1]  4:
[1] uv_run [/usr/local/bin/node]
[1]  5:
[1] node::Start(v8::Isolate*, node::IsolateData*, int, char const* const*, int, char const* const*) [/usr/local/bin/node]
[1]  6:
[1] node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) [/usr/local/bin/node]
[1]  7:
[1] node::Start(int, char**) [/usr/local/bin/node]
[1]  8:
[1] start [/usr/local/bin/node]
[1] [nodemon] app crashed - waiting for file changes before starting...

Perhaps this is because I'm using this with Hapi/Chairo, but definitely did not expect this to take down my entire API server if a microservice is found to be unavailable.

@beverlycodes

This might belong as a separate issue, but it seems like a DEATH LOOP should be something that can be voluntarily handled by a service. My services do more than just Seneca call-and-response, and should be allowed to remain running even if seneca-mesh is having a bad time. I'd rather be able to periodically retry finding and joining the mesh than have to constantly restart my service on a DEATH LOOP.
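
A minimal sketch of the retry idea, assuming the join failure could be surfaced as a rejected promise. That is an assumption: in the stack trace at the top of this issue the failure goes through `Seneca.die`. The names `retryJoin` and `join` are hypothetical and not part of Seneca or seneca-mesh.

```javascript
'use strict'

// Retry an async operation with a fixed delay between attempts.
// `join` is a hypothetical function that returns a promise and rejects
// when the mesh cannot be joined.
async function retryJoin(join, { attempts = 5, delayMs = 1000 } = {}) {
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await join(attempt)
    } catch (err) {
      lastError = err
      if (attempt < attempts) {
        // Wait before the next attempt; the rest of the service keeps running.
        await new Promise((resolve) => setTimeout(resolve, delayMs))
      }
    }
  }
  throw lastError
}

// Usage with a stand-in join function that succeeds on the third attempt.
retryJoin(
  async (attempt) => {
    if (attempt < 3) throw new Error('mesh not reachable')
    return 'joined'
  },
  { attempts: 5, delayMs: 10 }
).then((result) => console.log(result))
```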

@blanchma

blanchma commented Dec 7, 2017

Agree with @RyanFields. DEATH LOOP should have its own error code, distinct from act_execute, and it should be handleable.

@cwilso03

cwilso03 commented Jan 25, 2018

Building off of @karn09's post above:

I was recently having this problem too in an app, and finally figured out that the seneca-mesh pins option only works correctly if you give it an array of jsonic strings, not objects. So, the following works (whether on base node or not):

.use("mesh", {
        isbase: true,
        pins: ["users:list", "users:create"],
    });

but the following does not:

.use("mesh", {
        isbase: true,
        pins: [{
          users: 'list'
        }, {
          users: 'create'
        }]
    });
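
If the pins already exist as objects, a small conversion step can produce the string form. This is a sketch: `pinToString` and `normalizePins` are hypothetical helpers that handle only flat pins with simple values, so values containing commas, colons or spaces would need quoting that is not covered here.

```javascript
'use strict'

// Convert a flat pin object such as { users: 'list' } into the
// string form 'users:list'. Strings are passed through unchanged.
function pinToString(pin) {
  if (typeof pin === 'string') {
    return pin
  }
  return Object.keys(pin)
    .map((key) => `${key}:${pin[key]}`)
    .join(',')
}

// Convert a whole pins array.
function normalizePins(pins) {
  return pins.map(pinToString)
}

// Usage: build the mesh options from object pins.
const meshOptions = {
  isbase: true,
  pins: normalizePins([{ users: 'list' }, { users: 'create' }])
}
console.log(meshOptions.pins)
```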

I put together a stripped down test project that uses seneca-web and seneca-mesh, with 4 stupid-simple microservices, two hosted on the base node and two hosted on a separate mesh node. The attached zip has the source and a short README.md that describes how to use it:

seneca-test.zip

@danielo515

For me the error vanishes if I write isbase in lower case. If I remove isbase, the error comes back, but that may be because I am running a single node without any other base around. In any case, I can confirm that it works when the pin is defined as a string and isbase is all lower case:

const initialSenecaConfig = {
  auto: true,
  isbase:true,
  listen: [
    { pin: "role:profile,command:*", model: "consume" }
  ],
  discover: {
    rediscover: true,
    custom: {
      active: true,
      find: dnsSeed
    }
  }
};

@danielo515

The EADDRINUSE error reported by @jeromevalentin is real. If you try to listen on several pins, using the format specified by the docs:

listen: [
  { pin: { role: 'stuff', cmd: 'a' } },
  { pin: { role: 'stuff', cmd: 'b' } }
]

then seneca-mesh tries to start several times, hence the EADDRINUSE error. Removing one of the "pins" from the list makes the error disappear, so it is not another process listening on that port; it is the very same process trying to listen several times on the same port.
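
The effect can be reproduced without Seneca. The sketch below assumes the port is bound over UDP, which the thread does not state, and uses only Node's built-in `dgram` module. The helpers `bindUdp` and `bindSamePortTwice` are hypothetical names for illustration.

```javascript
'use strict'
const dgram = require('dgram')

// Bind a UDP socket on the given port; resolve with the socket,
// or reject with the bind error.
function bindUdp(port) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4')
    socket.once('error', (err) => {
      try { socket.close() } catch (ignored) { /* socket was never bound */ }
      reject(err)
    })
    socket.bind(port, '127.0.0.1', () => resolve(socket))
  })
}

// Bind one socket, then try to bind a second one on the same port
// from the same process. Resolves with the error code of the second bind,
// or null if it unexpectedly succeeded.
async function bindSamePortTwice() {
  const first = await bindUdp(0) // port 0: let the OS pick a free port
  const port = first.address().port
  try {
    const second = await bindUdp(port)
    second.close()
    return null
  } catch (err) {
    return err.code
  } finally {
    first.close()
  }
}

bindSamePortTwice().then((code) => {
  console.log('second bind on the same port failed with:', code)
})
```

On a typical system the second bind should fail with EADDRINUSE even though no other process is involved, which would be consistent with an empty netstat result taken after the process has exited.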


8 participants