ch3: TCP ports are always bound to INADDR_ANY #6010
Comments
What is your use case where this is an issue?
Security port scans appear to occasionally crash an app using MPICH on our cluster.
Try this patch -- #5900 -- and see if it fixes the assertion error. That patch only prevents such assertion errors in hydra. Last time I checked, I didn't encounter the issue with ch3:nemesis, but I can see how a similar issue could exist in the netmod. Could you attach a crash log? The solution will just add some basic measures to prevent network port scans from interrupting the jobs. Will that be sufficient?
Unfortunately, we are getting assertions in ch3:nemesis much more often than in hydra. I made a simple hotfix by replacing the failing assert path with "*got_sc_eof = 1; goto fn_exit;"
OK, I'll investigate and see about adding some basic checks for ch3. Note that ch3 is a legacy device and will only receive minimal maintenance. If deploying a new MPICH is an option, we strongly recommend using the ch4 device.
Hehe,
Can you give me some pointers to "best" or standard practice, if MPI is the wrong layer?
I would suggest preventing access to your cluster from the external internet altogether. You can launch your jobs using a login node or launch node.
Accessibility from the external internet is the easy part. We would like to allow our (many) users to connect
I see. This is a good conversation. The next layer of security is to control the specific port range to be used. You can use
TCP ports are always bound to INADDR_ANY (open to the Internet),
even when the user asks for a specific interface or address (like localhost)
via MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE or MPIR_CVAR_CH3_INTERFACE_HOSTNAME.
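
To illustrate the difference between the two bind modes, here is a minimal sketch using plain POSIX sockets. It is not MPICH code: `listen_on` and `bound_address` are hypothetical helpers, and the sketch only assumes IPv4. Passing an address restricts the listener to that interface, while passing NULL reproduces the INADDR_ANY behaviour described above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of MPICH): create a TCP listening socket
 * bound to addr_str (dotted IPv4), or to INADDR_ANY when addr_str is NULL.
 * Port 0 lets the kernel pick a free port. Returns the fd, or -1 on error. */
int listen_on(const char *addr_str, unsigned short port)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);

    if (addr_str == NULL) {
        /* Reachable through every interface of the host. */
        sa.sin_addr.s_addr = htonl(INADDR_ANY);
    } else if (inet_pton(AF_INET, addr_str, &sa.sin_addr) != 1) {
        return -1; /* not a valid IPv4 address */
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report the IPv4 address a socket is bound to, in host byte order.
 * Returns 0 on success, -1 on error. */
int bound_address(int fd, uint32_t *out)
{
    assert(out != NULL); /* programmer error, not network input */
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);
    if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
        return -1;
    *out = ntohl(sa.sin_addr.s_addr);
    return 0;
}
```

With `listen_on("127.0.0.1", 0)` the socket only accepts connections arriving on the loopback interface; with `listen_on(NULL, 0)` it accepts them on any interface, which is what exposes the port to scanners.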
A connection attempt from any external entity can easily trigger an assert
(e.g. in recv_id_or_tmpvc_info()); there is absolutely no authentication
involved.
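
As a rough sketch of the alternative to asserting on data from an unknown peer, the snippet below validates a handshake header and rejects the connection when it does not look right. Everything here is an assumption made for illustration: the header layout, the `HS_*` constants, and `check_handshake` are hypothetical and do not reflect the actual ch3:nemesis wire format or the code in recv_id_or_tmpvc_info(). Fields are read in host byte order to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants, not taken from MPICH. */
enum {
    HS_MAGIC = 0x4d504948,
    HS_TYPE_ID_INFO = 1,
    HS_TYPE_TMPVC_INFO = 2,
    HS_MAX_DATALEN = 4096
};

/* Hypothetical fixed-size header sent first on a new connection. */
typedef struct {
    uint32_t magic;
    uint32_t type;
    uint32_t datalen;
} hs_header_t;

enum hs_result { HS_OK, HS_REJECT };

/* Validate the first bytes received from an untrusted peer.
 * Malformed input yields HS_REJECT so the caller can close the socket
 * and keep running, instead of aborting the whole process. */
enum hs_result check_handshake(const unsigned char *buf, size_t len,
                               hs_header_t *out)
{
    /* Asserts are kept for internal invariants only. */
    assert(buf != NULL && out != NULL);

    if (len < sizeof(hs_header_t))
        return HS_REJECT; /* short read, e.g. a port scanner */

    hs_header_t h;
    memcpy(&h, buf, sizeof(h)); /* avoids unaligned access */

    if (h.magic != HS_MAGIC)
        return HS_REJECT;
    if (h.type != HS_TYPE_ID_INFO && h.type != HS_TYPE_TMPVC_INFO)
        return HS_REJECT;
    if (h.datalen > HS_MAX_DATALEN)
        return HS_REJECT;

    *out = h;
    return HS_OK;
}
```

A caller would close the accepted socket on `HS_REJECT` and return to its event loop. Note that a check like this only filters accidental traffic such as port scans; it is not authentication.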