Operational troubleshooting


Frequent operational problems and errors.

All components

Extremely slow performance on framework components.

The usual cause is that the MySQL engine attempts a reverse DNS lookup for every external connection (especially with external MySQL servers) while the DNS servers are down or unable to answer reverse lookups.

To solve this issue, first check that appropriate primary and secondary DNS servers are listed in /etc/resolv.conf.

In addition, the MySQL engine can be configured to skip reverse lookups, which can also slightly improve performance for external MySQL queries. To do so, add the following to the [mysqld] section of /etc/mysql/my.cnf:

#Skip name resolve
skip-name-resolve

Remember that this change only takes effect after the MySQL engine is restarted.
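
To check the DNS side quickly, the sketch below lists the nameservers found in resolv.conf-style text and times a single reverse lookup. The helper names `parse_nameservers` and `time_reverse_lookup` are illustrative assumptions and are not part of OCF.

```python
import socket
import time


def parse_nameservers(resolv_conf_text):
    """Return the addresses found on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing '#' comments, then split the line into fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def time_reverse_lookup(ip_address):
    """Return (seconds_taken, hostname or None) for one reverse lookup."""
    start = time.time()
    try:
        hostname = socket.gethostbyaddr(ip_address)[0]
    except (socket.herror, socket.gaierror, socket.error):
        hostname = None
    return time.time() - start, hostname
```

For example, `parse_nameservers(open("/etc/resolv.conf").read())` should show a primary and a secondary server, and `time_reverse_lookup("<client_ip>")` taking several seconds suggests that reverse lookups are the bottleneck.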

Expedient

(v0.6) I get this error when creating a Spirent VM: 'Action create on VM *** failed: : HD type not yet supported by XEN agent'

The Spirent VM template is not yet distributed in this version. Once it is, this error will no longer appear.

Stalled VMs in Expedient and/or in VT AM.

Stalled VMs can be identified by an endless "loading" animation in the status cell.

Step 1

If the VM stalled after being created, open a terminal on the server where it was created and locate the folder that holds the physical files of the VM:

cd /opt/ofelia/oxa
find cache/vms/ remote/vms/ -name "<vm_name>.conf"

If there are different machines with the same name in different folders:
- open each VM configuration file and check the uuid (<vm_uuid>) of the VM:
```
vim <find_path_N>/<vm_name>.conf
```
- then look up the VM whose uuid is <vm_uuid> in the VT AM. Open a terminal on the OCF machine and type the following:
```
cd /opt/ofelia/vt_manager/src/python/vt_manager/
python manage.py shell
>>> from models import VirtualMachine
>>> VirtualMachine.objects.get(uuid="<vm_uuid>")
```
If the search returns a single folder, or you know the VM uuid and have checked that it matches the data inside the <vm_name>.conf file, remove the 3 physical files for the VM:
```
cd <find_path>
rm <vm_name>.conf <vm_name>.img <vm_name>_swap.img
```

Step 2

If you find VMs that are stalled in the VT AM, get the ID of the VM and delete it:

Go to the server page and scroll down to the VM list. Right-click on the VM name and select "Inspect Element".
A frame with HTML code will appear. Write down the number in the id attribute, which will look similar to id="tr_vm1299" (the ID is 1299 in this example).

Open a terminal on the OCF machine and type the following:

cd /opt/ofelia/vt_manager/src/python/vt_manager/tests/
python deleteVM.py <vm_id>
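
When several stalled VMs have to be cleaned up, copying the number out of the inspected HTML by hand is error-prone. The following minimal sketch extracts it from a pasted fragment, assuming the row id always has the form `tr_vm<number>` as in the example above. The function name is hypothetical.

```python
import re


def vm_id_from_row_id(html_fragment):
    """Return the number from an id="tr_vm<number>" attribute, or None."""
    match = re.search(r'id="tr_vm(\d+)"', html_fragment)
    return int(match.group(1)) if match else None
```

The returned value is what would be passed as `<vm_id>` to deleteVM.py.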

Step 3

If you find VMs that are stalled in Expedient, get the ID of the VM cached in Expedient and delete it:

Go to the slice detail page and scroll down to the VM list. Right-click on the VM name and select "Inspect Element".
A frame with HTML code will appear. Write down the number in the id attribute, which will look similar to id="tr_vm1299" (the ID is 1299 in this example).

Open a terminal on the OCF machine and type the following:

cd /opt/ofelia/expedient/src/python/vt_plugin/tests/             # OCF < 0.5
cd /opt/ofelia/expedient/src/python/plugins/vt_plugin/tests/     # OCF = 0.5
python deleteVM.py <vm_id>

Step 4

After deleting the VM, free the addresses associated with its interface(s).

Look for the ranges sections within the VT AM GUI and write down each IP and MAC address related to the VM you just deleted.

Open a terminal on the OCF machine and type the following:

cd /opt/ofelia/vt_manager/src/python/vt_manager/
python manage.py shell
>>> from models import Ip4Slot
>>> Ip4Slot.objects.get(ip="<vm_ip>").delete()
>>> from models import MacSlot
>>> MacSlot.objects.get(mac="<vm_mac_i>").delete() # Repeat N times (N = #MACs(VM))

(v0.5) I have just installed OCF and see that the error log shows that some tables do not exist.

If you got an error similar to this:

DatabaseError: (1146, "Table 'expedient.vt_plugin_xmlrpcserverproxy' doesn't exist")

it may be that the models from the plugins are not being properly synchronized when the manage.py syncdb command is used during the installation.

To solve it, type the following:

# uncomment lines no. 182, 188, 189, 190, that is:
# ('openflow.plugin', 'vt_plugin', 'vt_plugin.communication', 'openflow.dummyom')
vim /opt/ofelia/expedient/src/python/expedient/clearinghouse/defaultsettings/django.py
service apache2 restart
python manage.py syncdb

(v0.3) I get this error before getting a fatal error: "AttributeError: 'module' object has no attribute 'XMLField'"

The XMLField class in Django has been deprecated as of version 1.3. Please install Django 1.2.7 as follows:

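# Import the signing key
gpg --keyserver pgp.mit.edu --recv-key 0x8C8B2AE1
# Verify the signed checksum file (it must already be in the current directory)
gpg --verify Django-1.2.7.checksum.txt
# Download, unpack and install Django 1.2.7
wget http://www.djangoproject.com/m/releases/1.2/Django-1.2.7.tar.gz
tar xzvf Django-1.2.7.tar.gz
cd Django-1.2.7/
python setup.py install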

More info at https://docs.djangoproject.com/en/dev/topics/install/

(v0.3) When trying to install I get the following: 'ImportError: No module named pypelib.Ruletable'

If this is the first time you install OCF, you probably do not have the pyPElib library. To fix this, run the following in a shell:

/usr/bin/apt-get -y install python-pyparsing
/usr/bin/wget http://pypelib.googlecode.com/files/pypelib_latest_all.deb
/usr/bin/dpkg -i pypelib_latest_all.deb
rm pypelib_latest_all.deb

I try to add an OpenFlow Aggregate Manager and get this error: 'user X is not a clearinghouse user'.

That means that user 'X' has not been set up in the clearinghouse. Please see the "Configuring connection with Expedient" section in ofam-configuration.

python-pyparsing dependency not showing

To resolve the missing python-pyparsing dependency, install the library with apt-get install python-pyparsing before running the OFVER script, and make sure that your default Python path points to python-2.6.

Optin Manager

(v0.3) I get this error: 'django.core.exceptions.ImproperlyConfigured: settings.DATABASES is improperly configured. Please supply the ENGINE value. Check settings documentation for more details.'

Check that the configuration file at optin_manager/src/python/openflow/optin_manager/localsettings.py is properly configured. If it already is, your Python version may not have access to the pypelib module. In that case, create a symbolic link to the pypelib library from the folder of your current version (e.g. 2.7) to the 2.6 one:

ln -s /usr/lib/python2.6/pypelib/ /usr/lib/python2.7/pypelib

Requests from the Web UI do not appear as `Requests`. Where are they?

Requests made through the Expedient plugin are not listed under Requests. They can be found in the menu Administrate Flowspace->Add rule of the Optin Manager Web UI.

VT AM

(v0.3) I get a 'SyntaxError' in a settings file

This is due to an uncommented line in the file vt_manager/src/python/vt_manager/mySettings.py. Please make sure the following line is commented out:

ES_DIR:  [networking, policyEngine, users, ...] in SRC_DIR/python/vt_manager/views/templates/theme_name as needed.

When trying to access the GUI I find this in the Apache VM AM's log: 'ImportError: No module named pypelib.persistence.backends.django'

This happens because the pyPElib library is not installed for your default Python version. To correct this, check your Python version with python -V and then create a symbolic link from that version's library folder to the pyPElib library. For example, if you use Python 2.7 and pyPElib is installed in Python 2.6's folder:

ln -s /usr/lib/python2.6/pypelib/ /usr/lib/python2.7/pypelib
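
To confirm that the link had the intended effect, the default interpreter can be asked whether it is able to import the module. This is a minimal sketch; the helper name `module_available` is an assumption made for illustration.

```python
def module_available(name):
    """Return True if the running interpreter can import the named module."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True
```

Running `module_available("pypelib")` with the same interpreter that Apache uses should return True once the symbolic link is in place.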

Error 111: Connection refused

If your VMs are not being created and your VT AM log shows this:

XMLRPC Client error: can't connect to method send at https://***:9229/ [Errno 111] Connection refused

then make sure that the agent is up and running on the server where you are trying to create your VM:

ps aux | grep "OfeliaAgent" | grep -v "grep"

and if it is not, start it with

service oxad start
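
Once the agent is running, it can help to verify from the VT AM machine that the agent port accepts TCP connections. The sketch below assumes port 9229, taken from the log line above. The function name is hypothetical.

```python
import socket


def agent_port_reachable(host, port=9229, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        connection = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    connection.close()
    return True
```

A False result with the agent running points to a firewall or routing problem rather than to the agent itself.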

VMs appear in the VM AM GUI and are being created, but their state in the frontend is not updated

The communication between VM AM and Expedient (and also between VM AM and the agent) is fully asynchronous. Make sure that no firewall rules block traffic between these three components and that the VTAM_IP and VTAM_PORT settings in the mySettings.py file are correctly set.

Have a look at the manuals for more details on configuration.

OXA (Ofelia XEN Agent) and XEN server

Error 39: Directory not empty

If users experience this error during VM creation:

Action create on VM test failed: : [Errno 39] Directory not empty: '/tmp/oxa/hdtest_3382/'

it is most likely due to a wrong server configuration, especially in the /etc/modules file. Please review the XEN installation manual and note that the loop module must have max_loop=64 (default value) or higher.

root@node04:~# cat /etc/modules 
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

loop max_loop=64
8021q

Remember that changes in /etc/modules will not take effect until you reboot the system.
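
The max_loop setting can also be checked programmatically. The sketch below reads /etc/modules-style text and returns the value set on the loop line. The function name is an illustrative assumption.

```python
def loop_max_from_modules(modules_text):
    """Return the max_loop value set on the 'loop' line, or None if absent."""
    for line in modules_text.splitlines():
        # Ignore comments, then split into module name and parameters.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == "loop":
            for option in fields[1:]:
                name, _, value = option.partition("=")
                if name == "max_loop" and value.isdigit():
                    return int(value)
    return None
```

With the file shown above, `loop_max_from_modules(open("/etc/modules").read())` returns 64. A result of None means the loop module is listed without max_loop, or not listed at all.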

Cannot create VMs on server

If Expedient shows something similar to:

Action create on VM <X> failed: : Could not clone image to working directory project:<Y>, slice:<Z>, name:<X>

and the agent error log shows something similar to:

Error: Device 51713 (vbd) could not be connected. Failed to find an unused loop device

then all available loop devices (/dev/loop* files) are already in use.

You may choose between:

- Stopping VMs in Xen that are running but no longer needed, to free their loop devices.
- Increasing the max_loop value (default=64) to allow more VMs to run at a time. Refer to the "Error 39: Directory not empty" section above.

Error 113: No route to host

If your OXA log (/opt/ofelia/oxa/log/error.log) complains about a host it cannot route to:

error: [Errno 113] No route to host

then check that the variables VTAM_IP, VTAM_PORT, XMLRPC_USER and XMLRPC_PASS in the file /opt/ofelia/oxa/repository/vt_manager/src/python/agent/mySettings.py are correct. You can test them by pinging the VT AM from the OXA machine:

~# python
>>> import xmlrpclib
>>> server = xmlrpclib.Server("https://<XMLRPC_USER>:<XMLRPC_PASS>@<VTAM_IP>:<VTAM_PORT>/xmlrpc/plugin")
>>> server.ping("test")
'test'

XEN server networking: interconnection between two VMs in the same server through data-path is possible, without any OF rule.

Although not desired, this is a known limitation of the current XEN server network configuration. It is due to the fact that the XEN bridges (one per physical interface) are shared among the VMs and are normal Linux bridges (learning switches).

There are plans to deploy Open vSwitch to do some L2 filtering and enable OF in those bridges, as well as to prevent spoofing, but this is still under discussion within OFELIA.

Error: shutdown() takes exactly 0 arguments (1 given)

If you detect an error similar to TypeError: shutdown() takes exactly 0 arguments (1 given), you are probably using Python 2.7. You may need to work around this known bug in the werkzeug library. To do so, open the file /usr/lib/python2.7/SocketServer.py and add the two marked lines to the shutdown_request method of the TCPServer class (around line 465):

try:
  request.shutdown(socket.SHUT_WR)
except socket.error:
  pass
except TypeError: # << add this
  request.shutdown() # << add this

Miscellaneous

(v0.8) Apache2 processes block host machine

The host machine may have to process large bursts of data sent as part of the periodic monitoring data exchange between islands. As a direct effect, Apache2 processes produce high peaks of CPU and virtual memory consumption. Limiting the virtual memory available to them solves this problem.

To do so, edit /etc/default/apache2 and add the following:

# Set maximum virtual memory to your preferred size (in kilobytes, i.e. 1024-byte units)
# We recommend around 50% of your host's memory (e.g. 1048576 KB = 1 GB)
ulimit -v 1048576
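
The value for a particular host can be derived from /proc/meminfo. Common shells such as bash and dash interpret the argument of ulimit -v in 1024-byte units, which is the same unit /proc/meminfo uses for MemTotal. The following is a minimal sketch with a hypothetical helper name.

```python
def half_of_memtotal_kib(meminfo_text):
    """Return 50% of MemTotal in KiB, the unit that 'ulimit -v' expects."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:  <number> kB"
            return int(line.split()[1]) // 2
    raise ValueError("MemTotal not found")
```

For a host reporting a MemTotal of 2097152 kB, `half_of_memtotal_kib(open("/proc/meminfo").read())` gives 1048576, the value used above.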

If this did not work, use xm destroy and then xm create <vm_config_file> on the affected machine.
