Error to deploy spaceflights tutorial in AWS Steps Function #1006

Closed
andrea-cadeddu opened this issue Oct 29, 2021 · 15 comments · Fixed by #2045
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed

Comments

@andrea-cadeddu

Description

I tried to follow the tutorial to deploy the spaceflights template project on AWS Step Functions (https://antonymilneqb.github.io/kedro-docs/10_deployment/10_aws_step_functions.html). I followed the steps written in the tutorial, but I get an error when I try to run the service.

Context

Steps to Reproduce

  1. python -m venv venv && source venv/bin/activate
  2. pip install kedro==0.17.1 && kedro info
  3. kedro new --starter=spaceflights
  4. cd space (the project folder)
  5. create a new configuration environment with a compatible data catalog (conf/aws/catalog.yml) that uses the AWS file paths
  6. pip install -r src/requirements.txt
  7. kedro package
  8. create lambda_handler.py by copying the code from the tutorial, changing line 7 to configure_project("space")
  9. create the Dockerfile by copying the code from the tutorial
  10. push the project to ECR
  11. pip install -r deploy_requirements.txt
  12. create deploy.py by copying the code from the tutorial, changing line 31 to s3_data_bucket_name = ("<bucket-name>") with my bucket name
  13. create cdk.json by copying the code from the tutorial, then run cdk deploy
  14. go to AWS and try to run the step function, which is deployed but throws an error

Expected Result

The service step function starts the execution correctly.

Actual Result

The step function throws an error in the first step; it seems that the Lambda can't find the implemented functions.

-- Error Received.

{
  "errorMessage": "[Errno 38] Function not implemented",
  "errorType": "OSError",
  "requestId": "xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "stackTrace": [
    "  File \"/home/app/lambda_handler.py\", line 7, in handler\n    configure_project(\"space\")\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 211, in configure_project\n    settings.configure(settings_module)\n",
    "  File \"/home/app/dynaconf/base.py\", line 223, in configure\n    self._wrapped = Settings(settings_module=settings_module, **kwargs)\n",
    "  File \"/home/app/dynaconf/base.py\", line 271, in __init__\n    self.validators.validate()\n",
    "  File \"/home/app/dynaconf/validator.py\", line 318, in validate\n    validator.validate(self.settings)\n",
    "  File \"/home/app/dynaconf/validator.py\", line 172, in validate\n    self._validate_items(settings, settings.current_env)\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 58, in _validate_items\n    super()._validate_items(settings, env)\n",
    "  File \"/home/app/dynaconf/validator.py\", line 183, in _validate_items\n    self.default(settings, self)\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 49, in validator_func\n    return getattr(importlib.import_module(module), class_name)\n",
    "  File \"/usr/local/lib/python3.8/importlib/__init__.py\", line 127, in import_module\n    return _bootstrap._gcd_import(name[level:], package, level)\n",
    "  File \"<frozen importlib._bootstrap>\", line 1014, in _gcd_import\n",
    "  File \"<frozen importlib._bootstrap>\", line 991, in _find_and_load\n",
    "  File \"<frozen importlib._bootstrap>\", line 961, in _find_and_load_unlocked\n",
    "  File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "  File \"<frozen importlib._bootstrap>\", line 1014, in _gcd_import\n",
    "  File \"<frozen importlib._bootstrap>\", line 991, in _find_and_load\n",
    "  File \"<frozen importlib._bootstrap>\", line 975, in _find_and_load_unlocked\n",
    "  File \"<frozen importlib._bootstrap>\", line 671, in _load_unlocked\n",
    "  File \"<frozen importlib._bootstrap_external>\", line 843, in exec_module\n",
    "  File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "  File \"/home/app/kedro/framework/session/__init__.py\", line 32, in <module>\n    from .session import KedroSession, get_current_session\n",
    "  File \"/home/app/kedro/framework/session/session.py\", line 54, in <module>\n    from kedro.framework.session.store import BaseSessionStore\n",
    "  File \"/home/app/kedro/framework/session/store.py\", line 73, in <module>\n    class ShelveStore(BaseSessionStore):\n",
    "  File \"/home/app/kedro/framework/session/store.py\", line 76, in ShelveStore\n    _lock = Lock()\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/context.py\", line 68, in Lock\n    return Lock(ctx=self.get_context())\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/synchronize.py\", line 162, in __init__\n    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/synchronize.py\", line 57, in __init__\n    sl = self._semlock = _multiprocessing.SemLock(\n"
  ]
}

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

  • Kedro version used: 0.17.1
  • Python version used: 3.8.10
  • Operating system and version: Linux Ubuntu 20.04 LTS
  • cdk version used: 1.128.0
  • npm version used: 6.14.15
@andrea-cadeddu andrea-cadeddu added the Issue: Bug Report 🐞 Bug that needs to be fixed label Oct 29, 2021
@andrea-cadeddu andrea-cadeddu changed the title <Title> Error to deploy spaceflights in AWS Steps Function Oct 29, 2021
@andrea-cadeddu andrea-cadeddu changed the title Error to deploy spaceflights in AWS Steps Function Error to deploy spaceflights tutorial in AWS Steps Function Oct 29, 2021
@makennedy626

I have been experiencing this same issue.

@antonymilne
Contributor

Hi @andrea-cadeddu, thanks for letting us know - we're looking into this for you. Unrelated to the issue, but I'm curious how you found the docs at that URL? Those are docs I had uploaded to my personal page just to try something out, so I'm a bit surprised you ended up there rather than at the official docs 😅

@limdauto
Contributor

limdauto commented Nov 1, 2021

@andrea-cadeddu @makennedy626 Hi both, thanks for the report. This doc is a bit outdated and needs updating. I will fix it. In the meantime, I'd recommend using AWS Batch or just a plain Docker container to run your pipeline on AWS: https://kedro.readthedocs.io/en/latest/10_deployment/07_aws_batch.html

@datajoely datajoely assigned datajoely and limdauto and unassigned datajoely Nov 2, 2021
@stale

stale bot commented Jan 1, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 1, 2022
@stale stale bot closed this as completed Jan 8, 2022
@jccalvojackson
Contributor

Same error here, even though the with patch("multiprocessing.Lock"): block in lambda_handler.py is supposed to handle this error.
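
One possible explanation, going by the tracebacks posted in this thread: the failing import is triggered by configure_project(...) on line 7 of lambda_handler.py, which would run before the with patch("multiprocessing.Lock"): block is entered. The sketch below is a self-contained illustration of that ordering problem, not Kedro code. The names fake_lambda_lock, import_session_store, handler_patch_too_late, handler_patch_first and demo are hypothetical stand-ins, and the outer patch only simulates the Lambda failure locally.

```python
import multiprocessing
from unittest.mock import MagicMock, patch


def fake_lambda_lock(*args, **kwargs):
    # Hypothetical stand-in for the failure reported on AWS Lambda.
    raise OSError(38, "Function not implemented")


def import_session_store():
    # Stand-in for importing a module whose class body calls Lock(),
    # so the call happens when the class is defined (at import time).
    class ShelveStoreLike:
        _lock = multiprocessing.Lock()

    return ShelveStoreLike


def handler_patch_too_late():
    # Mirrors the order seen in the tracebacks: the import is triggered first...
    import_session_store()
    # ...so this patch is never reached.
    with patch("multiprocessing.Lock"):
        return import_session_store()


def handler_patch_first():
    # The mock is already in place when the class body runs.
    with patch("multiprocessing.Lock"):
        return import_session_store()


def demo():
    # Simulate the Lambda environment locally, then compare both orders.
    with patch("multiprocessing.Lock", new=fake_lambda_lock):
        try:
            handler_patch_too_late()
            late_error = None
        except OSError as exc:
            late_error = str(exc)
        store = handler_patch_first()
    return late_error, store
```

If this is indeed the cause, calling configure_project(...) inside the patch block in lambda_handler.py might avoid the error, though that has not been verified in this thread.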

@jccalvojackson
Contributor

also, the tutorial should be updated to use cdk v2.

@datajoely
Contributor

I think we need to reopen this one

@datajoely datajoely reopened this Jun 27, 2022
@antonymilne antonymilne removed the stale label Jun 27, 2022
@antonymilne
Contributor

antonymilne commented Jun 27, 2022

@jccalvojackson: please could you give the full traceback for this? Thank you!

In more detail, this is why we should update to use CDK v2:

The CDK has been released in two major versions, v1 and v2. This is the Developer Guide for AWS CDK v1.
CDK v1 entered maintenance on June 1, 2022. During the maintenance phase, CDK v1 will receive critical bug fixes and security patches only. New features will be developed exclusively for CDK v2 during the v1 maintenance phase. On June 1, 2023, support will end for AWS CDK v1. For more details, see AWS CDK Maintenance Policy.

@AlbertoGarau

AlbertoGarau commented Sep 23, 2022

I tried to follow the tutorial but ran into the same problem described earlier by @andrea-cadeddu.
Thank you!

{
  "errorMessage": "[Errno 38] Function not implemented",
  "errorType": "OSError",
  "requestId": "aeb69066-812d-4e7c-9877-8c45c7809074",
  "stackTrace": [
    "  File \"/home/app/lambda_handler.py\", line 7, in handler\n    configure_project(\"spaceflights_steps_function\")\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 243, in configure_project\n    settings.configure(settings_module)\n",
    "  File \"/home/app/dynaconf/base.py\", line 193, in configure\n    self._wrapped = Settings(settings_module=settings_module, **kwargs)\n",
    "  File \"/home/app/dynaconf/base.py\", line 256, in __init__\n    self.validators.validate(\n",
    "  File \"/home/app/dynaconf/validator.py\", line 461, in validate\n    validator.validate(\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 46, in validate\n    super().validate(settings, *args, **kwargs)\n",
    "  File \"/home/app/dynaconf/validator.py\", line 213, in validate\n    self._validate_items(\n",
    "  File \"/home/app/dynaconf/validator.py\", line 242, in _validate_items\n    self.default(settings, self)\n",
    "  File \"/home/app/kedro/framework/project/__init__.py\", line 37, in validator_func\n    return getattr(importlib.import_module(module), class_name)\n",
    "  File \"/usr/local/lib/python3.8/importlib/__init__.py\", line 127, in import_module\n    return _bootstrap._gcd_import(name[level:], package, level)\n",
    "  File \"<frozen importlib._bootstrap>\", line 1014, in _gcd_import\n",
    "  File \"<frozen importlib._bootstrap>\", line 991, in _find_and_load\n",
    "  File \"<frozen importlib._bootstrap>\", line 961, in _find_and_load_unlocked\n",
    "  File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "  File \"<frozen importlib._bootstrap>\", line 1014, in _gcd_import\n",
    "  File \"<frozen importlib._bootstrap>\", line 991, in _find_and_load\n",
    "  File \"<frozen importlib._bootstrap>\", line 975, in _find_and_load_unlocked\n",
    "  File \"<frozen importlib._bootstrap>\", line 671, in _load_unlocked\n",
    "  File \"<frozen importlib._bootstrap_external>\", line 843, in exec_module\n",
    "  File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n",
    "  File \"/home/app/kedro/framework/session/__init__.py\", line 4, in <module>\n    from .session import KedroSession\n",
    "  File \"/home/app/kedro/framework/session/session.py\", line 26, in <module>\n    from kedro.framework.session.store import BaseSessionStore\n",
    "  File \"/home/app/kedro/framework/session/store.py\", line 46, in <module>\n    class ShelveStore(BaseSessionStore):\n",
    "  File \"/home/app/kedro/framework/session/store.py\", line 49, in ShelveStore\n    _lock = Lock()\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/context.py\", line 68, in Lock\n    return Lock(ctx=self.get_context())\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/synchronize.py\", line 162, in __init__\n    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)\n",
    "  File \"/usr/local/lib/python3.8/multiprocessing/synchronize.py\", line 57, in __init__\n    sl = self._semlock = _multiprocessing.SemLock(\n"
  ]
}

@merelcht
Member

merelcht commented Oct 3, 2022

Hi @AlbertoGarau, thanks for flagging! Could you specify which versions of the following you are using:

  • Kedro
  • Python
  • Operating system and version
  • cdk

@AlbertoGarau

Hello @MerelTheisenQB
My tool versions:

  • kedro, version 0.18.3
  • Python 3.10.6
  • cdk 2.43.1 (build c1ebb85)
  • Operating system: Ubuntu 22.04.1 LTS (release 22.04, codename jammy)

@suryansh2799

I have been facing the same issue recently. Please let us know of a fix or a workaround. Thank you in advance :)

@merelcht
Member

  • Rewrite the docs
  • Write up clearly which version of Kedro, AWS Step Function, Python and other relevant libraries were used when the guide was written

@ankatiyar ankatiyar self-assigned this Nov 3, 2022
@ankatiyar
Contributor

This issue comes from the import of ShelveStore, which uses multiprocessing.Lock, and that isn't supported on AWS Lambda functions. ShelveStore has been moved out to its own module in #1614 and will therefore no longer be imported, which will resolve this issue in the upcoming release of Kedro 0.18.4.
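
To check whether a given runtime is affected, a small probe along these lines may help. It is only a sketch: semlock_available is a hypothetical helper name, and it simply attempts the same multiprocessing.Lock() call that appears at the bottom of the tracebacks above.

```python
import multiprocessing


def semlock_available():
    """Return True if multiprocessing.Lock() can be created in this runtime."""
    try:
        multiprocessing.Lock()
    except (OSError, ImportError):
        # OSError [Errno 38] is what the tracebacks in this thread show;
        # ImportError covers hosts without a functioning shared semaphore
        # implementation, where multiprocessing.synchronize cannot be imported.
        return False
    return True


if __name__ == "__main__":
    print("multiprocessing.Lock usable:", semlock_available())
```

Running it inside the same container image that the Lambda function uses should show whether importing a module that creates a Lock at import time is likely to fail there.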

@ankatiyar
Contributor

ankatiyar commented Nov 29, 2022

Possible workaround in the meantime:

  • Use the latest version of Kedro directly from GitHub to create your project. To do this, run pip install git+https://github.com/kedro-org/kedro@main.
  • In the src/requirements.txt in your project, replace kedro~=0.18.3 with kedro @ git+https://github.com/kedro-org/kedro@main. Also add @ git+https://github.com/kedro-org/kedro@main to whatever DataSets are in the requirements. (For the spaceflights tutorial this would be kedro[pandas.CSVDataSet, pandas.ExcelDataSet, pandas.ParquetDataSet] @ git+https://github.com/kedro-org/kedro@main)
    This uses the latest Kedro code when building the Docker image.
  • I also removed the kedro-telemetry requirement from requirements.txt because it was throwing a runtime error during the execution of the AWS Step Functions state machine.
    Should work after this! 🎉
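
For anyone who prefers to script the requirements change, the edits above can be sketched roughly as follows. This is an illustration under assumptions rather than a tested migration tool: rewrite_requirements and KEDRO_GIT are hypothetical names, and the function assumes each requirement sits on its own line in the usual kedro~=X or kedro[extras]~=X form.

```python
import re

# Git reference taken from the workaround above.
KEDRO_GIT = "git+https://github.com/kedro-org/kedro@main"

# Matches "kedro" or "kedro[extras]" with an optional version specifier,
# but not other packages such as "kedro-viz".
KEDRO_LINE = re.compile(r"^kedro(\[[^\]]*\])?\s*(?:[~=<>!]=?.*)?$")


def rewrite_requirements(lines):
    """Point kedro requirements at the Git main branch and drop kedro-telemetry."""
    rewritten = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("kedro-telemetry"):
            # Removed, as described in the last bullet of the workaround.
            continue
        match = KEDRO_LINE.match(stripped)
        if match:
            extras = match.group(1) or ""
            rewritten.append(f"kedro{extras} @ {KEDRO_GIT}")
        else:
            rewritten.append(stripped)
    return rewritten
```

For example, passing the lines of src/requirements.txt through rewrite_requirements and writing the result back leaves unrelated packages untouched while replacing the pinned kedro entries.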
