Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure: Update to Azure config [WIP] #406

Merged
merged 4 commits into from
Nov 1, 2022
Merged

Conversation

adamrtalbot
Copy link
Contributor

@adamrtalbot adamrtalbot commented Aug 24, 2022

Additions:

  • More Azure specific features included in NF-Core config

Relates to #401


name: Azure
about: Add more features to Azure config

Please follow these steps before submitting your PR:

  • If your PR is a work in progress, include [WIP] in its title
  • Your PR targets the master branch
  • You've included links to relevant issues, if any

Steps for adding a new config profile:

  • Add your custom config file to the conf/ directory
  • Add your documentation file to the docs/ directory
  • Add your custom profile to the nfcore_custom.config file in the top-level directory
  • Add your custom profile to the README.md file in the top-level directory
  • Add your profile name to the profile: scope in .github/workflows/main.yml

Additions:
 - More Azure specific features included in NF-Core config

Relates to #401
conf/azurebatch.config Outdated Show resolved Hide resolved
storage {
accountName = params.storage_name
accountKey = params.storage_key
sasToken = params.storage_key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be params.storage_sas

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explains why it wasn't working...


The Azure Blob Storage shared access signature token.
VM size to use with Nextflow autopool or when creating a worker pool in Azure Batch. Make sure your Azure account has sufficient quota. Defaults to `Standard_D8s_v3`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@vsmalladi vsmalladi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good extension and provides more options to do vm_types. A couple of changes and additions.

Copy link
Contributor

@vsmalladi vsmalladi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jfy133 jfy133 self-requested a review August 30, 2022 09:33
@jfy133
Copy link
Member

jfy133 commented Aug 30, 2022

Check if nextflow config runs in repository root tests can be ignored, it's a bug in NXF

@adamrtalbot
Copy link
Contributor Author

@vsmalladi I haven't had a chance to check this. I'm on leave next week but when I get a chance I will give it a go. If it has everything we need and doesn't break anything we should smash that merge button.

@adamrtalbot
Copy link
Contributor Author

adamrtalbot commented Oct 25, 2022

@vsmalladi I'm finding this raises an error with the autoscaling formula. Works fine if autoscalling is turned off. Have you found this?

Caused by:
  Status code 400, {
  "odata.metadata":"REDACTED/$metadata#Microsoft.Azure.Batch.Protocol.Entities.Container.errors/@Element","code":"AutoScalingFormulaEvaluationError","message":{
    "lang":"en-US","value":"The specified auto-scaling formula has evaluation error\nRequestId:e89088e4-6879-4f4c-9d49-f413864b358b\nTime:2022-10-25T15:44:23.3590143Z"
  },"values":[
    {
      "key":"Message","value":"Line 3, Col 25: Argument 1 is invalid"
    },{
      "key":"Result","value":"$TargetDedicatedNodes=0;$TargetLowPriorityNodes=0;$NodeDeallocationOption=requeue"
    }
  ]
}

@vsmalladi
Copy link
Contributor

@adamrtalbot That is odd. I have not found that issue. Can you see your quota for Dedicated nodes in Dv3 family?

@adamrtalbot
Copy link
Contributor Author

Looks like plenty of nodes. If I turn off autoscaling it's fine. If I add a custom autoscaling formula to the pool it's fine:

pools {
          auto {
              vmType       = params.vm_type
              autoScale    = true
              vmCount      = 1
              maxVmCount   = 12
              scaleFormula = '''
                  $TargetLowPriorityNodes = 1;
                  $TargetDedicatedNodes   = 0;
                  $NodeDeallocationOption = taskcompletion;
              '''
          }
      }

Is the default autoscaling formula in Nextflow broken?

@vsmalladi
Copy link
Contributor

@adamrtalbot What version of nextflow, and pipeline are you testing.

I can give it a try on my end.

@adamrtalbot
Copy link
Contributor Author

nextflow -version

      N E X T F L O W
      version 22.04.5 build 5708
      created 15-07-2022 16:09 UTC 
      cite doi:10.1038/nbt.3820
      http://nextflow.io
java -version

openjdk version "17.0.4.1" 2022-08-12
OpenJDK Runtime Environment Temurin-17.0.4.1+1 (build 17.0.4.1+1)
OpenJDK 64-Bit Server VM Temurin-17.0.4.1+1 (build 17.0.4.1+1, mixed mode, sharing)

It fails with every pipeline, for example:

nextflow run \
    nf-core/sarek \
    -profile test \
    -c ~/configs/conf/azurebatch.config \
    -ansi-log false \
    -w az://work/ \
    --az_location $AZURE_BATCH_LOCATION \
    --batch_name $AZURE_BATCH_ACCOUNT_NAME \
    --batch_key $AZURE_BATCH_ACCOUNT_KEY \
    --storage_name $AZURE_STORAGE_ACCOUNT_NAME \
    --storage_key $AZURE_STORAGE_ACCOUNT_KEY

@vsmalladi
Copy link
Contributor

@adamrtalbot I just tested this config and had no issues running it as is.

  N E X T F L O W
  version 22.10.0 build 5826
  created 13-10-2022 05:44 UTC (00:44 CDT)
  cite doi:10.1038/nbt.3820
  http://nextflow.io

The scaling is targeting low priority nodes, by default the autoscale would use dedicated nodes. So I wonder if you look at quotas in your batch account under Total dedicated vCPUs and Dv3 Series section to see if its 0. If its 0 then I think that would be the issue.

@adamrtalbot
Copy link
Contributor Author

@vsmalladi if you had no issues, shall we just go for it?

@vsmalladi
Copy link
Contributor

@adamrtalbot ya lets merge and see if anyone brings up any issues that we can solve.

@adamrtalbot adamrtalbot merged commit 028ce1e into master Nov 1, 2022
@adamrtalbot adamrtalbot deleted the azure_config_update branch November 1, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants