Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sf data query resume: cannot find resource for bulk query results in cli #2968

Closed
deployment-ian opened this issue Jul 24, 2024 · 15 comments
Closed
Labels
area:jsforce bug Issue or pull request that identifies or fixes a bug validated Version information for this issue has been validated

Comments

@deployment-ian
Copy link

deployment-ian commented Jul 24, 2024

Summary

I am using @salesforce/cli/2.50.6 win32-x64 node-v20.15.0

I am trying to extract data using a bulk query. This is a significant amount of data in this query request:

C:\Users\{user}\Desktop>sf data query --query "{query}" --target-org {org} --bulk --async
Query ID: {job_id}
Query is in progress.

Run "data query resume -i {job_id} -o {org}" to get the latest status and results.

Total number of records retrieved: 0.
Querying Data... done

I then checked the status of the job:

C:\Users\{user}\Desktop>sf data resume -i {job_id} -o {org}

=== Job Status

id:                      {job_id}
operation:               query
object:                  Account
createdById:             {user_id}
createdDate:             2024-07-24T12:34:14.000Z
systemModstamp:          2024-07-24T12:43:00.000Z
state:                   JobComplete
concurrencyMode:         Parallel
contentType:             CSV
numberBatchesQueued:     0
numberBatchesInProgress: 0
numberBatchesCompleted:  92
numberBatchesFailed:     0
numberBatchesTotal:      92
numberRecordsProcessed:  21165218
numberRetries:           0
apiVersion:              61.0
numberRecordsFailed:     0
totalProcessingTime:     2304433
apiActiveProcessingTime: 2304433
apexProcessingTime:      0
Getting Status... done

Great! It ran successfully. But when I try to extract the data, I receive this information that the resource does not exist:

C:\Users\{user}\Desktop>sf data query resume -i {job_id} -o {org} -r csv  > test.csv
Error (1): The requested resource does not exist

Where did the resource go? How do I extract these batch results into a csv?

System Information

{
  "architecture": "win32-x64",
  "cliVersion": "@salesforce/cli/2.50.6",
  "nodeVersion": "node-v20.15.0",
  "osVersion": "Windows_NT 10.0.19045",
  "rootPath": "C:\\Users\\{user}\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7",
  "shell": "cmd.exe",
  "pluginVersions": [
    "@oclif/plugin-autocomplete 3.1.6 (core)",
    "@oclif/plugin-commands 4.0.6 (core)",
    "@oclif/plugin-help 6.2.5 (core)",
    "@oclif/plugin-not-found 3.2.10 (core)",
    "@oclif/plugin-plugins 5.3.4 (core)",
    "@oclif/plugin-search 1.2.2 (core)",
    "@oclif/plugin-update 4.4.7 (core)",
    "@oclif/plugin-version 2.2.6 (core)",
    "@oclif/plugin-warn-if-update-available 3.1.8 (core)",
    "@oclif/plugin-which 3.2.6 (core)",
    "@salesforce/cli 2.50.6 (core)",
    "apex 3.2.4 (core)",
    "auth 3.6.33 (core)",
    "data 3.5.5 (core)",
    "deploy-retrieve 3.9.13 (core)",
    "env 3.0.32 (user) published 41 days ago (Wed Jun 12 2024)",
    "info 3.3.17 (core)",
    "limits 3.3.17 (core)",
    "marketplace 1.2.17 (core)",
    "org 4.3.4 (core)",
    "packaging 2.7.0 (core)",
    "schema 3.3.19 (core)",
    "settings 2.3.8 (core)",
    "sobject 1.4.19 (core)",
    "source 3.5.3 (core)",
    "telemetry 3.6.1 (core)",
    "templates 56.3.2 (core)",
    "trust 3.7.13 (core)",
    "user 3.5.17 (core)"
  ]
}
@deployment-ian deployment-ian added the investigating We're actively investigating this issue label Jul 24, 2024
Copy link

Thank you for filing this issue. We appreciate your feedback and will review the issue as soon as possible. Remember, however, that GitHub isn't a mechanism for receiving support under any agreement or SLA. If you require immediate assistance, contact Salesforce Customer Support.

@github-actions github-actions bot added the validated Version information for this issue has been validated label Jul 24, 2024
@mdonnalley
Copy link
Contributor

@deployment-ian thanks for the issue

The requested resource does not exist is directly from the API. In your case, I think it might indicate that you queried data that you don't have permission to view? For example, one of the Accounts returned in the query might be set to private

@mdonnalley mdonnalley added the more information required Issue requires more information or a response from the customer label Jul 24, 2024
@deployment-ian
Copy link
Author

I don't believe this is to be the case. When I adjust my query so that it only returns 1.8m records (down from 21m), the command C:\Users\{user}\Desktop>sf data query resume -i {job_id} -o {org} -r csv > test.csv functions as expected and I receive the data

@deployment-ian
Copy link
Author

Also, when I utilize workbench with the same query as I use and the resource doesn't appear, workbench is able to extract al of the data

@mdonnalley
Copy link
Contributor

Got it. Can you set JSFORCE_LOG_LEVEL=DEBUG in your environment and share the output from the failing command? I'm hoping that will give us some insight as to where the API is returning resource not found - thanks!

@deployment-ian
Copy link
Author

sure!

I set the flag, received lots of additional information, but no extra detail when executing the sf data query resume command:

C:\Users\{user}\Desktop>set JSFORCE_LOG_LEVEL=DEBUG

C:\Users\Ian{user}\Desktop>sf data query --query "{query}" --target-org {org} --bulk --async
DEBUG   [http-api]  missing 'content-length' header, setting it to: 301
DEBUG   [http-api]  <request> method=POST, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [http-api]  elapsed time: 238 msec
DEBUG   [http-api]  <response> status=401, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined
INFO    [session-refresh-delegate]  <refresh token>
DEBUG   [session-refresh-delegate]  Connection refresh completed.
INFO    [session-refresh-delegate]  <refresh complete>
DEBUG   [http-api]  <request> method=POST, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [http-api]  elapsed time: 67 msec
DEBUG   [http-api]  <response> status=401, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined
INFO    [session-refresh-delegate]  <refresh token>
DEBUG   [session-refresh-delegate]  Connection refresh completed.
INFO    [session-refresh-delegate]  <refresh complete>
DEBUG   [http-api]  <request> method=POST, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [http-api]  elapsed time: 376 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined
DEBUG   [bulk2:QueryJobV2]  Successfully created job 750Hn00000OUfXPIA1
Query ID: 750Hn00000OUfXPIA1
Query is in progress.

Run "data query resume -i 750Hn00000OUfXPIA1-o {user}" to get the latest status and results.


Total number of records retrieved: 0.
Querying Data... done

C:\Users\{user}\Desktop>sf data resume -i 750Hn00000OUfXPIA1 -o {org} --json
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/async/61.0/job/750Hn00000OUfXPIA1
DEBUG   [http-api]  elapsed time: 998 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/async/61.0/job/750Hn00000OUfXPIA1
DEBUG   [connection]  {
  '$': { xmlns: 'http://www.force.com/2009/06/asyncapi/dataload' },
  id: '750Hn00000OUfXPIA1',
  operation: 'query',
  object: 'Account',
  createdById: '{user_id}',
  createdDate: '2024-07-24T19:04:32.000Z',
  systemModstamp: '2024-07-24T19:04:33.000Z',
  state: 'InProgress',
  concurrencyMode: 'Parallel',
  contentType: 'CSV',
  numberBatchesQueued: '0',
  numberBatchesInProgress: '6',
  numberBatchesCompleted: '10',
  numberBatchesFailed: '0',
  numberBatchesTotal: '16',
  numberRecordsProcessed: '2373669',
  numberRetries: '0',
  apiVersion: '61.0',
  numberRecordsFailed: '0',
  totalProcessingTime: '134861',
  apiActiveProcessingTime: '134861',
  apexProcessingTime: '0'
}
{
  "status": 0,
  "result": {
    "$": {
      "xmlns": "http://www.force.com/2009/06/asyncapi/dataload"
    },
    "id": "750Hn00000OUfXPIA1",
    "operation": "query",
    "object": "Account",
    "createdById": "{user_id}",
    "createdDate": "2024-07-24T19:04:32.000Z",
    "systemModstamp": "2024-07-24T19:04:33.000Z",
    "state": "InProgress",
    "concurrencyMode": "Parallel",
    "contentType": "CSV",
    "numberBatchesQueued": "0",
    "numberBatchesInProgress": "6",
    "numberBatchesCompleted": "10",
    "numberBatchesFailed": "0",
    "numberBatchesTotal": "16",
    "numberRecordsProcessed": "2373669",
    "numberRetries": "0",
    "apiVersion": "61.0",
    "numberRecordsFailed": "0",
    "totalProcessingTime": "134861",
    "apiActiveProcessingTime": "134861",
    "apexProcessingTime": "0"
  },
  "warnings": []
}

C:\Users\{user}\Desktop>sf data resume -i 750Hn00000OUfXPIA1 -o {org} --json
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/async/61.0/job/750Hn00000OUfXPIA1
DEBUG   [http-api]  elapsed time: 1178 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/async/61.0/job/750Hn00000OUfXPIA1
DEBUG   [connection]  {
  '$': { xmlns: 'http://www.force.com/2009/06/asyncapi/dataload' },
  id: '750Hn00000OUfXPIA1',
  operation: 'query',
  object: 'Account',
  createdById: '{user_id}',
  createdDate: '2024-07-24T19:04:32.000Z',
  systemModstamp: '2024-07-24T19:08:52.000Z',
  state: 'JobComplete',
  concurrencyMode: 'Parallel',
  contentType: 'CSV',
  numberBatchesQueued: '0',
  numberBatchesInProgress: '0',
  numberBatchesCompleted: '92',
  numberBatchesFailed: '0',
  numberBatchesTotal: '92',
  numberRecordsProcessed: '21169518',
  numberRetries: '0',
  apiVersion: '61.0',
  numberRecordsFailed: '0',
  totalProcessingTime: '1217556',
  apiActiveProcessingTime: '1217556',
  apexProcessingTime: '0'
}
{
  "status": 0,
  "result": {
    "$": {
      "xmlns": "http://www.force.com/2009/06/asyncapi/dataload"
    },
    "id": "750Hn00000OUfXPIA1",
    "operation": "query",
    "object": "Account",
    "createdById": "{user_id}",
    "createdDate": "2024-07-24T19:04:32.000Z",
    "systemModstamp": "2024-07-24T19:08:52.000Z",
    "state": "JobComplete",
    "concurrencyMode": "Parallel",
    "contentType": "CSV",
    "numberBatchesQueued": "0",
    "numberBatchesInProgress": "0",
    "numberBatchesCompleted": "92",
    "numberBatchesFailed": "0",
    "numberBatchesTotal": "92",
    "numberRecordsProcessed": "21169518",
    "numberRetries": "0",
    "apiVersion": "61.0",
    "numberRecordsFailed": "0",
    "totalProcessingTime": "1217556",
    "apiActiveProcessingTime": "1217556",
    "apexProcessingTime": "0"
  },
  "warnings": []
}


C:\Users\{user}\Desktop>sf data query resume -i {query_id}-o {org} > test.csv
Error (1): The requested resource does not exist

@mdonnalley
Copy link
Contributor

C:\Users\{user}\Desktop>sf data query resume -i {query_id}-o {org} > test.csv will redirect all the stdout to test.csv - so that file should contain all the logs from that command

@deployment-ian
Copy link
Author

duh!

"DEBUG	[bulk2:QueryJobV2]  Start polling for job status"	
"DEBUG	[bulk2:QueryJobV2]  Polling options: timeout:1000ms | interval: 0ms."	
"DEBUG	[http-api]  <request> method=GET"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1
"DEBUG	[http-api]  elapsed time: 585 msec"	
"DEBUG	[http-api]  <response> status=200"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1
"DEBUG	[bulk2:QueryJobV2]  sforce-locator: undefined"	
"DEBUG	[bulk2:QueryJobV2]  Job 750Hn00000OUfXPIA1 was successfully processed."	
"DEBUG	[http-api]  <request> method=GET"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
"DEBUG	[http-api]  elapsed time: 16135 msec"	
"DEBUG	[http-api]  <response> status=200"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
"DEBUG	[bulk2:QueryJobV2]  sforce-locator: MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ"	
"DEBUG	[http-api]  <request> method=GET"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
"DEBUG	[http-api]  elapsed time: 67 msec"	
"DEBUG	[http-api]  <response> status=404"	 url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
"DEBUG	[bulk2:QueryJobV2]  sforce-locator: undefined"	

@deployment-ian
Copy link
Author

Here's the data as JSON:

DEBUG   [bulk2:QueryJobV2]  Start polling for job status
DEBUG   [bulk2:QueryJobV2]  Polling options: timeout:1000ms | interval: 0ms.
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1
DEBUG   [http-api]  elapsed time: 530 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined
DEBUG   [bulk2:QueryJobV2]  Job 750Hn00000OUfXPIA1 was successfully processed.
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
DEBUG   [http-api]  elapsed time: 17000 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
DEBUG   [bulk2:QueryJobV2]  sforce-locator: MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [http-api]  elapsed time: 73 msec
DEBUG   [http-api]  <response> status=404, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined
{
  "name": "NOT_FOUND",
  "message": "The requested resource does not exist",
  "exitCode": 1,
  "context": "BulkQueryReport",
  "stack": "NOT_FOUND: The requested resource does not exist\n    at SfCommandError.from (file:///C:/Users/Ian-Bates/AppData/Local/sf/client/2.50.6-6e9d7f7/node_modules/@salesforce/sf-plugins-core/lib/SfCommandError.js:39:16)\n    at BulkQueryReport.catch (file:///C:/Users/Ian-Bates/AppData/Local/sf/client/2.50.6-6e9d7f7/node_modules/@salesforce/sf-plugins-core/lib/sfCommand.js:315:47)\n    at BulkQueryReport._run (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\command.js:306:29)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Config.runCommand (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\config\\config.js:424:25)\n    at async run (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\main.js:95:16)\n    at async file:///C:/Users/Ian-Bates/AppData/Local/sf/client/2.50.6-6e9d7f7/bin/run.js:15:1",
  "cause": "NOT_FOUND: The requested resource does not exist\n    at BulkApiV2.getError (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@jsforce\\jsforce-node\\lib\\http-api.js:279:16)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@jsforce\\jsforce-node\\lib\\http-api.js:127:33\n    at async QueryJobV2.result (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@jsforce\\jsforce-node\\lib\\api\\bulk2.js:359:13)\n    at async BulkQueryReport.run (file:///C:/Users/Ian-Bates/AppData/Local/sf/client/2.50.6-6e9d7f7/node_modules/@salesforce/plugin-data/lib/commands/data/query/resume.js:50:25)\n    at async BulkQueryReport._run (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\command.js:302:22)\n    at async Config.runCommand (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\config\\config.js:424:25)\n    at async run (C:\\Users\\Ian-Bates\\AppData\\Local\\sf\\client\\2.50.6-6e9d7f7\\node_modules\\@oclif\\core\\lib\\main.js:95:16)\n    at async file:///C:/Users/Ian-Bates/AppData/Local/sf/client/2.50.6-6e9d7f7/bin/run.js:15:1 {\n  errorCode: 'NOT_FOUND',\n  content: undefined\n}",
  "warnings": [],
  "code": "1",
  "status": 1,
  "commandName": "BulkQueryReport"
}

@mdonnalley
Copy link
Contributor

@deployment-ian Thanks for the logs!

Based on these logs,

DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
DEBUG   [http-api]  elapsed time: 17000 msec
DEBUG   [http-api]  <response> status=200, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query/750Hn00000OUfXPIA1/results
DEBUG   [bulk2:QueryJobV2]  sforce-locator: MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [http-api]  <request> method=GET, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [http-api]  elapsed time: 73 msec
DEBUG   [http-api]  <response> status=404, url=https://officedepot.my.salesforce.com/services/data/v61.0/jobs/query//750Hn00000OUfXPIA1/results?locator=MDowR2FIbjAwMDAwNXMyVDRLQUk6MQ
DEBUG   [bulk2:QueryJobV2]  sforce-locator: undefined

It looks like jsforce is creating a url with a // in it, which is causing the 404. For what it's worth, we fixed a similar issue last week

I'll mark this as a bug and get it on our board.

@mdonnalley mdonnalley added bug Issue or pull request that identifies or fixes a bug and removed more information required Issue requires more information or a response from the customer investigating We're actively investigating this issue labels Jul 24, 2024
Copy link

git2gus bot commented Jul 24, 2024

This issue has been linked to a new work item: W-16325607

@deployment-ian
Copy link
Author

Thank you so much @mdonnalley! I really appreciate it.

And since we have this conversation going, is there any way batch-download the CSV? or will this work-flow always return all this data within 1 csv?

I don't see much documentation utilizing the bulk api 2.0 commands within the cli...

I appreciate you helping me out with this!

@mdonnalley
Copy link
Contributor

And since we have this conversation going, is there any way batch-download the CSV? or will this work-flow always return all this data within 1 csv?

There's no support for batch downloading the csv - it's always going to go into a single csv file. If you needed to do that, you could adjust your soql query to return a subset of the data and then run multiple sf data query --bulk with that

@cristiand391
Copy link
Member

And since we have this conversation going, is there any way batch-download the CSV? or will this work-flow always return all this data within 1 csv?

hi 👋🏼 @deployment-ian
We are discussing some new commands for bulk import/export with the team, would be interested to hear your use-case for splitting results into multiple files (you can post it here: https://github.com/forcedotcom/cli/discussions/categories/ideas)

FWIW, the Bulk 2.0 API doesn't let clients to batch data when uploading, so for a feature like this the CLI would need to implement some extra logic, see:
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/bulk_common_diff_two_versions.htm

@deployment-ian
Copy link
Author

Hi @cristiand391,

I don't know right now if it's actually better for me to have a multiple-file result from the bulk query at this time. Once the issue above it resolved, I can then figure out if it would be beneficial to request this feature.

I will post my thoughts on the pages you provided here.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:jsforce bug Issue or pull request that identifies or fixes a bug validated Version information for this issue has been validated
Projects
None yet
Development

No branches or pull requests

4 participants