Duplicated Records from Odata pagination #288

Open
victorygit opened this issue Jan 15, 2025 · 6 comments

@victorygit

Hello,

We are using the pyodata library to extract data from SAP SuccessFactors. Recently, during the project, we found that we get some duplicate records from the EC module. We haven't noticed this for other modules, and we are not sure whether it is due to the many project changes in the system. Is this a known issue, and how can we avoid these duplicated records?

We are using server-side pagination, and I see the same records show up in different pages of the download.

Thanks
Victor

@phanak-sap
Contributor

phanak-sap commented Jan 16, 2025

Hi @victorygit, there is not much that is actionable here on the side of the library code. I do not have access to your service, nor any reproducible steps. You also did not provide any Python snippet (are you using "_next" pagination? skip and top?), although without the service traffic that would still be only half of the problem.

I would recommend first isolating whether the problem is on the pyodata side at all, or whether the duplicates truly exist in the service. The "not sure if it is due to many project changes in the system" tells me you do not actually know that yet.

I will assume that you can only use the API and are not working inside SAP. Still, it should be possible to check whether the data are real duplicates (e.g. the same primary key in two paginated responses) or data that merely look like duplicates but actually differ in their primary keys (duplicates permitted in the DB).

You can, for example, load all the data from the range you are paginating over without the pagination, using curl or the JS sister library odata-library, just to check whether the duplicates exist.

We first need to at least distinguish whether pyodata is returning something different from the service than other methods do (and is therefore a bug), or is returning exactly the same thing as any other API client.
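
To make the primary-key check concrete, here is a minimal sketch. It assumes each downloaded page has already been converted to a list of plain dicts; the helper name `find_duplicate_keys`, the key fields `userId` and `startDate`, and the sample records are all hypothetical and must be replaced with the real key of the entity you are paginating over.

```python
from collections import defaultdict


def find_duplicate_keys(pages, key_fields):
    """Return {primary-key tuple: [page indexes]} for keys seen more than once."""
    seen = defaultdict(list)
    for page_index, page in enumerate(pages):
        for record in page:
            # Build the primary key from the configured key fields.
            key = tuple(record[field] for field in key_fields)
            seen[key].append(page_index)
    return {key: indexes for key, indexes in seen.items() if len(indexes) > 1}


# Hypothetical sample data: two pages, one record repeated across them.
pages = [
    [
        {"userId": "1001", "startDate": "2024-01-01"},
        {"userId": "1002", "startDate": "2024-01-01"},
    ],
    [
        {"userId": "1002", "startDate": "2024-01-01"},  # same key as on page 0
        {"userId": "1002", "startDate": "2024-06-01"},  # same user, different key
    ],
]

duplicates = find_duplicate_keys(pages, ("userId", "startDate"))
print(duplicates)
```

An empty result would suggest the records only look alike but differ in their keys. A non-empty result shows which pages repeated the same key, and that can then be compared with what curl or another client returns for the same range.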

@victorygit
Author

victorygit commented Jan 16, 2025 via email

@phanak-sap
Contributor

OK. If possible, could you use your investigation to create a failing test that reproduces the issue?

E.g. start with

def test_partial_listing(service):

@victorygit
Author

victorygit commented Jan 23, 2025 via email

@phanak-sap
Contributor

Hi Victor, I received a notification about this comment:

"Now I found it could be due to the filter with lastModifiedDateTime. When I have the filter as below, I get duplicated records:
["lastModifiedDateTime gt datetimeoffset'1970-01-01T00:00:00Z' and lastModifiedDateTime le datetimeoffset'2025-01-26T23:34:26Z'"]
If I remove the filter, the duplication is gone. Does anyone see a similar issue? We are also using fromDate=1900-01-01 to get all the history information.
"

But I no longer see this comment. Is it no longer a valid clue?

@phanak-sap
Contributor

phanak-sap commented Jan 27, 2025

Also, I expect that you are using the standard requests library.
For better reproducibility (e.g. eventually writing a failing test with mocked responses), look at what is happening under the hood by enabling logging of the actual HTTP requests.

Once you have checked that the log does not contain authorization information or any sensitive data, you can attach it to the issue as well.
Similarly, for comparison, it would be helpful to have the log from Postman, where the duplication of results does not happen.

See e.g. the article https://proxiesapi.com/articles/logging-and-debugging-with-requests
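
As a rough sketch of what enabling that logging can look like, the snippet below uses only the Python standard library and assumes the session is a plain requests session (which sits on top of urllib3 and http.client). The helper name `enable_http_debug_logging` is made up for this example.

```python
import http.client
import logging


def enable_http_debug_logging():
    """Turn on verbose HTTP logging for requests/urllib3 (debugging only)."""
    # http.client prints the raw request line and headers to stdout.
    # Note: this includes the Authorization header, so scrub before sharing.
    http.client.HTTPConnection.debuglevel = 1

    # Route log records to the console at DEBUG level.
    logging.basicConfig(level=logging.DEBUG)

    # urllib3 (used by requests) logs connection and request summaries.
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    urllib3_logger.propagate = True
    return urllib3_logger


# Call this once, before creating the session that is passed to pyodata.
urllib3_logger = enable_http_debug_logging()
```

With this in place, each page request should appear in the output with its full URL, so the sequence of pagination requests can be compared with what Postman sends.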
