Duplicated Records from OData pagination #288
Comments
Hi @victorygit, there is not much that is actionable on the library side here. I do not have access to your service, nor any kind of reproducible steps. You also did not provide any Python snippet ("_next" pagination? skip and top?), which, even without the service traffic, would still be only half of the problem.
I would recommend first isolating whether the problem is even on the pyodata side or whether the duplicates truly exist in the service. The "not sure if it is due to many project changes in the system" tells me you do not actually know that.
I will assume you only have the option to use the API and are not working inside SAP. Still, it should be possible to check whether the data are really duplicates (e.g. the same primary key in two paginated responses) or data that merely look like duplicates but actually differ in their primary keys (duplicates permitted in the DB).
You can, for example, load all the data from the range you are paginating over using curl or the JS sister library odata-library (https://github.com/SAP/odata-library) without pagination, just to check for the existence of duplicates. We first need to at least distinguish whether pyodata is returning something different from the service than other methods do (and is therefore buggy), or is returning exactly the same thing as any other API client.
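For illustration, a minimal sketch of the kind of snippet and duplicate check being asked for. It uses pyodata's client-driven $skip/$top paging as a stand-in for the "_next" flow (which depends on code not shown in the thread); the service URL and the key properties personIdExternal/userId are assumptions, not details from this issue:

```python
import pyodata
import requests

# Hypothetical endpoint; substitute the real SuccessFactors OData URL.
SERVICE_URL = 'https://example.com/odata/v2'

session = requests.Session()  # configure auth here, e.g. session.auth = (user, pwd)
client = pyodata.Client(SERVICE_URL, session)

PAGE_SIZE = 1000
seen = set()
duplicates = []

total = client.entity_sets.EmpEmployment.get_entities().count().execute()

for offset in range(0, total, PAGE_SIZE):
    page = (client.entity_sets.EmpEmployment
            .get_entities()
            .skip(offset)
            .top(PAGE_SIZE)
            .execute())
    for record in page:
        # The key properties here are an assumption about the EmpEmployment entity.
        key = (record.personIdExternal, record.userId)
        if key in seen:
            duplicates.append(key)
        seen.add(key)

print(f'{len(duplicates)} duplicated keys out of {len(seen)} unique')
```

Downloading the same range with curl or odata-library, without paging, would then give an independent baseline to compare against.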
I can confirm we are using server-side pagination ("_next"), and the duplication does happen: I see the same records coming from different page downloads. I still need to test the code more closely; however, I did run the API in Postman directly and there is no duplication there.
Hope it helps.
Thanks
Victor
OK. If possible, could you create a failing test from your investigation that reproduces the issue? E.g. start with tests/test_service_v2.py, line 2239, at commit b6223d1: https://github.com/SAP/python-pyodata/blob/b6223d1a88b85918fb0dbc9757767c880c84bcfe/tests/test_service_v2.py#L2239
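A rough skeleton for such a test, in the style of the responses-mocked tests in that file; the two page payloads below are placeholders standing in for the real responses captured from the service, and the Employees set with its ID key is the fixture entity the existing tests use:

```python
import responses

@responses.activate
def test_pagination_returns_unique_records(service):
    # Two pages as the server would answer successive $skip/$top requests.
    # ID 23 appearing on both pages mimics the duplication reported here;
    # for a real reproduction, paste the actual captured service payloads.
    pages = ([{'ID': 23}, {'ID': 24}], [{'ID': 23}, {'ID': 25}])
    for page in pages:
        responses.add(
            responses.GET,
            f"{service.url}/Employees",
            json={'d': {'results': page}},
            status=200)

    ids = []
    for offset in (0, 2):
        employees = (service.entity_sets.Employees.get_entities()
                     .skip(offset)
                     .top(2)
                     .execute())
        ids.extend(emp.ID for emp in employees)

    # Fails whenever the same key shows up on two different pages.
    assert len(ids) == len(set(ids))
```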
I am not sure how to recreate this issue without connecting to our server, but here is what I am finding:
I am using OData to extract the entity EmpEmployment. I count the records before extracting, and the number is 84484, which matches the number I get when I run $count in Postman. Then I extract the data page by page and add all the pages to one dataframe. The total count of the dataframe matches; however, when I check the individual records, I find the same records duplicated across different pages.
Is there any other testing or setting I should try?
Thanks
Victor
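One consequence worth checking: if the dataframe's total row count matches $count but some rows are duplicated, then other records must be missing entirely. A small pandas check along these lines would confirm both (the key columns are assumed, as the real EmpEmployment key is not shown in the thread):

```python
import pandas as pd

# Stand-in for the dataframe assembled from all downloaded pages;
# the key columns are assumed properties of EmpEmployment.
df = pd.DataFrame(
    [{'personIdExternal': 'p1', 'userId': 'u1'},
     {'personIdExternal': 'p2', 'userId': 'u2'},
     {'personIdExternal': 'p1', 'userId': 'u1'}])  # duplicate from another page

key_cols = ['personIdExternal', 'userId']
dupes = df[df.duplicated(subset=key_cols, keep=False)]
print(f'{len(dupes)} rows share their key with another row')

# If the total row count matches $count but duplicates exist, the
# duplicates are crowding out records that were never downloaded:
missing = len(df) - len(df.drop_duplicates(subset=key_cols))
print(f'{missing} records were never downloaded')
```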
Hi Victor, I have received a notification about a comment saying "Now I found it could be due to the filter with lastModifiedDateTime; when I have the filter as below, I got duplicated records", but I no longer see that comment. Is it no longer a valid clue?
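For reference, a filter of the kind mentioned would look roughly like this in pyodata; the expression and date literal are hypothetical reconstructions, since the original comment (and its exact filter) was deleted:

```python
# Hypothetical reconstruction of the deleted comment's filter; the actual
# expression and date value from that comment are unknown.
request = (client.entity_sets.EmpEmployment
           .get_entities()
           .filter("lastModifiedDateTime ge datetimeoffset'2024-01-01T00:00:00Z'"))
employees = request.execute()
```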
Also, I expect that you are using the standard requests library. If you check that the log does not contain authorization information or any other sensitive data, you can attach it to the issue as well. See e.g. this article: https://proxiesapi.com/articles/logging-and-debugging-with-requests
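A minimal sketch of the usual recipe for tracing requests traffic (the linked article covers the same idea). Note that debuglevel=1 echoes headers, including Authorization, to stdout, so scrub those before attaching anything:

```python
import http.client
import logging

# Echo raw request/response lines and headers to stdout.
http.client.HTTPConnection.debuglevel = 1

# Also surface urllib3's DEBUG records (connections, retries, redirects).
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('urllib3').setLevel(logging.DEBUG)
```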
Hello,
We are leveraging the pyodata library to extract data from SAP SuccessFactors, and recently we found that we get some duplicate data from the EC module during the project. We had not noticed this for other modules, and we are not sure if it is due to the many project changes in the system. Is this a known issue, and how can we avoid these duplicated records?
We are using server-side pagination, and I see the same records show up in different page downloads.
Thanks
Victor