Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

out_s3: update for getting upload time from timestamp. #6486

Merged
merged 1 commit into from
Feb 13, 2023

Conversation

Claych
Copy link
Contributor

@Claych Claych commented Nov 30, 2022

Signed-off-by: Clay Cheng [email protected]


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224
[OUTPUT]
     Name s3
     Match *
     bucket s3-time-1
     region us-east-1
     use_put_object On
     total_file_size 10M
     retry_limit 2
     upload_timeout 5s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000,
            "batchSize": 1
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "logStream": "stderr",
            "port": 24224
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 1000
    }
}
  • Debug log output from testing the change
    Upload timeout + put object used
[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket s3-time-test
     region us-east-1
#     use_put_object On
     total_file_size 50M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 1000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2023/01/04 00:05:45] [ info] [output:s3:s3.0] worker #0 started
[2023/01/04 00:05:45] [ info] [sp] stream processor started
[2023/01/04 00:06:38] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2023/01/04 00:06:39] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/04/47-objecth4lHpX0O
[2023/01/04 00:06:50] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2023/01/04 00:06:51] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/04/58-objectr2dk4WDH
[2023/01/04 00:07:02] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2023/01/04 00:07:03] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/05/10-objectd1yH71IC
[2023/01/04 00:07:14] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2023/01/04 00:07:15] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/05/23-objectLGJCZYcW

first log time and key format
image

NO Upload timeout + Multipart

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket s3-time-test
     region us-east-1
#     use_put_object On
     total_file_size 50M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2023/01/04 00:19:17] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:19:17] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2023/01/04/00/17/33, UploadId=mdox3F3JsR80CA1R4eDhz3Yp88AM25FyJKxlECaahYpoQwlnQlTgEpF3FPkc8AYk4p_2Cpmi0uNaQrM.adNlDYQ8ahBcU6ToAZBex47ht7GZINBdAY_xwYp0VFmit0ZG, ETag=78af9cbb127bf654817f0f8273fe7bdb
[2023/01/04 00:19:18] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:19:18] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2023/01/04/00/17/33, UploadId=mdox3F3JsR80CA1R4eDhz3Yp88AM25FyJKxlECaahYpoQwlnQlTgEpF3FPkc8AYk4p_2Cpmi0uNaQrM.adNlDYQ8ahBcU6ToAZBex47ht7GZINBdAY_xwYp0VFmit0ZG, ETag=9c2bd3405a7011ce397e382bd54fb88c
[2023/01/04 00:19:19] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:19:19] [ info] [output:s3:s3.0] Successfully uploaded part #4 for /fluent-bit-logs/tag_prefix/2023/01/04/00/17/33, UploadId=mdox3F3JsR80CA1R4eDhz3Yp88AM25FyJKxlECaahYpoQwlnQlTgEpF3FPkc8AYk4p_2Cpmi0uNaQrM.adNlDYQ8ahBcU6ToAZBex47ht7GZINBdAY_xwYp0VFmit0ZG, ETag=705e5449d950fa76e99b0f40c49a7bfc
[2023/01/04 00:19:21] [ info] [output:s3:s3.0] UploadPart http status=200

first log time and key format
image

Upload timeout + multipart

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket s3-time-test
     region us-east-1
#     use_put_object On
     total_file_size 6G
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2023/01/04 00:20:21] [ info] [sp] stream processor started
[2023/01/04 00:20:21] [ info] [output:s3:s3.0] worker #0 started
[2023/01/04 00:20:54] [ info] [output:s3:s3.0] Successfully initiated multipart upload for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq
[2023/01/04 00:20:54] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:20:54] [ info] [output:s3:s3.0] Successfully uploaded part #1 for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq, ETag=3d3e0fa00c7aa5ac7d48b7412d37c1fa
[2023/01/04 00:20:55] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:20:55] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq, ETag=9dd63db9450ad383a90a4dc77fbbaa67
[2023/01/04 00:20:57] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:20:57] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq, ETag=ec14e35d5f7ab37780a029d63b536bec
[2023/01/04 00:20:58] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:20:58] [ info] [output:s3:s3.0] Successfully uploaded part #4 for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq, ETag=b95a4fc49cb23d3521978deafcaed2f3
[2023/01/04 00:20:59] [ info] [output:s3:s3.0] UploadPart http status=200
[2023/01/04 00:20:59] [ info] [output:s3:s3.0] Successfully uploaded part #5 for /fluent-bit-logs/tag_prefix/2023/01/04/00/19/12, UploadId=pRC2DikdUL6ZEzWPYCR7WLE.JJnJgdizkPFV8AoJ8JTnFnRAG9Kj2gKJQnG1DPMgV8cdJQ13ly.2M_z5X02gFnMGF8yuI8MnI5ZRCJMXoEEmHJG7RlwycF04zUccNAOq, ETag=2f579891b3254356e78b69da5de94488

first log time and key format
image

No Upload Timeout + Single put object

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket s3-time-test
     region us-east-1
#     use_put_object On
     total_file_size 1M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 1000000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2023/01/04 00:22:24] [ info] [sp] stream processor started
[2023/01/04 00:22:24] [ info] [output:s3:s3.0] worker #0 started
[2023/01/04 00:22:36] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/55-objectVubURfP2
[2023/01/04 00:22:36] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/55-object08fXmltw
[2023/01/04 00:22:36] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/55-objectMPpfA1ef
[2023/01/04 00:22:37] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/55-objectzZkbV6ul
[2023/01/04 00:22:37] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/56-objectVgFIjmn8
[2023/01/04 00:22:37] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/56-objectqx3KzVt3
[2023/01/04 00:22:37] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/56-object6nv6J7f4
[2023/01/04 00:22:37] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2023/01/04/00/20/56-objectpUwoeaFc

first log time and time format
image

  • Attached Valgrind output that shows no leaks or memory corruption was found
==10338== LEAK SUMMARY:
==10338==    definitely lost: 0 bytes in 0 blocks
==10338==    indirectly lost: 0 bytes in 0 blocks
==10338==      possibly lost: 0 bytes in 0 blocks
==10338==    still reachable: 102,912 bytes in 3,432 blocks
==10338==         suppressed: 0 bytes in 0 blocks

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

PettitWesley
PettitWesley previously approved these changes Nov 30, 2022
@Claych Claych temporarily deployed to pr November 30, 2022 20:07 Inactive
@Claych Claych temporarily deployed to pr November 30, 2022 20:07 Inactive
@Claych Claych temporarily deployed to pr November 30, 2022 20:25 Inactive
@Claych Claych force-pushed the clay-s3-time-stamp-1.9 branch from 8e8f3e7 to 763583e Compare December 2, 2022 18:20
matthewfala
matthewfala previously approved these changes Dec 2, 2022
create_time = time(NULL);
flb_plg_error(ctx->ins, "Failed to get timestamp used for the S3 key, "
Copy link
Contributor

@PettitWesley PettitWesley Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make it clear we tried to get the timestamp from the log record(s).

m_upload->init_time = time(NULL);
if (tms == NULL) {
m_upload->init_time = time(NULL);
flb_plg_error(ctx->ins, "Failed to get timestamp used for the S3 key, "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above

@PettitWesley
Copy link
Contributor

@Claych

@matthewfala note from discussion today of we need to double check:

We should not be setting the timestamp each time. We want the first timestamp in the chunk file, so the first flush that created that chunk file. The way you can do this is only set the create_time if it is not already set on the chunk.

Comment on lines 2138 to 2151
/* unpack msgpack */
msgpack_unpacked_init(&result);

/* Get the first record timestamp */
while (msgpack_unpack_next(&result,
event_chunk->data,
event_chunk->size, &off) == MSGPACK_UNPACK_SUCCESS) {
flb_time_pop_from_msgpack(&tms, &result, &obj);
if (&tms.tm.tv_sec != 0) {
break;
}
}

msgpack_unpacked_destroy(&result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need to get timestamp for first flush per chunk S3_file. So, we can run this code only if there is no existing chunk S3_file for this tag...

Comment on lines 1550 to 1557
if (tms == NULL) {
m_upload->init_time = time(NULL);
flb_plg_error(ctx->ins, "Failed to get timestamp from the log record "
"used for the S3 key, use the current time instead.");
}
else {
m_upload->init_time = tms->tm.tv_sec;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Claych Further up in this code we create the S3 key, this is where the init_time timestamp needs to be used... right now we are still using time(NULL)... please address this and test it.

@matthewfala
Copy link
Contributor

matthewfala commented Dec 6, 2022

Tested both multipart and put object uploads. Both work. The files have a name that reflects the first log's timestamp

Multi Part upload

{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1
        }
    },
    "stage": {
        "batchRate": 20,
        "maxBatches": 10000000
    }
}

Single put upload

{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1
        }
    },
    "stage": {
        "batchRate": 20,
        "maxBatches": 10000000
    }
}

@matthewfala
Copy link
Contributor

matthewfala commented Dec 7, 2022

PR Upload Timeout Issue

There may be an issue with this PR related to upload timeouts which needs resolving.
Regression testing is needed to determine if the problem is introduced with this PR.

Basic Test Case:

Fluent Bit Config

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 50M
     retry_limit 2
     upload_timeout 10s

FireLens Datajet Config

The hot-fix-wppttt branch should be used for this test case as it includes the timeOffset config option


{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}

The following message appears many times per second (once per batch):

[2022/12/07 21:22:05] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix

However this should not be, because upload_timeout should only be reached once per 10 seconds.

Test Case Trigger

Note how the test case has "timeOffset" set to -100. That means that the logs are set to the current time -100 seconds.

Root Cause Analysis

The root cause is the following lines of code along with the PR.
https://github.com/Claych/fluent-bit/blob/b550c69d95b6cc6006ec004a1222a5539f7b8e4b/plugins/out_s3/s3.c#L2201-L2207

This section may be okay because it uses init_time, but further investigation is needed to determine if this is correct or not.
https://github.com/Claych/fluent-bit/blob/b550c69d95b6cc6006ec004a1222a5539f7b8e4b/plugins/out_s3/s3.c#L2212-L2216

Note how determining if the time out has occurred, we compare current time to the upload_file->create_time. This is an issue because this PR changes the definition of create_time from the time when the file was created (which is the obvious meaning) to the time of the first log (which can be far back in history and should not be used for upload timeouts.)

Resolution

To resolve this issue we need a separate variable to hold the file's actual creation time vs the time of the first log which are two different things. We need to use actual creation time for the upload timeout detection

Further Investigation

We need to further investigate where create_time is used. If changing the meaning of create_time obscures anything else about the operation of the S3 plugin.

Validation & Regression testing

Further validation is needed to determine if the issue has been introduced with this PR. Testing is currently being done to determine this.

Here's the logs when Clay's PR is not included. Note how there is no message of the upload timeout. Also note how multipart upload is used.

[2022/12/07 22:04:57] [ info] [output:s3:s3.0] Successfully completed multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/04/45, UploadId=NrvxaWInEiX1vAJGWfHSxCAxhIJ9D8r0V3u.Z34WkPIOdnscxBMZZKI7GXLJhBI5Puw0nlzmxY5nPL97On2VTdKeMWbOx5wG_caDiMfJ9Odmv7lUEeTYizfwrxx4lYxp
[2022/12/07 22:04:57] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:04:57] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2022/12/07/22/04/55, UploadId=T43TMXLQ_Ubq4I1VPwdullwU3WA2cwvUaCmhzcPdNO03uzSZWfGrAz7h4JMJ4GOVcSXIYjdAq48Dy4QbG.5.s1vWMEvPStKD4Z13C9gch88D4nOHkMj9pE.Mt3XhOuZm, ETag=9a5aa13adf7e27b983f4de0970ec3a1d
[2022/12/07 22:04:58] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:04:58] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2022/12/07/22/04/55, UploadId=T43TMXLQ_Ubq4I1VPwdullwU3WA2cwvUaCmhzcPdNO03uzSZWfGrAz7h4JMJ4GOVcSXIYjdAq48Dy4QbG.5.s1vWMEvPStKD4Z13C9gch88D4nOHkMj9pE.Mt3XhOuZm, ETag=cff63c3a8124e51aacfa988fd536e863

This indicates that this PR

  1. Tampers with multipart being used in certain cases.
  2. Triggers uploads before they are appropriate.

Further Test Cases

4 Test cases need to be created to test the resolution of this issue

  1. Multipart (with and without upload timeout)
  2. Put Object (with and without upload timeout)
  3. Multipart upload on s3 plugin initialization (hard to write a test case for)
  4. Put Object upload on s3 plugin initialization (hard to write a test case for)

Test cases

The following test cases outline tests and expected results, as found in 1.9.10 without the patch.

Upload timeout + put object used

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 50M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 1000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2022/12/07 22:09:14] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2022/12/07 22:09:14] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/09/03-objectnAGP5YUW
[2022/12/07 22:09:26] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2022/12/07 22:09:26] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/09/15-object0D8Aq4W1
[2022/12/07 22:09:38] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix

NO Upload timeout + Multipart (failure case)


[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 50M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2022/12/07 22:34:45] [ info] [output:s3:s3.0] Will complete upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/36 because uploaded data is greater than size set by total_file_size
[2022/12/07 22:34:46] [ info] [output:s3:s3.0] Successfully initiated multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/46, UploadId=QCtkemb3uz8vKwj4s04rxeMnixFAZCYEcqAFMvsPyB82zLAk_XQz8FoyuVZfYCpKIdmMxd6xvEOMVgTLmBhSbSpcyQMA4uouf3b5q56v_TlohzIYsutZk3EGYP__ebgm
[2022/12/07 22:34:46] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:34:46] [ info] [output:s3:s3.0] Successfully uploaded part #1 for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/46, UploadId=QCtkemb3uz8vKwj4s04rxeMnixFAZCYEcqAFMvsPyB82zLAk_XQz8FoyuVZfYCpKIdmMxd6xvEOMVgTLmBhSbSpcyQMA4uouf3b5q56v_TlohzIYsutZk3EGYP__ebgm, ETag=97383dbcecb8aecf7314566e906e2408
[2022/12/07 22:34:48] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:34:48] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/46, UploadId=QCtkemb3uz8vKwj4s04rxeMnixFAZCYEcqAFMvsPyB82zLAk_XQz8FoyuVZfYCpKIdmMxd6xvEOMVgTLmBhSbSpcyQMA4uouf3b5q56v_TlohzIYsutZk3EGYP__ebgm, ETag=136f070c76fce5593e02156c4e0e3691
[2022/12/07 22:34:49] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:34:49] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/46, UploadId=QCtkemb3uz8vKwj4s04rxeMnixFAZCYEcqAFMvsPyB82zLAk_XQz8FoyuVZfYCpKIdmMxd6xvEOMVgTLmBhSbSpcyQMA4uouf3b5q56v_TlohzIYsutZk3EGYP__ebgm, ETag=fe8e715649ef7537d6d6f5218b784047
[2022/12/07 22:34:50] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:34:50] [ info] [output:s3:s3.0] Successfully uploaded part #4 for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/46, UploadId=QCtkemb3uz8vKwj4s04rxeMnixFAZCYEcqAFMvsPyB82zLAk_XQz8FoyuVZfYCpKIdmMxd6xvEOMVgTLmBhSbSpcyQMA4uouf3b5q56v_TlohzIYsutZk3EGYP__ebgm, ETag=dafd7fb80176b647184aaea0b7ff9b57
[2022/12/07 22:34:51] [ info] [output:s3:s3.0] Completing upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/36 because upload_timeout has passed
[2022/12/07 22:34:51] [ info] [output:s3:s3.0] Successfully completed multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/34/36, UploadId=3waLx61SCKW8kmPPlF1YJDiLHk0FC367C7AsZZISepNjdHAwF9l8RzTS9MfNKXlT5gYoG69nci6A.keIFvH82RkCFNpkUW2.HQfCZgAZSkI.EflNg0CbXrBTPuA1NPhk
[2022/12/07 22:34:52] [ info] [output:s3:s3.0] UploadPart http status=200

Upload timeout + multipart

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 6G
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2022/12/07 22:39:23] [ info] [output:s3:s3.0] Will complete upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/12 because upload_timeout has elapsed
[2022/12/07 22:39:24] [ info] [output:s3:s3.0] Completing upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/12 because upload_timeout has passed
[2022/12/07 22:39:24] [ info] [output:s3:s3.0] Successfully completed multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/12, UploadId=xGEp8NbGj8KdIh5GNzeU354219W_pqWQp4ZYcc2m01BmoDh3qnuB.TVc6dDXgoAoCsTdRDXfzMxw4P9key6Z_Ss9kN8ciWs_SrRaIeo6nuipeCuRGBLV96vxozPyyZFT
[2022/12/07 22:39:24] [ info] [output:s3:s3.0] Successfully initiated multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H
[2022/12/07 22:39:24] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:24] [ info] [output:s3:s3.0] Successfully uploaded part #1 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=e5d9e863edfde1ed96b31465df3053c8
[2022/12/07 22:39:25] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:25] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=5ae5d4a26c3f51225c08ae14784b589e
[2022/12/07 22:39:27] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:27] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=204ff6344f022eee04a37ee5b8b53008
[2022/12/07 22:39:28] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:28] [ info] [output:s3:s3.0] Successfully uploaded part #4 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=94bbdff97d33b765ff23dfa257260029
[2022/12/07 22:39:29] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:29] [ info] [output:s3:s3.0] Successfully uploaded part #5 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=969d891a2e130eadf74f53f080c33e20
[2022/12/07 22:39:31] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:31] [ info] [output:s3:s3.0] Successfully uploaded part #6 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=90553fb1186e40904e40a708431dd779
[2022/12/07 22:39:32] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:32] [ info] [output:s3:s3.0] Successfully uploaded part #7 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=67c7a798c8c67c6217f59d5c2e34b619
[2022/12/07 22:39:33] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:33] [ info] [output:s3:s3.0] Successfully uploaded part #8 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=79f218281edf60f97c4d286a84c1cee0
[2022/12/07 22:39:35] [ info] [output:s3:s3.0] upload_timeout reached for tag_prefix
[2022/12/07 22:39:35] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:39:35] [ info] [output:s3:s3.0] Successfully uploaded part #9 for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24, UploadId=HD.kCTGwrB5Mwbe2yI3x7flNMMGCFEeCB.Mz3zyvVcXwN3OLrFUZbskTNN.5EaXCLKt0VmeUamyIrNvnW5j2QCg7PqaKYH6G_lUZNEQM7RyTpo0IwUm8458gABi7KS_H, ETag=8472bdcd3e958809163cb7dc6e62b276
[2022/12/07 22:39:35] [ info] [output:s3:s3.0] Will complete upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/39/24 because upload_timeout has elapsed

No Upload Timeout + Single put object

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 1M
     retry_limit 2
     upload_timeout 10s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 1000000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[2022/12/07 22:46:22] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/46/22-objectfCox2dUK
[2022/12/07 22:46:22] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/46/22-object6pFZX3VN
[2022/12/07 22:46:23] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/46/22-objectw4Qml2HZ
[2022/12/07 22:46:23] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/46/22-objectkoVBSH6T
[2022/12/07 22:46:23] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/46/22-object1WqcRC3x

Initialization test cases

Multipart upload on s3 plugin initialization (hard to write a test case for)

Run the following for 10 seconds:

[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 5G
     retry_limit 2
     upload_timeout 10000s
{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 1000000
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}

Stop firelens datajet. Restart fluent bit
Expect the following logs

[2022/12/07 22:51:06] [ info] [output:s3:s3.0] worker #0 started
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] Successfully initiated multipart upload for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] Successfully uploaded part #1 for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK, ETag=03e52c7bf1ef072d18978ed03dd38a23
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] Successfully uploaded part #2 for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK, ETag=bd23fcb78970a7846e06d6933250a771
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:51:07] [ info] [output:s3:s3.0] Successfully uploaded part #3 for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK, ETag=c6020cc793ce0d29ba8ef92cf8c85918
[2022/12/07 22:51:08] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:51:08] [ info] [output:s3:s3.0] Successfully uploaded part #4 for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK, ETag=c6020cc793ce0d29ba8ef92cf8c85918
[2022/12/07 22:51:08] [ info] [output:s3:s3.0] UploadPart http status=200
[2022/12/07 22:51:08] [ info] [output:s3:s3.0] Successfully uploaded part #5 for /fluent-bit-logs/tag_prefix/2022/12/07/22/51/07, UploadId=mGkfQrKQzmswKYeWF4En3htof0C5koruODKkSBitD7PZU6qUwkI.6eGYKXihixjouhb1gU3H5nD24TtHywCu1TPU6mE.cDrUaG0QpU9WPNht7BNLHI5NlGHcFgvm7HKK, ETag=c6020cc793ce0d29ba8ef92cf8c85918

Put Object upload on s3 plugin initialization (hard to write a test case for)

Run the following for 10 seconds:

{
    "generator": {
        "name": "basic",
        "config": {
            "contentLength": 100
        }
    },
    "datajet": {
        "name": "forward",
        "config": {
            "batchSize": 1,
            "timeOffset": -100
        }
    },
    "stage": {
        "batchRate": 5,
        "maxBatches": 10000000
    }
}
[INPUT]
     Name                         forward
     Listen                       0.0.0.0
     Port                         24224

[OUTPUT]
     Name s3
     Match *
     bucket fluent-bit-3
     region us-west-2
#     use_put_object On
     total_file_size 5G
     retry_limit 2
     upload_timeout 10000s

Stop firelens datajet. Restart fluent bit
Expect the following logs

* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/12/07 22:56:14] [ info] [fluent bit] version=2.0.0, commit=f597b2f913, pid=6195
[2022/12/07 22:56:14] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2022/12/07 22:56:14] [ info] [cmetrics] version=0.5.2
[2022/12/07 22:56:14] [ info] [input:forward:forward.0] initializing
[2022/12/07 22:56:14] [ info] [input:forward:forward.0] storage_strategy='memory' (memory only)
[2022/12/07 22:56:14] [ info] [input:forward:forward.0] listening on 0.0.0.0:24224
[2022/12/07 22:56:14] [ info] [output:s3:s3.0] Using upload size 5000000000 bytes
[2022/12/07 22:56:14] [ info] [output:s3:s3.0] Sending locally buffered data from previous executions to S3; buffer=/tmp/fluent-bit/s3/fluent-bit-3
[2022/12/07 22:56:14] [ info] [output:s3:s3.0] Successfully uploaded object /fluent-bit-logs/tag_prefix/2022/12/07/22/56/14-objectNqROFHkf
[2022/12/07 22:56:14] [ info] [sp] stream processor started
[2022/12/07 22:56:14] [ info] [output:s3:s3.0] worker #0 started

Comment on lines 1550 to 1557
if (timep == NULL) {
m_upload->init_time = time(NULL);
flb_plg_error(ctx->ins, "Failed to get timestamp from the log record "
"used for the S3 key, use the current time instead.");
}
else {
m_upload->init_time = timep;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you set init_time to timep which is the first_log_timestamp?
This will cause more timeout related errors see

fluent-bit/plugins/out_s3/s3.c

Lines 2161 to 2165 in 760956f

if (m_upload_file != NULL && time(NULL) >
(m_upload_file->init_time + ctx->upload_timeout)) {
upload_timeout_check = FLB_TRUE;
flb_plg_info(ctx->ins, "upload_timeout reached for %s", event_chunk->tag);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return to origin

Comment on lines 1058 to 1062
else {
else if (timep == NULL) {
create_time = time(NULL);
flb_plg_error(ctx->ins, "Failed to get timestamp from the log record "
"used for the S3 key, use the current time instead.");
}
else {
create_time = timep;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is create_time being set to timep? timep is the first_log_time. It has nothing to do with the file's create_time which is used by the timers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted.

@@ -50,13 +50,14 @@ static int s3_put_object(struct flb_s3 *ctx, const char *tag, time_t create_time

static int put_all_chunks(struct flb_s3 *ctx);

static void cb_s3_upload(struct flb_config *ctx, void *data);
static void cb_s3_upload(struct flb_config *ctx, void *data, time_t *timep);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could change all the timep s to something more descriptive like "first_log_time" that would help greatly to clarify your code and logic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, could you rename each to either:

chunk_first_log_time

or

file_first_log_time

Right now it's really confusing whether you're talking about the chunk's log timestamp (for a single flush) or the file's log timestamp (for multiple flushes) I'm having trouble following the logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will rename

Copy link
Contributor

@matthewfala matthewfala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think part of the problem with the logic is that the variable names are fuzzy so it's hard to follow what the code is doing. I have some suggestions on clearer variable names that should help.

Comment on lines 2271 to 2239
if (!upload_file) {
ret = send_upload_request(ctx, chunk, upload_file,
m_upload_file,event_chunk->tag,
flb_sds_len(event_chunk->tag), &tms.tm.tv_sec);
}
else {
ret = send_upload_request(ctx, chunk, upload_file,
m_upload_file,event_chunk->tag,
flb_sds_len(event_chunk->tag),
upload_file->first_log_time);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify logic can you always pass in chunk_first_log_time to this function, and have the function decide whether to use the upload_file's time or the current time if upload_file is null or not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you should always just do:

send_upload_request(ctx, chunk, upload_file, m_upload_file,
                                      event_chunk->tag,
                                      flb_sds_len(event_chunk->tag),
                                      chunk_first_log_time
);

You need to add some logic to the send_upload request function and change the signature to

send_upload_request(..., time_t chunk_first_log_time)

Comment on lines 2291 to 2258
if (!upload_file) {
ret = buffer_chunk(ctx, upload_file, chunk, chunk_size,
event_chunk->tag, flb_sds_len(event_chunk->tag),
&tms.tm.tv_sec);
}
else {
ret = buffer_chunk(ctx, upload_file, chunk, chunk_size,
event_chunk->tag, flb_sds_len(event_chunk->tag),
upload_file->first_log_time);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this conditional logic into the buffer_chunk function. It's confusing what the timep variable means.

You should always do:

    ret = buffer_chunk(ctx, upload_file, chunk, chunk_size,
                       event_chunk->tag, flb_sds_len(event_chunk->tag),
                       time_t chunk_first_log_time
);

Comment on lines 891 to 894
if (!gmtime_r(&time, &gmt)) {
gmt_tm = gmtime_r(&time, &gmt);
if (!gmt_tm) {
flb_error("[s3_key] Failed to create timestamp.");
goto error;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point of this code change? I guess it follows convention better maybe? Not sure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I advocate for removing all unnecessary code changes in this pr.

Comment on lines 36 to 39
int s3_store_buffer_put(struct flb_s3 *ctx, struct s3_file *s3_file,
const char *tag, int tag_len,
char *data, size_t bytes);
char *data, size_t bytes,
time_t *time);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your declared function signature should match your defined function signature. in otherwords, you need to make sure the .c and .h file functions look the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's time here and timep in the other file

Comment on lines 2183 to 2194
if (!upload_file) {
unit_test_flush(ctx, upload_file,
event_chunk->tag, flb_sds_len(event_chunk->tag),
chunk, chunk_size,
m_upload_file, &tms.tm.tv_sec);
}
else {
unit_test_flush(ctx, upload_file,
event_chunk->tag, flb_sds_len(event_chunk->tag),
chunk, chunk_size,
m_upload_file, upload_file->first_log_time);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put this logic into the unit_test_flush function. you should always pass in chunk_first_log_time see other comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense

@@ -50,13 +50,14 @@ static int s3_put_object(struct flb_s3 *ctx, const char *tag, time_t create_time

static int put_all_chunks(struct flb_s3 *ctx);

static void cb_s3_upload(struct flb_config *ctx, void *data);
static void cb_s3_upload(struct flb_config *ctx, void *data, time_t *timep);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, could you rename each to either:

chunk_first_log_time

or

file_first_log_time

Right now it's really confusing whether you're talking about the chunk's log timestamp (for a single flush) or the file's log timestamp (for multiple flushes) I'm having trouble following the logic

msgpack_unpacked result;
msgpack_object *obj;
size_t off = 0;
struct flb_time tms;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename this to:

chunk_first_log_time

TMS could mean

  • Current time
  • Time of first log in file
  • Time of first log in chunk

I think changing this to chunk_first_log_time would make our logic clearer and help us to avoid/discover logic problems

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@Claych Claych force-pushed the clay-s3-time-stamp-1.9 branch 5 times, most recently from e23cca2 to d822be8 Compare December 17, 2022 02:17
msgpack_object *obj;
size_t off = 0;
struct flb_time tms;
time_t file_first_log_time;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to initialize it to zero

@Claych Claych force-pushed the clay-s3-time-stamp-1.9 branch from 52fe212 to a8a34c1 Compare January 9, 2023 23:28
@Claych Claych temporarily deployed to pr January 10, 2023 02:12 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 10, 2023 02:12 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 10, 2023 02:29 — with GitHub Actions Inactive
@Claych Claych force-pushed the clay-s3-time-stamp-1.9 branch from a8a34c1 to c563aba Compare January 10, 2023 17:23
@Claych Claych temporarily deployed to pr January 10, 2023 18:33 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 10, 2023 18:33 — with GitHub Actions Inactive

/*
* When chunk does not exist, file_first_log_time will be 0.
* This is only for unit tests. Also, this part is important, please don't remove it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why should it not be removed? You could simplify it by saying, "this prevents unit tests from segfaulting". More description would be good though.

I actually meant though to put a comment in cb_s3_flush that currently our upload functions require chunk to be non-null. To do it around this line: if ((upload_file != NULL) && (upload_timeout_check == FLB_TRUE || total_file_size_check == FLB_TRUE)) {

that the (upload_file != NULL) should not be removed.

PettitWesley
PettitWesley previously approved these changes Jan 10, 2023
@Claych Claych temporarily deployed to pr January 10, 2023 18:51 — with GitHub Actions Inactive
@Claych Claych force-pushed the clay-s3-time-stamp-1.9 branch from c563aba to f54eadf Compare January 11, 2023 18:57
time_t file_first_log_time = time(NULL);

/*
* When chunk does not exist, file_first_log_time will be 0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer true now that we changed the code, please update the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be ideal if you stated why teh unit test will crash: because the chunk will be null in it

PettitWesley
PettitWesley previously approved these changes Jan 12, 2023
@Claych Claych temporarily deployed to pr January 12, 2023 05:53 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 12, 2023 05:54 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 12, 2023 06:08 — with GitHub Actions Inactive
/* File is ready for upload */
if (upload_timeout_check == FLB_TRUE || total_file_size_check == FLB_TRUE) {
/* File is ready for upload, upload_file != NULL prevents from segfaulting. */
if ((upload_file != NULL) && (upload_timeout_check == FLB_TRUE || total_file_size_check == FLB_TRUE)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prevents unit tests from segfaulting, but not a big deal. this comment is okay

@Claych Claych temporarily deployed to pr January 17, 2023 20:54 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 17, 2023 20:54 — with GitHub Actions Inactive
@Claych Claych temporarily deployed to pr January 17, 2023 21:09 — with GitHub Actions Inactive
@edsiper edsiper merged commit 60b6884 into fluent:1.9 Feb 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants