
refactor: rewrite UDF time_bucket using date_bin and date_trunc #867

Closed
wants to merge 49 commits

Conversation

dust1
Contributor

@dust1 dust1 commented May 4, 2023

Which issue does this PR close?

Closes #558

Rationale for this change

What changes are included in this PR?

Some of the time_bucket functions are implemented using date_bin.
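For reference, date_bin assigns each timestamp to a fixed-width bucket measured from an origin. A minimal sketch of that bucketing rule in Rust (the helper name and millisecond units are illustrative only, not code from this PR):

```rust
/// Floor `ts` to the start of its bucket, where buckets are `stride` wide
/// and aligned to `origin` (all values in milliseconds since the epoch).
/// This mirrors the semantics of SQL's date_bin(stride, ts, origin).
fn date_bin_millis(stride: i64, ts: i64, origin: i64) -> i64 {
    let delta = ts - origin;
    // rem_euclid keeps the alignment correct for timestamps before the origin.
    origin + delta - delta.rem_euclid(stride)
}

fn main() {
    // 2022-08-04 01:43:43 UTC floored to a 1-day bucket (origin = epoch).
    let day = 24 * 60 * 60 * 1000;
    assert_eq!(date_bin_millis(day, 1_659_577_423_000, 0), 1_659_571_200_000);
}
```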

Are there any user-facing changes?

none

How is this change tested?

none

@dust1
Contributor Author

dust1 commented May 6, 2023

@jiacai2050 Hello, I ran into a problem: after debugging the truncate_week implementation of time_bucket, the timestamp 1659577423000 (2022-08-04 09:43:43) gets truncated to 1652918400000 (2022-05-19 08:00:00), so I modified some of the test cases.

@dust1 dust1 marked this pull request as ready for review May 6, 2023 16:14
@jiacai2050 jiacai2050 self-requested a review May 8, 2023 02:06
@jiacai2050
Contributor

@dust1 Don't worry, I will review this PR this week.

Timestamp(1657756800000),
Timestamp(1657756800000),
Timestamp(1657756800000),
Timestamp(1656259200000),
Contributor

I still have a question about this week result.
The original ts is 1659577423000, 2022-07-04 09:43:43+08

When truncated by week, it should be 2022-07-04 00:00:00+08.

Tested against PostgreSQL:

SELECT
    date_trunc('week', '2022-07-04T09:43:43+08'::timestamp with time zone),
    extract(epoch FROM date_trunc('week', '2022-07-04T09:43:43+08'::timestamp with time zone));

Output

       date_trunc       |      extract      
------------------------+-------------------
 2022-07-04 00:00:00+08 | 1656864000.000000
(1 row)

Contributor Author

It should be a time zone problem. I am currently handling it after the trunc; this part should be placed before the function call.

Contributor

👍

Also, could you update those SQL statements to something like

SELECT timestamp, time_bucket(`timestamp`, 'P1D') FROM `02_function_time_bucket_table` order by timestamp;

@dust1
Contributor Author

dust1 commented May 8, 2023

I tried to fix the time zone problem, but the implementation was a bit bad. Maybe there's a better way?🤔

array
    .iter()
    .map(|ts| {
        // Shift each timestamp by the default time zone offset (converted to nanoseconds).
        ts.map(|t| Ok(t + DEFAULT_TIMEZONE_OFFSET_SECS as i64 * 1_000_000_000))
Contributor

I don't understand this part.

Usually you first need to convert the timestamp to a local datetime, then truncate by week/month/year.

You can refer to how DataFusion implements similar functions:
https://github.com/apache/arrow-datafusion/blob/8ada7fd1e949b1ff2c9207ce2143bb19157e75cd/datafusion/physical-expr/src/datetime_expressions.rs#L218
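A minimal sketch of that convert-then-truncate approach for the week case, assuming the chrono crate and a fixed +08:00 offset; the function name and offset constant are illustrative, not code from this PR:

```rust
use chrono::{Datelike, Duration, FixedOffset, TimeZone, Timelike};

// Assumed local offset, for illustration only (+08:00).
const DEFAULT_TIMEZONE_OFFSET_SECS: i32 = 8 * 3600;

/// Truncate an epoch-millisecond timestamp to the start (Monday 00:00:00)
/// of its local ISO week, returning epoch milliseconds again.
fn truncate_week_millis(ts_millis: i64) -> Option<i64> {
    let offset = FixedOffset::east_opt(DEFAULT_TIMEZONE_OFFSET_SECS)?;
    // 1. Convert the UTC timestamp to a local datetime.
    let local = offset.timestamp_millis_opt(ts_millis).single()?;
    // 2. Truncate in local time: back to Monday, then zero the time of day.
    let days_from_monday = local.weekday().num_days_from_monday() as i64;
    let monday = (local - Duration::days(days_from_monday))
        .with_hour(0)?
        .with_minute(0)?
        .with_second(0)?
        .with_nanosecond(0)?;
    // 3. Convert back to epoch milliseconds.
    Some(monday.timestamp_millis())
}
```

For the PostgreSQL example above (2022-07-04 09:43:43+08), this should yield 1656864000000 ms, matching the extract(epoch ...) output.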

Contributor Author

date_trunc converts to UTC after receiving the timestamp, which subtracts the time zone offset, and this results in the wrong time zone. Where is my thinking wrong?😂

Contributor

If date_trunc converts to a UTC datetime, then we can't use it directly...

I will check this carefully later.

@jiacai2050
Contributor

Also, this rewrite doesn't simplify time_bucket much; I expect it should be much simpler than before.

Part of this PR converts timestamp precision, which is unnecessary in DataFusion 24.

I plan to upgrade DataFusion first, then implement this afterwards.

@jiacai2050 jiacai2050 changed the title Deprecate custom UDF time_bucket in favor of date_bin refactor: rewrite UDF time_bucket using date_bin and date_trunc May 10, 2023
@dust1
Contributor Author

dust1 commented May 11, 2023

Can we implement a date_trunc inside time_bucket first?🤔

@jiacai2050
Contributor

jiacai2050 commented May 12, 2023

Can we implement a date_trunc inside time_bucket first?🤔

I think it makes little sense to do this; the logic won't be simpler than before.

The original intention is to simplify time_bucket's implementation, so I think you can hold off on this PR for a bit; after I bump DataFusion to 24, much of this PR can be removed.

@dust1 You can try other issues first; I will ping you when this PR is ready for rework.

@jiacai2050 jiacai2050 force-pushed the issue558_replenish branch from 95810f6 to b1b62fb Compare May 17, 2023 03:54
@jiacai2050
Contributor

After bumping DataFusion in #894, there are the following issues:

This PR cannot proceed until those two issues are fixed...

jiacai2050 and others added 9 commits June 6, 2023 20:36
## Rationale
Related to apache#967; reduces CPU consumption.

## Detailed Changes
- CeresDB/xorfilter#1
- CeresDB/xorfilter#2

## Test Plan

@zouxiang1993 will benchmark this in his test env.
## Rationale
Now the implementation of `get_range` in the `OBKV`-based `ObjectStore`
may cause extra IO operations; a better way is to let table_kv provide
a `get_batch` API to avoid this.

## Detailed Changes
* Add a `get_batch` API in table_kv
* Use `get_batch` to implement `get_range` in the `OBKV`-based `ObjectStore`

## Test Plan
By unit tests
…pache#975)

## Rationale
When benchmarking, the SST iterator and xor filter build cost too much
CPU.

## Detailed Changes
- Change RowViewOnBatchColumnIter's item from Datum to DatumView

## Test Plan
I will do benchmark in my test env.
## Rationale
Close apache#914

## Detailed Changes
Use `partition lock` in `disk cache`

## Test Plan
add ut.
## Rationale
Add generic support for generating hashers.

## Detailed Changes
Add generic support for generating hashers for all `PartitionedLock`.

## Test Plan
Ut.
## Rationale
Part of apache#973

## Detailed Changes
- Add a UT for the stable hasher. (Although DefaultHasher doesn't promise a
fixed hash code across Rust releases, it should be the same for a fixed Rust version.)

## Test Plan
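As a side note on that stability claim, here is a minimal sketch of such a check using only std's DefaultHasher (illustrative only, not the UT added by that commit):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash a value with a freshly created DefaultHasher.
fn hash_one<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

#[test]
fn default_hasher_is_stable_within_one_build() {
    // Independently created DefaultHashers agree on the same input, so a
    // partition derived from the hash stays stable for a fixed Rust version,
    // even though it may change across Rust releases.
    assert_eq!(hash_one(&"partition-key"), hash_one(&"partition-key"));
}
```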
## Rationale
The Kafka client allows setting multiple Kafka bootstrap brokers; expose this in
ceresdb.

## Detailed Changes
Allow setting multiple Kafka bootstrap brokers.

## Test Plan
Test manually.
…che#981)

## Rationale

## Detailed Changes
- Add a param (partition_num) to the partition lock's init_fn.

## Test Plan
UT
zouxiang1993 and others added 27 commits August 9, 2023 14:56
## Rationale


## Detailed Changes


## Test Plan
## Rationale
Part of apache#799 
Now we run the recovery test manually, which is tiresome; this PR
adds it to the integration tests, which will be run automatically in CI.

## Detailed Changes
+ Add an integration test for recovery.
+ Add the above test to CI.

## Test Plan
None.
## Rationale
Add grpc query success counter metrics.

## Detailed Changes
- Add counter metric in grpc proxy.
- Add counter metric in grpc remote engine service.

## Test Plan
Existing tests.
## Rationale
Close apache#984 

## Detailed Changes
Add a parameter to the headers of grpc to mark that it has been
forwarded.

## Test Plan
Existing tests
## Rationale
In the current design, SST files may be picked multiple times.

## Detailed Changes
- Mark files as compacting when picking candidate files, and reset the flag
to false when the CompactionTask is dropped.

## Test Plan
Manually
## Rationale
The metadata for the arrow schema is encoded into the parquet file. However,
this part is lost when building our custom metadata.

## Detailed Changes
Keep the other metadata in the parquet metadata after extracting our
custom metadata.

## Test Plan
Add unit test `test_arrow_meta_data` for it.
## Rationale
Part of apache#990.

Some background jobs are still allowed to execute, which can lead to
data corruption when a table is migrated between different nodes,
because there are multiple writers for the same table.

## Detailed Changes
Introduce a flag called `invalid` in the table data to denote whether
the serial executor is valid. This flag is protected together with the
`TableOpSerialExecutor` in the table data, and the `TableOpSerialExecutor`
won't be acquired if the flag is set. That is to say, any table
operation, including updating the manifest, altering the table and so on,
can't be executed after the flag is set, because these operations require
the `TableOpSerialExecutor`. Finally, the flag is set when the table is
closed.
## Rationale
Now there is no page index in the `meta_data`; we should build the page
index if we want to use row selection.

## Detailed Changes
* Build the page index for `meta_data`
* Add some debug logs

## Test Plan

---------

Co-authored-by: Jiacai Liu <[email protected]>
Co-authored-by: WEI Xikai <[email protected]>
)

## Rationale
Part of apache#799 
We use `rskafka` as our kafka client.
However, I found it will retry without limit even when the Kafka service is
unavailable...
(see
[https://github.com/influxdata/rskafka/issues/65](https://github.com/influxdata/rskafka/issues/65))
Worse, I found `rskafka` is almost no longer maintained...

For quick fix, I forked it to support limited retry.

## Detailed Changes
+ Use the forked `rskafka` instead (supporting limited retry).
+ Add more logs in recovery path for better debugging.

## Test Plan
Test manually.
## Rationale
Part of apache#799 

## Detailed Changes
see title.

## Test Plan
None.
## Rationale
See title.

## Detailed Changes
See title.

## Test Plan
None.
## Rationale
See apache/incubator-horaedb-proto#81

After this, developers can set `export
CERESDBPROTO_ENABLE_VENDORED=false` to use protoc on their host.

## Detailed Changes


## Test Plan
No need.
## Rationale
close apache#1022

## Detailed Changes
Check the statements' length before parsing the table name.

## Test Plan
add ut.
## Rationale
More details about the SST are needed for troubleshooting problems.

## Detailed Changes
- Output some statistics about the file;
- Output compression information;

## Test Plan
Check the output of sst-meta tool.

---------

Co-authored-by: Ruixiang Tan <[email protected]>
This reverts commit 41fe63a.

## Rationale
apache#1000 caused some commits to go missing.

## Detailed Changes
Revert apache#1000

## Test Plan
## Rationale
Timestamp::now() produces a non-deterministic timestamp, which leads to
occasional test failures.

## Detailed Changes
Use a fixed timestamp.

## Test Plan

---------

Co-authored-by: Ruixiang Tan <[email protected]>
## Rationale
Currently, ceresdb-client uses arrow 23; bump it to the latest version to remove the old arrow dependency.

## Detailed Changes
The dependencies are updated:
```
    Removing arrow v23.0.0
    Removing arrow-buffer v23.0.0
    Updating ceresdb-client v1.0.0 -> v1.0.1
    Removing flatbuffers v2.1.2
    Removing multiversion v0.6.1
    Removing multiversion-macros v0.6.1
```

## Test Plan
## Rationale
apache#1000 was reverted in apache#1026; this creates a new one. See details in
apache#1000.

## Detailed Changes
Add a page index for metadata.
## Rationale
Currently, the underlying storage supports binary data type. However,
during the parsing and planning, the binary data is not supported.

## Detailed Changes
Support binary data type during parsing and planning stage.

## Test Plan
New unit tests and integration tests.
…pache#1034)

This reverts commit 85eb0b7.

## Rationale
The changes introduced by apache#998 are not reasonable. Another fix will
address apache#990.

## Detailed Changes
Revert apache#998
## Rationale
Currently, we attempt to flush the table that consumes the maximum
memory when the system memory usage limit is reached for either
`space_write_buffer_size` or `db_write_buffer_size`. However, if the
target table is currently undergoing flushing, its memory usage will not
be released, causing the `preprocess_flush` (freeze small memtables)
function to be repeatedly triggered. This can result in the creation of
many small SST files, potentially causing query issues.

## Detailed Changes
* Move `preprocess_flush` into `flush_job`
* Split `swith_memtables_or_suggest_duration` into 2 methods, and make
`swith_memtables` return the maximum sequence number.

## Test Plan
## Rationale
Now, with RocksDB as the WAL, it easily becomes the write bottleneck.

## Detailed Changes
1. Expose more RocksDB options to avoid write stalls.
2. Introduce RocksDB's FIFO compaction style, which makes RocksDB look
like a message queue.
(FIFO is more suitable for time-series data; maybe it will become the
default option in the future.)
## Test Plan
I will test it in my test env.

---------

Co-authored-by: WEI Xikai <[email protected]>
## Rationale
Currently we encounter many situations with small SSTs; in order to debug where
the issue is, we need to know the memtable's inner state.

## Detailed Changes
Add a metrics() method to the MemTable trait, which contains three metrics now:
raw_size, encoded_size, row_cnt.

## Test Plan
Manually; when a memtable flushes to level 0, the logs will contain metrics like
```bash
2023-06-27 16:21:21.994 INFO [analytic_engine/src/instance/flush_compaction.rs:392] Instance flush memtables to output, table:system, table_id:2199023255553, request_id:4074, mems_to_flush:FlushableMemTables { sampling_mem: None,
memtables: [MemTableState { time_range: TimeRange { inclusive_start: Timestamp(1687852800000), exclusive_end: Timestamp(1687860000000) }, id: 161, mem: 27262976, metrics: Metrics { row_raw_size: 11160000, row_encoded_size: 21840000, row_count: 120000 }, last_sequence: 3872 }] }, files_to_level0:[AddFile { level: Level(0), file: FileMeta { id: 178, size: 6806645, row_num: 120000, time_range: TimeRange { inclusive_start: Timestamp(1687852800000), exclusive_end: Timestamp(1687860000000) }, max_seq: 3872, storage_format: Columnar } }], flushed_sequence:3872
```

---------

Co-authored-by: kamille <[email protected]>
## Rationale
Part of apache#904 

## Detailed Changes


## Test Plan
1. write some data points 
```
curl --location --request POST 'http://127.0.0.1:5440/opentsdb/api/put' --data-ascii '
[
{
    "metric": "sys.cpu.nice",
    "timestamp": 1687935743000,
    "value": 18,
    "tags": {
       "host": "web01",
       "dc": "lga"
    }
},
{
    "metric": "sys.cpu.nice",
    "timestamp": 1687935743000,
    "value": 18,
    "tags": {
       "host": "web01"
    }
}
]
'
```

2. select 
```
curl --location --request POST 'http://127.0.0.1:5440/sql' --data-ascii '
SELECT * from "sys.cpu.nice"
'
```

the response: 
```
{
  "rows": [
    {
      "tsid": 1890867319031064034,
      "timestamp": 1687935743000,
      "dc": null,
      "host": "web01",
      "value": 18.0
    },
    {
      "tsid": 7054964577922029584,
      "timestamp": 1687935743000,
      "dc": "lga",
      "host": "web01",
      "value": 18.0
    }
  ]
}
```
## Rationale
Add tests for apache#1037 

## Detailed Changes


## Test Plan
## Rationale

When debugging SSTs, it's useful to check them ordered by
time/max_seq/size.

## Detailed Changes
- Add an option `sort`

## Test Plan
@tanruixiang
Member

Hey, I noticed that the previous issue has been fixed by DataFusion, but this PR is already too far behind the master branch. Would you be willing to merge the master branch into this PR? Or I can open another PR and invite you to review and co-author. Whichever way you choose, I will work with you to resolve this issue.

@jiacai2050
Contributor

Kind of stale; will open a new PR to address this.

@jiacai2050 jiacai2050 closed this Nov 23, 2023

Successfully merging this pull request may close these issues.

Deprecate custom UDF time_bucket in favor of date_bin
9 participants