[update] Update query docs of 3.0/dev, fix typo and issues (#1271)
# Versions 

- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0

# Languages

- [x] Chinese
- [x] English

---------

Signed-off-by: Yongqiang YANG <[email protected]>
Co-authored-by: Yongqiang YANG <[email protected]>
Co-authored-by: Yongqiang YANG <[email protected]>
Co-authored-by: yagagagaga <[email protected]>
Co-authored-by: zhannngchen <[email protected]>
Co-authored-by: wangtianyi2004 <[email protected]>
Co-authored-by: kkop <[email protected]>
Co-authored-by: Jake-00 <[email protected]>
Co-authored-by: smiletan <[email protected]>
Co-authored-by: hui lai <[email protected]>
Co-authored-by: wudi <[email protected]>
11 people authored Nov 7, 2024
1 parent bd5abae commit 702688d
Showing 382 changed files with 37,618 additions and 37,485 deletions.
2 changes: 1 addition & 1 deletion common_docs_zh/ecosystem/hive-hll-udf.md
@@ -26,7 +26,7 @@ under the License.

# Hive HLL UDF

- Hive HLL UDF provides UDFs such as HLL-generation operations for Hive tables. The HLL in Hive is fully consistent with Doris HLL, and HLL in Hive can be loaded into Doris via Spark HLL Load. For more about HLL, see: [Approximate Deduplication Using HLL](../query/duplicate/using-hll.md)
+ Hive HLL UDF provides UDFs such as HLL-generation operations for Hive tables. The HLL in Hive is fully consistent with Doris HLL, and HLL in Hive can be loaded into Doris via Spark HLL Load. For more about HLL, see: [Approximate Deduplication Using HLL](https://doris.apache.org/zh-CN/docs/query-acceleration/distinct-counts/using-hll/)

Function overview:
1. UDAF
@@ -28,7 +28,7 @@ under the License.

Beginning with version 0.9.0, Doris introduced an optimized replica management strategy and a richer set of replica status viewing tools. This document covers Doris data replica balancing, repair scheduling strategies, and replica management and maintenance methods, helping users more easily understand and manage the replica status in the cluster.

- > For repairing and balancing replicas of tables with the Colocation attribute, refer to [HERE](../../query/join-optimization/colocation-join.md)
+ > For repairing and balancing replicas of tables with the Colocation attribute, refer to [HERE](../../query-data/join#colocate-join)
## Terminology

2 changes: 1 addition & 1 deletion docs/compute-storage-decoupled/file-cache.md
@@ -127,7 +127,7 @@ Cache-related metrics in the SQL profile are found under SegmentIterator, including:
| RemoteIOUseTimer | Time taken to read from remote storage |
| WriteCacheIOUseTimer | Time taken to write to the File Cache |

- You can analyze query performance through [Query Performance Analysis](../query/query-analysis/query-analytics).
+ You can analyze query performance through [Query Performance Analysis](../query-acceleration/tuning/query-profile).
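
As a hedged illustration of how these metrics typically get inspected (this assumes the session variable `enable_profile` and the FE web UI's QueryProfile page; the table name is made up):

```
-- Enable profile collection for the current session, then run the query
-- whose File Cache behavior you want to inspect.
SET enable_profile = true;
SELECT count(*) FROM lineitem;  -- illustrative table
-- Then open the query's profile (e.g. the QueryProfile page of the FE web UI)
-- and look under SegmentIterator for the cache metrics listed above.
```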

## Usage

2 changes: 1 addition & 1 deletion docs/install/cluster-deployment/standard-deployment.md
@@ -290,7 +290,7 @@ This is a CIDR representation that specifies the IP used by the FE. In environme
JAVA_OPTS="-Xmx16384m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"
```

- 6. Modify the case-sensitivity parameter `lower_case_table_names`. By default, Doris is case-sensitive for table names. If you require case-insensitive table names, you must set this during cluster initialization; once initialization is complete, the setting cannot be changed. Refer to the [variable](../../query/query-variables/variables) documentation for more details on the `lower_case_table_names` setting.
+ 6. Modify the case-sensitivity parameter `lower_case_table_names`. By default, Doris is case-sensitive for table names. If you require case-insensitive table names, you must set this during cluster initialization; once initialization is complete, the setting cannot be changed. Refer to the variable documentation for more details on the `lower_case_table_names` setting.
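
A minimal sketch of checking and setting this variable over the MySQL protocol (the value semantics below follow MySQL conventions and are an assumption to verify against the Doris variable docs):

```
-- Check the current setting:
SHOW VARIABLES LIKE 'lower_case_table_names';
-- Assumed values, following MySQL conventions:
--   0 = table names are case-sensitive (Doris default)
--   1 = stored in lowercase, compared case-insensitively
--   2 = stored as specified, compared case-insensitively
-- Set once during cluster initialization, before any tables are created;
-- it cannot be changed afterwards:
SET GLOBAL lower_case_table_names = 1;
```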

**Start FE process**

@@ -101,15 +101,6 @@ Parameters:

The first parameter is the bitmap column, the second parameter is the dimension column used for filtering, and the third is a variable-length parameter list of values of the filter dimension column.

```
mysql> select orthogonal_bitmap_intersect(members, tag_group, 1150000, 1150001, 390006) from tag_map where tag_group in ( 1150000, 1150001, 390006);
+-------------------------------------------------------------------------------+
| orthogonal_bitmap_intersect(`members`, `tag_group`, 1150000, 1150001, 390006) |
+-------------------------------------------------------------------------------+
| NULL                                                                          |
+-------------------------------------------------------------------------------+
```
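
For context, the examples and explanations here assume an aggregate table whose bitmaps are orthogonally bucketed. A minimal illustrative sketch (the real schema is defined earlier in the source document, outside this hunk; names and properties are assumptions):

```
-- The `hid` bucketing column keeps user-ID ranges disjoint across buckets,
-- which is what makes the two-level "orthogonal" aggregation correct.
CREATE TABLE tag_map (
    tag_group BIGINT COMMENT "user tag",
    hid       SMALLINT COMMENT "bucket id, derived from the user-id range",
    members   BITMAP BITMAP_UNION COMMENT "user id set"
)
AGGREGATE KEY (tag_group, hid)
DISTRIBUTED BY HASH (hid) BUCKETS 3
PROPERTIES ("replication_num" = "1");
```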

Explain:

Based on this table schema, this function performs two levels of aggregation in query planning. In the first level, the BE nodes (update and serialize steps) first hash-aggregate the keys by filter_values, then intersect the bitmaps of all keys. The intersection results are serialized and sent to the second-level BE nodes (merge and finalize steps), which iteratively merge all the bitmap values received from the first-level nodes.
@@ -133,15 +124,6 @@

The first parameter is the bitmap column, the second parameter is the dimension column used for filtering, and the third is a variable-length parameter list of values of the filter dimension column.

```
mysql> select orthogonal_bitmap_intersect_count(members, tag_group, 1150000, 1150001, 390006) from tag_map where tag_group in ( 1150000, 1150001, 390006);
+-------------------------------------------------------------------------------------+
| orthogonal_bitmap_intersect_count(`members`, `tag_group`, 1150000, 1150001, 390006) |
+-------------------------------------------------------------------------------------+
| 0                                                                                   |
+-------------------------------------------------------------------------------------+
```

Explain:

Based on this table schema, the aggregation in the query plan is divided into two layers. In the first layer, the BE nodes (update and serialize steps) first hash-aggregate the keys by filter_values, then intersect the bitmaps of all keys and count the result. The count values are serialized and sent to the second-layer BE nodes (merge and finalize steps), which iteratively sum all the count values received from the first-layer nodes.
@@ -155,15 +137,6 @@

orthogonal_bitmap_union_count(bitmap_column)

```
mysql> select orthogonal_bitmap_union_count(members) from tag_map where tag_group in ( 1150000, 1150001, 390006);
+------------------------------------------+
| orthogonal_bitmap_union_count(`members`) |
+------------------------------------------+
| 286957811                                |
+------------------------------------------+
```

Explain:

Based on this table schema, this function works in two layers. In the first layer, the BE nodes (update and serialize steps) merge all the bitmaps and then count the resulting bitmap. The count values are serialized and sent to the second-layer BE nodes (merge and finalize steps), which sum all the count values received from the first-layer nodes.
@@ -182,16 +155,6 @@

the calculators supported by the expression: & represents intersection calculation, | represents union calculation, - represents difference calculation, ^ represents XOR calculation, and \ represents escape characters

```
select orthogonal_bitmap_expr_calculate(user_id, tag, '(833736|999777)&(1308083|231207)&(1000|20000-30000)') from user_tag_bitmap where tag in (833736,999777,130808,231207,1000,20000,30000);
Note: 1000, 20000, and 30000 are integer tags representing different user labels
```

```
select orthogonal_bitmap_expr_calculate(user_id, tag, '(A:a/b|B:2\\-4)&(C:1-D:12)&E:23') from user_str_tag_bitmap where tag in ('A:a/b', 'B:2-4', 'C:1', 'D:12', 'E:23');
Note: 'A:a/b', 'B:2-4', etc. are string-typed tags representing different user labels; 'B:2-4' must be escaped as 'B:2\\-4' because '-' is an operator
```

Explain:

The aggregation in the query plan is divided into two layers. The first layer of BE aggregation nodes performs the init, update, and serialize steps; the second layer performs the merge and finalize steps. In the first layer, the init phase parses the input string, converts it into a postfix (reverse Polish) expression, parses out the keys to be computed, and initializes them in a map<key, bitmap> structure. In the update phase, the storage engine scans the dimension column (filter_column) and calls back the update function, aggregating bitmaps into the map from the previous step, keyed by the computed key. In the serialize phase, the bitmaps of the key columns are evaluated against the postfix expression, computing bitmap intersections, unions, and differences using the last-in-first-out behavior of a stack. The final bitmap is then serialized and sent to the second-layer aggregation BE node, which unions all the bitmap values from the first-layer nodes and returns the final bitmap result.
@@ -204,16 +167,6 @@

orthogonal_bitmap_expr_calculate_count(bitmap_column, filter_column, input_string)

```
select orthogonal_bitmap_expr_calculate_count(user_id, tag, '(833736|999777)&(1308083|231207)&(1000|20000-30000)') from user_tag_bitmap where tag in (833736,999777,130808,231207,1000,20000,30000);
Note: 1000, 20000, and 30000 are integer tags representing different user labels
```

```
select orthogonal_bitmap_expr_calculate_count(user_id, tag, '(A:a/b|B:2\\-4)&(C:1-D:12)&E:23') from user_str_tag_bitmap where tag in ('A:a/b', 'B:2-4', 'C:1', 'D:12', 'E:23');
Note: 'A:a/b', 'B:2-4', etc. are string-typed tags representing different user labels; 'B:2-4' must be escaped as 'B:2\\-4' because '-' is an operator
```

Explain:

The aggregation in the query plan is divided into two layers. The first layer of BE aggregation nodes performs the init, update, and serialize steps; the second layer performs the merge and finalize steps. In the first layer, the init phase parses the input string, converts it into a postfix (reverse Polish) expression, parses out the keys to be computed, and initializes them in a map<key, bitmap> structure. In the update phase, the storage engine scans the dimension column (filter_column) and calls back the update function, aggregating bitmaps into the map from the previous step, keyed by the computed key. In the serialize phase, the bitmaps of the key columns are evaluated against the postfix expression, computing bitmap intersections, unions, and differences using the last-in-first-out behavior of a stack; the count of the final bitmap is then serialized and sent to the second-layer aggregation BE node, which sums all the count values from the first-layer nodes and returns the final count result.
File renamed without changes.