[Bug] Data queried from an external table defined with an error limit changed unexpectedly #573

@congxuebin

Description

Cloudberry Database version

PostgreSQL 14.4 (Cloudberry Database 1.0.0+ed64982 build commit:ed64982034245e8b46926b75f349a4a5d5b8fd67)

What happened

A testrepo test case failed:
ErrorLogTests.test_limit ... 644.98 ms ... FAIL

It is related to PR #320.

DDL

CREATE EXTERNAL TABLE exttab_limit_1( i int, j text ) 
LOCATION ('gpfdist://localhost:8080/exttab_limit_1.tbl') FORMAT 'TEXT' (DELIMITER '|') 
LOG ERRORS SEGMENT REJECT LIMIT 10;
CREATE EXTERNAL TABLE
-- Generate the file with lot of errors
\! python3 /code/cbdb_testrepo_src/mpp/gpdb/tests/queries/basic/exttab/errlog/sql/datagen.py 200 50 > /code/cbdb_testrepo_src/mpp/gpdb/tests/queries/basic/exttab/errlog/data//exttab_limit_2.tbl
-- reaches reject limit, use the same err table
CREATE EXTERNAL TABLE exttab_limit_2( i int, j text ) 
LOCATION ('gpfdist://localhost:8080/exttab_limit_2.tbl') FORMAT 'TEXT' (DELIMITER '|') 
LOG ERRORS SEGMENT REJECT LIMIT 2;
CREATE EXTERNAL TABLE
-- Test: LIMIT queries without segment reject limit reached
-- Note that even though we use exttab_limit_2 here , the LIMIT 3 will not throw a segment reject limit error
-- order 0
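
The note above relies on the scan stopping early: once LIMIT 3 is satisfied, no more rows are read, so the reject limit of exttab_limit_2 is never reached. A minimal sketch of that reasoning follows. It is a toy single-process model, not how Cloudberry executes the query; the `scan` function, the `RejectLimitReached` exception, the sample lines, and the rule that the scan aborts when the rejected-row count reaches the limit are all assumptions made for illustration.

```python
from itertools import islice


class RejectLimitReached(Exception):
    """Raised when the number of malformed rows reaches the reject limit."""


def scan(lines, reject_limit):
    """Lazily parse 'i|j' lines, skipping malformed ones until the limit is hit."""
    rejected = 0
    for line in lines:
        fields = line.split("|")
        try:
            if len(fields) != 2:
                raise ValueError("wrong number of columns")
            row = (int(fields[0]), fields[1])
        except ValueError:
            rejected += 1
            # Assumed rule: abort as soon as the count reaches the limit.
            if rejected >= reject_limit:
                raise RejectLimitReached(f"{rejected} rows rejected")
            continue
        yield row


# Hypothetical data: three good rows followed by malformed ones.
lines = ["0|0_number", "1|1_number", "2|2_number", "error_0", "error_1", "error_2"]

# LIMIT 3: the consumer stops pulling after three rows, so no bad row is read.
first_three = list(islice(scan(lines, reject_limit=2), 3))
print(first_three)

# No LIMIT: the scan reads every line and hits the reject limit.
try:
    list(scan(lines, reject_limit=2))
    full_scan_failed = False
except RejectLimitReached:
    full_scan_failed = True
print(full_scan_failed)
```

In this model, which rows come back depends on how far the scan gets before the consumer stops pulling, which is the kind of behavior this test case is sensitive to.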

Query expected result:

with cte1 as 
(
SELECT e1.i, e2.j FROM exttab_limit_1 e1, exttab_limit_1 e2
WHERE e1.i = e2.i LIMIT 5
)
SELECT * FROM cte1, exttab_limit_2 e3 where cte1.i = e3.i LIMIT 3;
psql:/code/cbdb_testrepo_src/mpp/gpdb/tests/queries/basic/exttab/errlog/output/limit_planner.sql:33: NOTICE:  found 5 data formatting errors (5 or more input rows), rejected related input data
 i |    j     | i |    j     
---+----------+---+----------
 0 | 0_number | 0 | 0_number
 1 | 1_number | 1 | 1_number
 5 | 5_number | 5 | 5_number
(3 rows)

Query actual result:

with cte1 as 
(
SELECT e1.i, e2.j FROM exttab_limit_1 e1, exttab_limit_1 e2
WHERE e1.i = e2.i LIMIT 5
)
SELECT * FROM cte1, exttab_limit_2 e3 where cte1.i = e3.i LIMIT 3;
psql:/code/cbdb_testrepo_src/mpp/gpdb/tests/queries/basic/exttab/errlog/output/limit_planner.sql:33: NOTICE:  found 5 data formatting errors (5 or more input rows), rejected related input data
 i |    j     | i |    j     
---+----------+---+----------
 0 | 0_number | 0 | 0_number
 1 | 1_number | 1 | 1_number
 4 | 4_number | 4 | 4_number
(3 rows)
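
The two outputs differ only in the third row. A small sketch, assuming both results are pasted in as psql aligned text, that isolates which values of `i` are missing and which are unexpected; the helper name `parse_psql_rows` is hypothetical and not part of testrepo.

```python
def parse_psql_rows(text):
    """Parse psql aligned output into tuples, skipping header, separator, and footer."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    rows = []
    for ln in lines[2:]:  # skip the column header and the ---+--- separator
        if ln.strip().startswith("("):  # "(3 rows)" footer
            break
        rows.append(tuple(cell.strip() for cell in ln.split("|")))
    return rows


expected = """
 i |    j     | i |    j
---+----------+---+----------
 0 | 0_number | 0 | 0_number
 1 | 1_number | 1 | 1_number
 5 | 5_number | 5 | 5_number
(3 rows)
"""

actual = """
 i |    j     | i |    j
---+----------+---+----------
 0 | 0_number | 0 | 0_number
 1 | 1_number | 1 | 1_number
 4 | 4_number | 4 | 4_number
(3 rows)
"""

# Compare on the join key (first column) only.
expected_keys = {int(row[0]) for row in parse_psql_rows(expected)}
actual_keys = {int(row[0]) for row in parse_psql_rows(actual)}

missing = sorted(expected_keys - actual_keys)     # expected but not returned
unexpected = sorted(actual_keys - expected_keys)  # returned but not expected
print("missing:", missing, "unexpected:", unexpected)
```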

What you think should happen instead

No response

How to reproduce

Go to the testrepo folder:

cd /code/cbdb_testrepo_src/

Run the testrepo test case to reproduce:

/code/cbdb_testrepo_src/test_framework/tinc.py discover /code/cbdb_testrepo_src/mpp/gpdb/tests/queries/basic/exttab/errlog

Operating System

CentOS 7

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Labels

priority: High, type: Bug