-
Notifications
You must be signed in to change notification settings - Fork 200
Doc: rename ideas.yml to ideas-feature-requests.yml #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Rename ideas.yml to ideas-feature-requests.yml to let GitHub Discussions category template work.
TomShawn
approved these changes
Jul 4, 2023
Contributor
TomShawn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
TomShawn
approved these changes
Jul 4, 2023
Contributor
TomShawn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
yjhjstz
approved these changes
Jul 4, 2023
avamingli
pushed a commit
to avamingli/cloudberrydb
that referenced
this pull request
Jan 21, 2025
## Problem
An error occurs in python lib when a plpython function is executed.
After our analysis, in the user's cluster, a plpython UDF
was running with the unstable network, and got a timeout error:
`failed to acquire resources on one or more segments`.
Then a plpython UDF was run in the same session, and the UDF
failed with GC error.
Here is the core dump:
```
2023-11-24 10:15:18.945507 CST,,,p2705198,th2081832064,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
#0 0x7f7c68b6d55b in frame_dealloc /home/cc/repo/cpython/Objects/frameobject.c:509:5
#1 0x7f7c68b5109d in gen_send_ex /home/cc/repo/cpython/Objects/genobject.c:108:9
#2 0x7f7c68af9ddd in PyIter_Next /home/cc/repo/cpython/Objects/abstract.c:3118:14
#3 0x7f7c78caa5c0 in PLy_exec_function /home/cc/repo/gpdb6/src/pl/plpython/plpy_exec.c:134:11
#4 0x7f7c78cb5ffb in plpython_call_handler /home/cc/repo/gpdb6/src/pl/plpython/plpy_main.c:387:13
#5 0x562f5e008bb5 in ExecMakeTableFunctionResult /home/cc/repo/gpdb6/src/backend/executor/execQual.c:2395:13
#6 0x562f5e0dddec in FunctionNext_guts /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:142:5
#7 0x562f5e0da094 in FunctionNext /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:350:11
apache#8 0x562f5e03d4b0 in ExecScanFetch /home/cc/repo/gpdb6/src/backend/executor/execScan.c:84:9
apache#9 0x562f5e03cd8f in ExecScan /home/cc/repo/gpdb6/src/backend/executor/execScan.c:154:10
#10 0x562f5e0da072 in ExecFunctionScan /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:380:9
apache#11 0x562f5e001a1c in ExecProcNode /home/cc/repo/gpdb6/src/backend/executor/execProcnode.c:1071:13
apache#12 0x562f5dfe6377 in ExecutePlan /home/cc/repo/gpdb6/src/backend/executor/execMain.c:3202:10
apache#13 0x562f5dfe5bf4 in standard_ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:1171:5
apache#14 0x562f5dfe4877 in ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:992:4
apache#15 0x562f5e857e69 in PortalRunSelect /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1164:4
apache#16 0x562f5e856d3f in PortalRun /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1005:18
apache#17 0x562f5e84607a in exec_simple_query /home/cc/repo/gpdb6/src/backend/tcop/postgres.c:1848:10
```
## Reproduce
We can use a simple procedure to reproduce the above problem:
- set timeout GUC: `gpconfig -c gp_segment_connect_timeout -v 5` and `gpstop -ari`
- prepare function:
```
CREATE EXTENSION plpythonu;
CREATE OR REPLACE FUNCTION test_func() RETURNS SETOF int AS
$$
plpy.execute("select pg_backend_pid()")
for i in range(0, 5):
yield (i)
$$ LANGUAGE plpythonu;
```
- exit from the current psql session.
- stop the postmaster of segment: `gdb -p "the pid of segment postmaster"`
- enter a psql session.
- call `SELECT test_func();` and get error
```
gpadmin=# select test_func();
ERROR: function "test_func" error fetching next item from iterator (plpy_elog.c:121)
DETAIL: Exception: failed to acquire resources on one or more segments
CONTEXT: Traceback (most recent call last):
PL/Python function "test_func"
```
- quit gdb and make postmaster runnable.
- call `SELECT test_func();` again and get panic
```
gpadmin=# SELECT test_func();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
```
## Analysis
- There is an SPI call in test_func(): `plpy.execute()`.
- Then coordinator will start a subtransaction by PLy_spi_subtransaction_begin();
- Meanwhile, if the segment cannot receive the instruction from the coordinator,
the subtransaction beginning procedure return fails.
- BUT! The Python processor does not know whether an error happened and
does not clean its environment.
- Then the next plpython UDF in the same session will fail due to the wrong
Python environment.
## Solution
- Use try-catch to catch the exception caused by PLy_spi_subtransaction_begin()
- set the python error indicator by PLy_spi_exception_set()
Co-authored-by: Chen Mulong <chenmulong@gmail.com>
avamingli
pushed a commit
to avamingli/cloudberrydb
that referenced
this pull request
Jan 21, 2025
## Problem
An error occurs in python lib when a plpython function is executed.
After our analysis, in the user's cluster, a plpython UDF
was running with the unstable network, and got a timeout error:
`failed to acquire resources on one or more segments`.
Then a plpython UDF was run in the same session, and the UDF
failed with GC error.
Here is the core dump:
```
2023-11-24 10:15:18.945507 CST,,,p2705198,th2081832064,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
#0 0x7f7c68b6d55b in frame_dealloc /home/cc/repo/cpython/Objects/frameobject.c:509:5
#1 0x7f7c68b5109d in gen_send_ex /home/cc/repo/cpython/Objects/genobject.c:108:9
#2 0x7f7c68af9ddd in PyIter_Next /home/cc/repo/cpython/Objects/abstract.c:3118:14
#3 0x7f7c78caa5c0 in PLy_exec_function /home/cc/repo/gpdb6/src/pl/plpython/plpy_exec.c:134:11
#4 0x7f7c78cb5ffb in plpython_call_handler /home/cc/repo/gpdb6/src/pl/plpython/plpy_main.c:387:13
#5 0x562f5e008bb5 in ExecMakeTableFunctionResult /home/cc/repo/gpdb6/src/backend/executor/execQual.c:2395:13
#6 0x562f5e0dddec in FunctionNext_guts /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:142:5
#7 0x562f5e0da094 in FunctionNext /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:350:11
apache#8 0x562f5e03d4b0 in ExecScanFetch /home/cc/repo/gpdb6/src/backend/executor/execScan.c:84:9
apache#9 0x562f5e03cd8f in ExecScan /home/cc/repo/gpdb6/src/backend/executor/execScan.c:154:10
#10 0x562f5e0da072 in ExecFunctionScan /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:380:9
apache#11 0x562f5e001a1c in ExecProcNode /home/cc/repo/gpdb6/src/backend/executor/execProcnode.c:1071:13
apache#12 0x562f5dfe6377 in ExecutePlan /home/cc/repo/gpdb6/src/backend/executor/execMain.c:3202:10
apache#13 0x562f5dfe5bf4 in standard_ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:1171:5
apache#14 0x562f5dfe4877 in ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:992:4
apache#15 0x562f5e857e69 in PortalRunSelect /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1164:4
apache#16 0x562f5e856d3f in PortalRun /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1005:18
apache#17 0x562f5e84607a in exec_simple_query /home/cc/repo/gpdb6/src/backend/tcop/postgres.c:1848:10
```
## Reproduce
We can use a simple procedure to reproduce the above problem:
- set timeout GUC: `gpconfig -c gp_segment_connect_timeout -v 5` and `gpstop -ari`
- prepare function:
```
CREATE EXTENSION plpythonu;
CREATE OR REPLACE FUNCTION test_func() RETURNS SETOF int AS
$$
plpy.execute("select pg_backend_pid()")
for i in range(0, 5):
yield (i)
$$ LANGUAGE plpythonu;
```
- exit from the current psql session.
- stop the postmaster of segment: `gdb -p "the pid of segment postmaster"`
- enter a psql session.
- call `SELECT test_func();` and get error
```
gpadmin=# select test_func();
ERROR: function "test_func" error fetching next item from iterator (plpy_elog.c:121)
DETAIL: Exception: failed to acquire resources on one or more segments
CONTEXT: Traceback (most recent call last):
PL/Python function "test_func"
```
- quit gdb and make postmaster runnable.
- call `SELECT test_func();` again and get panic
```
gpadmin=# SELECT test_func();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
```
## Analysis
- There is an SPI call in test_func(): `plpy.execute()`.
- Then coordinator will start a subtransaction by PLy_spi_subtransaction_begin();
- Meanwhile, if the segment cannot receive the instruction from the coordinator,
the subtransaction beginning procedure return fails.
- BUT! The Python processor does not know whether an error happened and
does not clean its environment.
- Then the next plpython UDF in the same session will fail due to the wrong
Python environment.
## Solution
- Use try-catch to catch the exception caused by PLy_spi_subtransaction_begin()
- set the python error indicator by PLy_spi_exception_set()
Co-authored-by: Chen Mulong <chenmulong@gmail.com>
avamingli
pushed a commit
to avamingli/cloudberrydb
that referenced
this pull request
Jan 22, 2025
## Problem
An error occurs in python lib when a plpython function is executed.
After our analysis, in the user's cluster, a plpython UDF
was running with the unstable network, and got a timeout error:
`failed to acquire resources on one or more segments`.
Then a plpython UDF was run in the same session, and the UDF
failed with GC error.
Here is the core dump:
```
2023-11-24 10:15:18.945507 CST,,,p2705198,th2081832064,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
#0 0x7f7c68b6d55b in frame_dealloc /home/cc/repo/cpython/Objects/frameobject.c:509:5
#1 0x7f7c68b5109d in gen_send_ex /home/cc/repo/cpython/Objects/genobject.c:108:9
#2 0x7f7c68af9ddd in PyIter_Next /home/cc/repo/cpython/Objects/abstract.c:3118:14
#3 0x7f7c78caa5c0 in PLy_exec_function /home/cc/repo/gpdb6/src/pl/plpython/plpy_exec.c:134:11
#4 0x7f7c78cb5ffb in plpython_call_handler /home/cc/repo/gpdb6/src/pl/plpython/plpy_main.c:387:13
#5 0x562f5e008bb5 in ExecMakeTableFunctionResult /home/cc/repo/gpdb6/src/backend/executor/execQual.c:2395:13
#6 0x562f5e0dddec in FunctionNext_guts /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:142:5
#7 0x562f5e0da094 in FunctionNext /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:350:11
apache#8 0x562f5e03d4b0 in ExecScanFetch /home/cc/repo/gpdb6/src/backend/executor/execScan.c:84:9
apache#9 0x562f5e03cd8f in ExecScan /home/cc/repo/gpdb6/src/backend/executor/execScan.c:154:10
#10 0x562f5e0da072 in ExecFunctionScan /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:380:9
apache#11 0x562f5e001a1c in ExecProcNode /home/cc/repo/gpdb6/src/backend/executor/execProcnode.c:1071:13
apache#12 0x562f5dfe6377 in ExecutePlan /home/cc/repo/gpdb6/src/backend/executor/execMain.c:3202:10
apache#13 0x562f5dfe5bf4 in standard_ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:1171:5
apache#14 0x562f5dfe4877 in ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:992:4
apache#15 0x562f5e857e69 in PortalRunSelect /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1164:4
apache#16 0x562f5e856d3f in PortalRun /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1005:18
apache#17 0x562f5e84607a in exec_simple_query /home/cc/repo/gpdb6/src/backend/tcop/postgres.c:1848:10
```
## Reproduce
We can use a simple procedure to reproduce the above problem:
- set timeout GUC: `gpconfig -c gp_segment_connect_timeout -v 5` and `gpstop -ari`
- prepare function:
```
CREATE EXTENSION plpythonu;
CREATE OR REPLACE FUNCTION test_func() RETURNS SETOF int AS
$$
plpy.execute("select pg_backend_pid()")
for i in range(0, 5):
yield (i)
$$ LANGUAGE plpythonu;
```
- exit from the current psql session.
- stop the postmaster of segment: `gdb -p "the pid of segment postmaster"`
- enter a psql session.
- call `SELECT test_func();` and get error
```
gpadmin=# select test_func();
ERROR: function "test_func" error fetching next item from iterator (plpy_elog.c:121)
DETAIL: Exception: failed to acquire resources on one or more segments
CONTEXT: Traceback (most recent call last):
PL/Python function "test_func"
```
- quit gdb and make postmaster runnable.
- call `SELECT test_func();` again and get panic
```
gpadmin=# SELECT test_func();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
```
## Analysis
- There is an SPI call in test_func(): `plpy.execute()`.
- Then coordinator will start a subtransaction by PLy_spi_subtransaction_begin();
- Meanwhile, if the segment cannot receive the instruction from the coordinator,
the subtransaction beginning procedure return fails.
- BUT! The Python processor does not know whether an error happened and
does not clean its environment.
- Then the next plpython UDF in the same session will fail due to the wrong
Python environment.
## Solution
- Use try-catch to catch the exception caused by PLy_spi_subtransaction_begin()
- set the python error indicator by PLy_spi_exception_set()
Co-authored-by: Chen Mulong <chenmulong@gmail.com>
avamingli
pushed a commit
that referenced
this pull request
Jan 23, 2025
## Problem
An error occurs in python lib when a plpython function is executed.
After our analysis, in the user's cluster, a plpython UDF
was running with the unstable network, and got a timeout error:
`failed to acquire resources on one or more segments`.
Then a plpython UDF was run in the same session, and the UDF
failed with GC error.
Here is the core dump:
```
2023-11-24 10:15:18.945507 CST,,,p2705198,th2081832064,,,,0,,,seg-1,,,,,"LOG","00000","3rd party error log:
#0 0x7f7c68b6d55b in frame_dealloc /home/cc/repo/cpython/Objects/frameobject.c:509:5
#1 0x7f7c68b5109d in gen_send_ex /home/cc/repo/cpython/Objects/genobject.c:108:9
#2 0x7f7c68af9ddd in PyIter_Next /home/cc/repo/cpython/Objects/abstract.c:3118:14
#3 0x7f7c78caa5c0 in PLy_exec_function /home/cc/repo/gpdb6/src/pl/plpython/plpy_exec.c:134:11
#4 0x7f7c78cb5ffb in plpython_call_handler /home/cc/repo/gpdb6/src/pl/plpython/plpy_main.c:387:13
#5 0x562f5e008bb5 in ExecMakeTableFunctionResult /home/cc/repo/gpdb6/src/backend/executor/execQual.c:2395:13
#6 0x562f5e0dddec in FunctionNext_guts /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:142:5
#7 0x562f5e0da094 in FunctionNext /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:350:11
#8 0x562f5e03d4b0 in ExecScanFetch /home/cc/repo/gpdb6/src/backend/executor/execScan.c:84:9
#9 0x562f5e03cd8f in ExecScan /home/cc/repo/gpdb6/src/backend/executor/execScan.c:154:10
#10 0x562f5e0da072 in ExecFunctionScan /home/cc/repo/gpdb6/src/backend/executor/nodeFunctionscan.c:380:9
#11 0x562f5e001a1c in ExecProcNode /home/cc/repo/gpdb6/src/backend/executor/execProcnode.c:1071:13
#12 0x562f5dfe6377 in ExecutePlan /home/cc/repo/gpdb6/src/backend/executor/execMain.c:3202:10
#13 0x562f5dfe5bf4 in standard_ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:1171:5
#14 0x562f5dfe4877 in ExecutorRun /home/cc/repo/gpdb6/src/backend/executor/execMain.c:992:4
#15 0x562f5e857e69 in PortalRunSelect /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1164:4
#16 0x562f5e856d3f in PortalRun /home/cc/repo/gpdb6/src/backend/tcop/pquery.c:1005:18
#17 0x562f5e84607a in exec_simple_query /home/cc/repo/gpdb6/src/backend/tcop/postgres.c:1848:10
```
## Reproduce
We can use a simple procedure to reproduce the above problem:
- set timeout GUC: `gpconfig -c gp_segment_connect_timeout -v 5` and `gpstop -ari`
- prepare function:
```
CREATE EXTENSION plpythonu;
CREATE OR REPLACE FUNCTION test_func() RETURNS SETOF int AS
$$
plpy.execute("select pg_backend_pid()")
for i in range(0, 5):
yield (i)
$$ LANGUAGE plpythonu;
```
- exit from the current psql session.
- stop the postmaster of segment: `gdb -p "the pid of segment postmaster"`
- enter a psql session.
- call `SELECT test_func();` and get error
```
gpadmin=# select test_func();
ERROR: function "test_func" error fetching next item from iterator (plpy_elog.c:121)
DETAIL: Exception: failed to acquire resources on one or more segments
CONTEXT: Traceback (most recent call last):
PL/Python function "test_func"
```
- quit gdb and make postmaster runnable.
- call `SELECT test_func();` again and get panic
```
gpadmin=# SELECT test_func();
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
```
## Analysis
- There is an SPI call in test_func(): `plpy.execute()`.
- Then coordinator will start a subtransaction by PLy_spi_subtransaction_begin();
- Meanwhile, if the segment cannot receive the instruction from the coordinator,
the subtransaction beginning procedure return fails.
- BUT! The Python processor does not know whether an error happened and
does not clean its environment.
- Then the next plpython UDF in the same session will fail due to the wrong
Python environment.
## Solution
- Use try-catch to catch the exception caused by PLy_spi_subtransaction_begin()
- set the python error indicator by PLy_spi_exception_set()
Co-authored-by: Chen Mulong <chenmulong@gmail.com>
12 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Rename ideas.yml to ideas-feature-requests.yml to let GitHub Discussions category template work.
Change logs
rename ideas.yml to ideas-feature-requests.yml
Why are the changes needed?
Let GitHub Discussions Ideas/Feature-request template work.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
No core code change. Don't need test.
Contributor's Checklist
Here are some reminders before you submit the pull request:
make installcheckmake -C src/test installcheck-cbdb-parallel