This repository was archived by the owner on Mar 17, 2026. It is now read-only.

pre-commit: PR147540 #2560

Closed
zyw-bot wants to merge 3 commits into main from test-run16146492295

Conversation

Collaborator

zyw-bot commented Jul 8, 2025

Link: llvm/llvm-project#147540
Requested by: @nikic

@github-actions github-actions bot mentioned this pull request Jul 8, 2025
Collaborator Author

zyw-bot commented Jul 8, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@875581b
patch: llvm/llvm-project#147540
sha256: e418e4a17128ea8aa825ce2bb0d751881f2c1daad2006dd67522b6c694eb4a73
commit: a3f8169

942 files changed, 941926 insertions(+), 981883 deletions(-)

Improvements:
  loop-simplifycfg.NumTerminatorsFolded 10648 -> 10660 +0.11%
  loop-idiom.NumMemCpy 9781 -> 9791 +0.10%
  div-rem-pairs.NumHoisted 3401 -> 3403 +0.06%
  simple-loop-unswitch.NumSwitches 2031 -> 2032 +0.05%
  gvn.NumPRELoopLoad 2047 -> 2048 +0.05%
  simplifycfg.NumBitMaps 2344 -> 2345 +0.04%
  correlated-value-propagation.NumAShrsConverted 5388 -> 5390 +0.04%
  instcombine.NegatorMaxInstructionsCreated 17170 -> 17176 +0.03%
  loop-instsimplify.NumSimplified 196996 -> 197062 +0.03%
  simple-loop-unswitch.NumBranches 108879 -> 108912 +0.03%
Regressions:
  instcombine.NumFactor 46917 -> 46662 -0.54%
  dse.NumRedundantStores 37600 -> 37451 -0.40%
  dse.NumFastOther 517679 -> 516556 -0.22%
  aggressive-instcombine.NumExprsReduced 22618 -> 22573 -0.20%
  correlated-value-propagation.NumNNeg 105280 -> 105108 -0.16%
  correlated-value-propagation.NumShlNUW 163502 -> 163241 -0.16%
  bdce.NumSimplified 6046 -> 6037 -0.15%
  correlated-value-propagation.NumShlNW 290186 -> 289770 -0.14%
  dse.NumGetDomMemoryDefPassed 1410060 -> 1408302 -0.12%
  correlated-value-propagation.NumShlNSW 126684 -> 126529 -0.12%
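
The percentages above appear to be the relative change of each statistic counter against the baseline value. A minimal sketch for recomputing them, assuming each line has the shape `name old -> new pct%`; the helper names `parse_stat_line` and `relative_change` are hypothetical and are not part of the bot's tooling:

```python
import re

# Assumed line shape: "<pass>.<counter> <old> -> <new> <signed percent>%"
STAT_LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s*->\s*(\d+)\s+([+-]\d+(?:\.\d+)?)%\s*$")


def parse_stat_line(line):
    """Return (name, old, new, reported_percent), or None if the line does not match."""
    match = STAT_LINE.match(line)
    if match is None:
        return None
    name, old, new, pct = match.groups()
    return name, int(old), int(new), float(pct)


def relative_change(old, new):
    """Relative change of new against old, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    sample = [
        "  loop-simplifycfg.NumTerminatorsFolded 10648 -> 10660 +0.11%",
        "  instcombine.NumFactor 46917 -> 46662 -0.54%",
    ]
    for line in sample:
        name, old, new, reported = parse_stat_line(line)
        recomputed = relative_change(old, new)
        print(f"{name}: recomputed {recomputed:+.2f}%, reported {reported:+.2f}%")
```

Note that the "Improvements" and "Regressions" labels follow the sign of the counter change only; a lower counter does not by itself show that the generated code is worse.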

1 6 bench/abc/optimized/mioUtils.ll
48 74 bench/actix-rs/optimized/1v3445utu4y7ica.ll
22 41 bench/actix-rs/optimized/4i8sqy4dbcgvpe7w.ll
10 18 bench/annoy/optimized/annoymodule.ll
3 17 bench/arrow/optimized/chunk_resolver.ll
4 14 bench/arrow/optimized/random.ll
10 29 bench/assimp/optimized/LimitBoneWeightsProcess.ll
1 14 bench/boost/optimized/ipv6_address.ll
9 24 bench/boost/optimized/test_codecvt.ll
1 3 bench/ceres/optimized/visibility_based_preconditioner.ll
19 58 bench/clamav/optimized/Lzma2Dec.ll
24 28 bench/clamav/optimized/disasm.ll
7 64 bench/cmake/optimized/archive_blake2s_ref.ll
33 54 bench/cmake/optimized/archive_string.ll
22 93 bench/cmake/optimized/archive_write_add_filter_xz.ll
1 5 bench/coreutils-rs/optimized/jiqj5u7teuhb0o0.ll
5 3 bench/cpython/optimized/Hacl_Hash_Blake2b.ll
56 69 bench/darktable/optimized/imageio_exr.ll
6 11 bench/diesel-rs/optimized/3w4an7crsppwo0pg.ll
14 22 bench/draco/optimized/adaptive_rans_bit_encoder.ll
7 59 bench/duckdb/optimized/aria.ll
7 32 bench/duckdb/optimized/column_writer.ll
20 14 bench/duckdb/optimized/format.ll
9 6 bench/duckdb/optimized/ub_duckdb_core_functions_string.ll
12 27 bench/duckdb/optimized/ub_duckdb_storage_compression_dictionary.ll
2 10 bench/faiss/optimized/IndexHNSW.ll
5 7 bench/ffmpeg/optimized/ffmpeg_filter.ll
19 27 bench/fish-rs/optimized/87c4l3sw5gd0mi55puarpe5kb.ll
4 8 bench/folly/optimized/IPAddressV6.ll
8 32 bench/graphviz/optimized/gvdevice.ll
27 46 bench/gromacs/optimized/gmx_disre.ll
48 68 bench/grpc/optimized/frame_handler.ll
2 7 bench/harfbuzz/optimized/harfbuzz.ll
5 18 bench/hdf5/optimized/H5B2cache.ll
6 19 bench/hdf5/optimized/H5EAcache.ll
14 27 bench/hdf5/optimized/H5SMcache.ll
27 25 bench/image-rs/optimized/5ez7udly19o3uj1p.ll
2 18 bench/image-rs/optimized/5oy2v8fghrh79s8.ll
4 21 bench/jiff-rs/optimized/74pz6gzkyt3smdu0mmln4d620.ll
1 14 bench/jq/optimized/utf32_le.ll
1 6 bench/just-rs/optimized/23nlf67cmm9na4ci.ll
16 21 bench/libcxx/optimized/locale.ll
11 25 bench/libquic/optimized/cpu-intel.ll
5 16 bench/libquic/optimized/modp_b64.ll
5 15 bench/libsodium/optimized/aegis256_soft.ll
33 59 bench/libwebp/optimized/muxread.ll
14 38 bench/libwebp/optimized/vp8l_enc.ll
8 18 bench/linux/optimized/intel_guc_submission.ll
7 11 bench/linux/optimized/libata-core.ll
3 5 bench/llvm/optimized/StableFunctionMapRecord.ll
2 6 bench/logos-rs/optimized/3lrtayubazmm8yhl.ll
18 49 bench/meilisearch-rs/optimized/1wnbkg3u8l6dyln4.ll
14 19 bench/meshoptimizer/optimized/vertexcodec.ll
6 21 bench/mimalloc/optimized/random.ll
7 10 bench/minetest/optimized/guiPathSelectMenu.ll
9 11 bench/minetest/optimized/map.ll
18 26 bench/minetest/optimized/server.ll
13 55 bench/mitsuba3/optimized/bitmap.ll
6 35 bench/nix/optimized/archive.ll
9 11 bench/nix/optimized/ssh-store.ll
68 98 bench/node/optimized/simdutf.ll
15 17 bench/ockam-rs/optimized/1znr2e86bp785yod.ll
8 12 bench/ockam-rs/optimized/24riastqfxe8dcf.ll
8 12 bench/ockam-rs/optimized/4r08vyqwrxt6fmz0.ll
2 7 bench/ockam-rs/optimized/v91rpx6k3uxsm6j.ll
4 9 bench/oiio/optimized/imageio.ll
25 27 bench/oniguruma/optimized/utf16_le.ll
4 9 bench/opencv/optimized/colored_tsdf.ll
5 7 bench/opencv/optimized/inpainting.ll
4 11 bench/opencv/optimized/local_optimization.ll
8 5 bench/opencv/optimized/stereo_calib.ll
2 5 bench/openexr/optimized/ImathRandom.ll
7 21 bench/openexr/optimized/ImfFloatAttribute.ll
6 20 bench/openexr/optimized/ImfIntAttribute.ll
1 5 bench/openjdk/optimized/c1_GraphBuilder.ll
8 7 bench/openspiel/optimized/chess_board.ll
15 21 bench/openspiel/optimized/coop_box_pushing.ll
9 35 bench/openssl/optimized/cbc_cksm.ll
3 30 bench/openssl/optimized/ecb3_enc.ll
4 31 bench/openssl/optimized/ecb_enc.ll
6 31 bench/openssl/optimized/rc2_ecb.ll
7 20 bench/openssl/optimized/scrypt.ll
18 68 bench/openusd/optimized/openexr-c.ll
14 22 bench/pbrt-v4/optimized/aggregates.ll
70 96 bench/php/optimized/md5.ll
24 50 bench/pingora-rs/optimized/031lstpg0hmrazohafgtmu7kw.ll
4 5 bench/pingora-rs/optimized/bvwglp2tpp41rgrf36efmuws6.ll
5 30 bench/pola-rs/optimized/6760m760cqx6foczuj2rwbpgm.ll
6 11 bench/portaudio/optimized/pa_converters.ll
70 72 bench/postgres/optimized/bufmgr.ll
50 47 bench/pyo3-rs/optimized/249pdmmr5286g8h9.ll
4 24 bench/quinn-rs/optimized/6qfe07ylgilehcb66xdd2yk1r.ll
11 31 bench/rand-rs/optimized/qpqwmytuo9t2y51.ll
18 45 bench/redis/optimized/hyperloglog.ll
1 6 bench/regex-rs/optimized/10eccrragw6uslmk.ll
12 17 bench/regex-rs/optimized/3ixfkxlmcuecmmus.ll
3 7 bench/regex-rs/optimized/6c2onrqlphpgxx0.ll
7 15 bench/ripgrep-rs/optimized/13xy8s63iso2zwnz.ll
20 24 bench/ruby/optimized/utf8_mac.ll
2 12 bench/ruff-rs/optimized/3dfok8d8aknyc1byq695kiju1.ll
13 17 bench/ruff-rs/optimized/5w459yx4cbffhf0y4cegeyhh9.ll
15 17 bench/ruff-rs/optimized/axdrkm8qt3gi3305yq5vb2v4v.ll
3 7 bench/ruff-rs/optimized/dq0qakgq321c81xaqsh8asz0x.ll
14 19 bench/rust-analyzer-rs/optimized/2n800w7wl0k2x7go.ll
5 15 bench/rust-analyzer-rs/optimized/5cuaio8coq8lvmol.ll
7 12 bench/rustfmt-rs/optimized/1mznjg1e09hdetpr.ll
18 51 bench/rustfmt-rs/optimized/4gk399kploc9gcsb.ll
4 8 bench/sdl/optimized/SDL_hidapi_lg4ff.ll
16 26 bench/sdl/optimized/SDL_hidapi_stadia.ll
4 8 bench/sentencepiece/optimized/unigram_model_trainer.ll
5 28 bench/stb/optimized/stb_dxt.ll
7 11 bench/tokio-rs/optimized/57t0n8x1l283uqlx.ll
7 11 bench/tokio-rs/optimized/5fqt3exrqd05oqq2.ll
5 19 bench/typst-rs/optimized/2d3c2n5y91mtl0x0.ll
1 9 bench/typst-rs/optimized/3z60jkym58xbhjyi.ll
21 41 bench/typst-rs/optimized/49m3cs7hus53ztof.ll
6 23 bench/typst-rs/optimized/5z4no3nnr5v1s13.ll
2 4 bench/uv-rs/optimized/4hbm2zbqtyqu4boi9nw4e95w8.ll
16 20 bench/uv-rs/optimized/7ti6afa5yiixvf7gep1srhf0f.ll
5 15 bench/uv-rs/optimized/bi4c58bghet8qnxsc146d76yy.ll
2 11 bench/uv-rs/optimized/bo73jiung7bwg1q3wyjtibak2.ll
10 23 bench/wasmi-rs/optimized/0ub1jde20vya53xoib9dbvqz5.ll
17 19 bench/wasmi-rs/optimized/6o6vznnhlvcq953pdqfx9sak5.ll
25 29 bench/wasmi-rs/optimized/brtjllbaryvnnezonl24vb7vd.ll
4 8 bench/wasmtime-rs/optimized/1r2x5absurxbrq18.ll
7 6 bench/wasmtime-rs/optimized/4ab4rlryc5h7bf6z.ll
18 21 bench/wasmtime-rs/optimized/wtp2wi3bcje8i2h.ll
35 48 bench/wireshark/optimized/dot11decrypt_wep.ll
6 10 bench/wireshark/optimized/packet-someip.ll
14 56 bench/wireshark/optimized/packet-zbee-security.ll
5 6 bench/wireshark/optimized/qcustomplot.ll
11 15 bench/wolfssl/optimized/internal.ll
7 62 bench/wolfssl/optimized/poly1305.ll
3 4 bench/zed-rs/optimized/e0nyk03b5twszr55stktey742.ll
1 8 bench/zed-rs/optimized/e8p2cuwt1sxb20ryu42v8urkr.ll
2 14 bench/zed-rs/optimized/ecdic6bd9l1pqf3dw7u7642wb.ll
1 4 bench/zstd/optimized/zstd_v02.ll
18 20 bench/zxing/optimized/zueci.ll
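
The per-file list above resembles `git diff --numstat` output: two counts followed by a path. Assuming the first number is added lines and the second is deleted lines (the report does not say so explicitly), the counts can be totalled per benchmark project with a sketch like this one; the function names are hypothetical:

```python
from collections import defaultdict


def parse_numstat_line(line):
    """Split "<added> <deleted> <path>" into (added, deleted, path)."""
    added, deleted, path = line.split(maxsplit=2)
    return int(added), int(deleted), path.strip()


def totals_by_project(lines):
    """Sum (added, deleted) per project, i.e. the path component after "bench/".

    Assumes every path looks like "bench/<project>/optimized/<file>.ll".
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        added, deleted, path = parse_numstat_line(line)
        project = path.split("/")[1]
        totals[project][0] += added
        totals[project][1] += deleted
    return {project: tuple(counts) for project, counts in totals.items()}


if __name__ == "__main__":
    openssl_lines = [
        "9 35 bench/openssl/optimized/cbc_cksm.ll",
        "3 30 bench/openssl/optimized/ecb3_enc.ll",
        "4 31 bench/openssl/optimized/ecb_enc.ll",
        "6 31 bench/openssl/optimized/rc2_ecb.ll",
        "7 20 bench/openssl/optimized/scrypt.ll",
    ]
    print(totals_by_project(openssl_lines))
```

The list covers far fewer files than the 942 reported in the header, so totals computed this way will not match the overall insertion and deletion counts.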

Contributor

github-actions bot commented Jul 8, 2025

The patch contains several LLVM IR optimizations that simplify and improve memory access patterns by replacing sequences of truncation and shifting operations with more efficient, direct stores. Below is a high-level summary of the 5 most significant changes across the files:

  1. Merging Split i32 Stores into Single i64 Stores

    • In multiple functions (e.g., Exp_Truth6.exit, Lzma2Decode, archive_compressor_xz_close, etc.), two separate 32-bit truncations and stores are replaced with a single 64-bit store.
    • Example: Instead of truncating and storing low/high parts via trunc and lshr, the code now directly uses store i64.
    • Impact: Reduces instruction count, simplifies generated code, improves data handling efficiency.
  2. Replacing Truncate + Shift Sequences with Bitwise Operations

    • Several places where values were split using trunc and lshr to write individual bytes or words are now using bitwise and and shifts on the full value before storage.
    • Example: Replacing trunc i32 %x to i8 followed by shifted truncations with and i32 %x, 65535 and similar patterns.
    • Impact: Reduces number of instructions and makes use of wider register operations; likely improves performance and readability.
  3. Optimized Alloca Alignment Based on Type Size

    • Adjustments made to alignment attributes for alloca instructions (e.g., from align 1 or align 4 to align 8 or align 16) to better match the size and usage of the allocated types.
    • Example: %2 = alloca [8 x i8] changes from align 1 to align 4, and in other cases, alignment increased to match pointer or i64 requirements.
    • Impact: Improves memory access performance through correct alignment, reducing potential penalties from unaligned accesses.
  4. Use of Wider Store and Load Instructions

    • Multiple 8- or 32-bit stores are replaced with 64-bit or 16-bit stores when possible, especially for struct fields and internal buffers.
    • Example: A pair of narrower stores (such as two i32 stores of the halves of one value) is removed in favor of one store i64.
    • Impact: More compact and faster memory writes, fewer load/store operations overall.
  5. Reduction in Phi and Label Complexity After Control Flow Edits

    • Some blocks show simplified control flow after label reordering and phi node reduction.
    • For example, in several loops, the number of phi nodes and branches has been reduced by combining logic or using bit manipulation earlier.
    • Impact: Slightly improved CFG structure and possibly better branch prediction and optimization opportunities downstream.
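
The store merge described in item 1 can be checked byte for byte. The sketch below models it in Python rather than LLVM IR, assuming a little-endian target and two adjacent i32 stores at offsets 0 and 4; the function names are hypothetical, and the summary itself does not confirm that every changed site has exactly this shape:

```python
import struct

MASK32 = 0xFFFFFFFF


def store_split_i32(value):
    """Model of the split form: two i32 stores of the low and high halves."""
    buf = bytearray(8)
    low = value & MASK32            # trunc i64 %v to i32
    high = (value >> 32) & MASK32   # lshr i64 %v, 32, then trunc to i32
    struct.pack_into("<I", buf, 0, low)    # store i32 at offset 0
    struct.pack_into("<I", buf, 4, high)   # store i32 at offset 4
    return bytes(buf)


def store_i64(value):
    """Model of the merged form: one i64 store to the same address."""
    buf = bytearray(8)
    struct.pack_into("<Q", buf, 0, value)
    return bytes(buf)


if __name__ == "__main__":
    for v in (0, 1, 0x0123456789ABCDEF, 0xFFFFFFFFFFFFFFFF):
        # Both forms leave the same eight bytes in memory.
        assert store_split_i32(v) == store_i64(v)
    print("split and merged stores write identical bytes")
```

On a big-endian layout the two halves would have to be stored in the opposite order for the merge to be valid, which is why the byte order is stated as an assumption here.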

Summary Word Count: ~170 words
Focus: These changes primarily reduce redundant trunc/shr/load/store sequences, improve alignment, and simplify control flow structures. They reflect backend lowering improvements or mid-end transformations that make better use of native register width and memory operations.

model: qwen-plus-latest
CompletionUsage(completion_tokens=663, prompt_tokens=115548, total_tokens=116211, completion_tokens_details=None, prompt_tokens_details=None)

br i1 %exitcond.not, label %47, label %31, !llvm.loop !26

47: ; preds = %31
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 16 dereferenceable(32) %4, ptr noundef nonnull align 4 dereferenceable(32) %0, i64 32, i1 false)
Owner


Improvement.

@@ -922,9 +920,7 @@ entry:
%rect.sroa.4.0.DesiredRect.sroa_idx = getelementptr inbounds nuw i8, ptr %this, i64 100
store i32 0, ptr %rect.sroa.4.0.DesiredRect.sroa_idx, align 4, !tbaa !54
Owner


Missing combine.

%.sroa.23.0.insert.shift.i157 = shl nuw i32 %.sroa.23.0.insert.ext.i156, 16
%.sroa.15.0.insert.ext.i158 = zext i16 %49 to i32
%.sroa.15.0.insert.insert.i160 = or disjoint i32 %.sroa.23.0.insert.shift.i157, %.sroa.15.0.insert.ext.i158
store i32 %.sroa.15.0.insert.insert.i160, ptr %16, align 2, !alias.scope !7892
Owner


@dtcxzyw dtcxzyw closed this Aug 2, 2025
@dtcxzyw dtcxzyw deleted the test-run16146492295 branch August 2, 2025 06:40

3 participants