Skip to content

[WebAssembly] Define __funcref_call_table in generated asm and objects#180900

Open
QuantumSegfault wants to merge 1 commit intollvm:mainfrom
QuantumSegfault:wasm-define-funcrefcalltable
Open

[WebAssembly] Define __funcref_call_table in generated asm and objects#180900
QuantumSegfault wants to merge 1 commit intollvm:mainfrom
QuantumSegfault:wasm-define-funcrefcalltable

Conversation

@QuantumSegfault
Copy link
Copy Markdown
Contributor

Currently, calls into funcref (call addrspace(20)) are lowered into a store of the ref into a special single-element __funcref_call_table, a call_indirect into said table, then a store of null into it.

It functions just fine when -filetype=asm (default), but when emitting an object file directly from llc it fails with LLVM ERROR: undefined table symbol cannot be weak

This fixes said error by "defining" it during WebAssemblyAsmPrinter, and ensuring it is definitely emitted as weak (so duplicate tables in multiple objects are merged into a single one).

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Feb 11, 2026

@llvm/pr-subscribers-backend-webassembly

Author: Demetrius Kanios (QuantumSegfault)

Changes

Currently, calls into funcref (call addrspace(20)) are lowered into a store of the ref into a special single-element __funcref_call_table, a call_indirect into said table, then a store of null into it.

It functions just fine when -filetype=asm (default), but when emitting an object file directly from llc it fails with LLVM ERROR: undefined table symbol cannot be weak

This fixes said error by "defining" it during WebAssemblyAsmPrinter, and ensuring it is definitely emitted as weak (so duplicate tables in multiple objects are merged into a single one).


Full diff: https://github.com/llvm/llvm-project/pull/180900.diff

2 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp (+16)
  • (modified) llvm/test/CodeGen/WebAssembly/funcref-call.ll (+1)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
index 1cacdb04fa74d..afe834b077909 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
@@ -392,6 +392,22 @@ void WebAssemblyAsmPrinter::emitEndOfAsmFile(Module &M) {
   // emitDecls() is not called until now.
   emitDecls(M);
 
+  {
+    StringRef Name = "__funcref_call_table";
+    auto *Sym = static_cast<MCSymbolWasm *>(OutContext.lookupSymbol(Name));
+    if (Sym) {
+      if (!Sym->isFunctionTable())
+        OutContext.reportError(SMLoc(), "symbol is not a wasm funcref table");
+
+      if (Sym->isWeak()) {
+        OutStreamer->emitSymbolAttribute(Sym, MCSA_Weak);
+      }
+      if (!Sym->isDefined()) {
+        OutStreamer->emitLabel(Sym);
+        OutStreamer->addBlankLine();
+      }
+    }
+  }
   // When a function's address is taken, a TABLE_INDEX relocation is emitted
   // against the function symbol at the use site.  However the relocation
   // doesn't explicitly refer to the table.  In the future we may want to
diff --git a/llvm/test/CodeGen/WebAssembly/funcref-call.ll b/llvm/test/CodeGen/WebAssembly/funcref-call.ll
index 9904df2280e81..4f6025de4759c 100644
--- a/llvm/test/CodeGen/WebAssembly/funcref-call.ll
+++ b/llvm/test/CodeGen/WebAssembly/funcref-call.ll
@@ -1,6 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s --mtriple=wasm32-unknown-unknown -fast-isel=0 -mattr=+reference-types | FileCheck %s
 ; RUN: llc < %s --mtriple=wasm32-unknown-unknown -fast-isel=1 -mattr=+reference-types | FileCheck %s
+; RUN: llc < %s --mtriple=wasm32-unknown-unknown --filetype=obj
 
 %funcref = type ptr addrspace(20) ;; addrspace 20 is nonintegral
 

@QuantumSegfault
Copy link
Copy Markdown
Contributor Author

Requesting review

@dschuff

@QuantumSegfault
Copy link
Copy Markdown
Contributor Author

Follow up to the problems encountered during #162227


I have absolutely no idea if this is an appropriate fix...but it seems to work. Though this places the definition at the end of file when dumping assembly...there must be a better place for it.


Low priority I guess, since clearly nobody is using this broken feature. On the upside that means there's little resistance to changing it.

@dschuff
Copy link
Copy Markdown
Member

dschuff commented Feb 12, 2026

I think we should maybe be able to treat this like the regular function table symbol (__indirect_function_table) that's used (when the CallIndirectOverlong feature is enabled). In that case that symbol is used in the same way; each call_indirect has a relocation against that symbol. It looks like WebAssembly::getOrCreateFuncrefCallTableSymbol is just defining the table as weak on creation rather than ever leaving it undefined.

@QuantumSegfault
Copy link
Copy Markdown
Contributor Author

It looks like WebAssembly::getOrCreateFuncrefCallTableSymbol is just defining the table as weak on creation rather than ever leaving it undefined.

Well, it is marking it as weak, but that doesn't define it. It's still undefined (hence the error that "undefined symbols can't be weak" when trying to emit object files).


Not sure exactly what you mean about treating it the same as the other? When CallIndirectOverlong is enabled, call_indirect takes a table argument, so the reference to the __funcref_call_table would insert a relocation already.


The problem is right now we are creating the __funcref_call_table, but never defining it. The linker synthesizes the __indirect_function_table normally (and objects might emit tentative versions as well???). But it does not do this for __funcref_call_table. It seems the intention was to mark it weak, and just have each module define an each module define an identical empty table (since there's otherwise nothing special about it; just needs to be an empty one element funcref table). But it still wasn't ever "defined", so then it being weak caused errors.

@QuantumSegfault
Copy link
Copy Markdown
Contributor Author

Ping

@dschuff

@dschuff
Copy link
Copy Markdown
Member

dschuff commented Mar 2, 2026

@sbc100
Do you recall a particular reason that we have the linker synthesize the indirect call table rather than just making it weak? If so does that logic apply here too? Or why we synthesize it rather than linking in an object file that defines it to every program? (I guess for the last case, that's much less convenient than synthesizing it).

@sbc100
Copy link
Copy Markdown
Collaborator

sbc100 commented Mar 2, 2026

@sbc100 Do you recall a particular reason that we have the linker synthesize the indirect call table rather than just making it weak? If so does that logic apply here too? Or why we synthesize it rather than linking in an object file that defines it to every program? (I guess for the last case, that's much less convenient than synthesizing it).

I think it just made sense at the time (and still does). Since both the the linear memory and the indirection function table are synthesized by the linker, its makes sense that that symbols that point to them also be created by the linker.

In particular, no external object could define them because only the linker knows how big they need to be.

OutStreamer->emitSymbolAttribute(Sym, MCSA_Weak);
}
if (!Sym->isDefined()) {
OutStreamer->emitLabel(Sym);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems rather odd. It seems like we are emitting definition for the symbol exactly then its not defined (i.e. when no definition should be emitted).

In general this seems like the wrong place to be doing this kind of thing. It seems a little too magical. How/where is the definition of __indirect_function_table handled?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...did I never press send?


emitLabel is the very thing that marks a given symbol as "defined". I took the approach from the handling of globals in emitGlobalVariable.

https://github.com/QuantumSegfault/llvm-project/blob/80479a78a5128012ec0b09f780bea1d8710337f4/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp#L213-L222

A symbol is never defined until it has been given a Fragment to? Which emitLabel does. Since we have a free floating symbol, there's no other reason for it to be emitted, so we need to call emitLabel (which confusingly, also has an effect in object emission) somewhere. Though I agree that the position (at the end of the file) is odd. Don't know where else to stick it though. Is there a more appropriate hook?

__indirect_function_table is handled directly in lld, and is never defined in the intermediate objects. It directly creates a DefinedTable symbol in the linker. Here I'm trying to emit a weak definition of it into every object instead, as there's nothing complex about it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So you are trying to essentially convert this symbols from "weak undefined" to "weak defined" during ASM generation?

This still seems wrong. Can/should we not instead change the original definition of __indirect_function_table from weak undefined to weak defined?

Better still shouldn't the symbol be "strong" + "undefined"? i.e. the symbol is not optional, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps there's some confusion? I'm trying to fix __funcref_call_table...not __indirect_function_table.

Ever since it's inception __funcref_call_table was declared like this.

MCSymbolWasm *WebAssembly::getOrCreateFuncrefCallTableSymbol(
    MCContext &Ctx, const WebAssemblySubtarget *Subtarget) {
  StringRef Name = "__funcref_call_table";
  auto *Sym = static_cast<MCSymbolWasm *>(Ctx.lookupSymbol(Name));
  if (Sym) {
    if (!Sym->isFunctionTable())
      Ctx.reportError(SMLoc(), "symbol is not a wasm funcref table");
  } else {
    Sym = static_cast<MCSymbolWasm *>(Ctx.getOrCreateSymbol(Name));

    // Setting Weak ensure only one table is left after linking when multiple
    // modules define the table.
    Sym->setWeak(true);

    wasm::WasmLimits Limits = {0, 1, 1, 0};
    wasm::WasmTableType TableType = {wasm::ValType::FUNCREF, Limits};
    Sym->setType(wasm::WASM_SYMBOL_TYPE_TABLE);
    Sym->setTableType(TableType);
  }
  // MVP object files can't have symtab entries for tables.
  if (!(Subtarget && Subtarget->hasCallIndirectOverlong()))
    Sym->setOmitFromLinkingSection();
  return Sym;
}

If any exist, they can all be merged into the same. However when outputting -filetype=obj directly from llc, it would just error out with LLVM ERROR: undefined table symbol cannot be weak. It never worked.

I do want to set "weak" + "defined", but my approach was the only one I found. emitLabel IS what defines the MCSymbol, strangely enough. Do you know another way to define an MCSymbol?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the problem is that you have defined symbol that is not being correctly emitted in the assembly?

Wouldn't we want a solution that works for all table symbols, not just __funcref_call_table?

Also, it seem that Sym->isDefined is wrong about this table its it returning false, since this is a defined symbol, right?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer that solution actually since it matches what we do for other symbols.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libc doesn't make sense here.

Is compiler-rt linked in ALL invocations of wasm-ld? If not, then it's not a viable spot either.

So the only other option if we REALLY don't want to weak define the symbol in every relevant object is to have lld define it in a similar manner as __indirect_function_table. It just didn't seem necessary to complicate that when the definition of the table is nothing special (i.e. no non-default initialization of the table; just a simple, 1 element, ref.null initialized table).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @sbc100 @dschuff

Question still remains where to stick this? Options remain MAYBE compiler-rt assuming it's ALWAYS linked in (including for front-ends other than Clang, no sysroot builds, etc.), moving it to wasm-ld, or finding a more appropriate place to make it weak.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should depend on it being defined by the toolchain somewhere, but not defined in clang itself or in wasm-ld.

libcompiler-rt seems like the most logical place, but maybe just libc (since it recently came to my attention that compiler-rt is designed to be statically linked into each shared object, so if it was to be defined in compiler-rt it would need to be a weak def).

Copy link
Copy Markdown
Contributor Author

@QuantumSegfault QuantumSegfault Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We can't rely on libc. Because then we tie anybody wanting to call through funcref's to (wasi) libc for no good reason.

Just a small .S or .ll file in libcompiler-rt might do it (yes, defined weak).


I guess, what do you have against the PR right now? Why is weakly defined per-object such a bad thing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants