Skip to content

fix(syscall,vfs): fix SIG derivative when writing partial readable buffer#1375

Merged
fslongjin merged 7 commits intoDragonOS-Community:masterfrom
mistcoversmyeyes:bugfix/writev
Nov 21, 2025
Merged

fix(syscall,vfs): fix SIG derivative when writing partial readable buffer#1375
fslongjin merged 7 commits intoDragonOS-Community:masterfrom
mistcoversmyeyes:bugfix/writev

Conversation

@mistcoversmyeyes
Copy link
Contributor

@mistcoversmyeyes mistcoversmyeyes commented Nov 14, 2025

  • 引入 user_accessible_len() 以测量从给定地址 address 开始,能够被访问的最长连续字节长度(注:使用vma进行校验)
  • 使 IoVecs::gather 返回 Result 并仅聚合可以被读取的 buf 部分(注意,一旦碰到不可访问的 iov,后面的iov都会被抛弃)
  • 在 writev/pwritev 中传播新的 Result 以支持 gVisor 下的部分写入

Done

  • 理解测试用例在测试什么
  • 解决 segv 导致进程直接被kill 测试终止的问题(修复方案:在将用户地址空间的内存区域拷贝到内核中的时候,做合法性校验,检查用户是否对传入的空间具有读权限。
  • 修复返回值与预期不一致的问题
  • 使用 vma 做用户传入内存区域有效性校验

问题描述

实现系统调用 sys_pwritev 后,再次使用 gvisor 测试时 ,测试用例 WriteTest.PartialWriteSIGSEGV 会出现 访问地址 0x95a000 错误码为 0b0(表示在内核中读取一个不存在的页面) 的 pagefault , 该 pagefault 会导致内核直接 杀死请求系统调用的当前用户进程。

测试用例逻辑分析

测试当一个 iov 横跨 PROT_READ(第一页)和 PROT_NONE的页(第二页)的时候,pwritev能否正确的将前半部分的内容写到文件中,而后半部分的内容由于会触发 segv 不写入(但是神奇的是,这个 segv 应当不会导致 pwritev 调用失败,也不应该终止当前进程)。

怀疑原因

  • 在测试用例中执行写入 第二个 iov 中的 第二页(PROT_NONE的页) 的时候会在 内核态访问用户非法内存地址 的时候触发 segv,而这个 segv 会直接导致进程被kill 进而终止测试。

参考

测试 log

对应 Gtest 测试用例

// Test that partial writes that hit SIGSEGV are correctly handled and return
// partial write.
TEST_F(WriteTest, PartialWriteSIGSEGV) {
  // Allocate 2 pages and remove permission from the second.
  const size_t size = 2 * kPageSize;
  void* addr = mmap(0, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
  ASSERT_NE(addr, MAP_FAILED);
  auto cleanup = Cleanup(
      [addr, size] { EXPECT_THAT(munmap(addr, size), SyscallSucceeds()); });

  void* badAddr = reinterpret_cast<char*>(addr) + kPageSize;
  ASSERT_THAT(mprotect(badAddr, kPageSize, PROT_NONE), SyscallSucceeds());

  TempPath file = ASSERT_NO_ERRNO_AND_VALUE(TempPath::CreateFile());
  FileDescriptor fd =
      ASSERT_NO_ERRNO_AND_VALUE(Open(file.path().c_str(), O_WRONLY));

  // Attempt to write both pages to the file. Create a non-contiguous iovec pair
  // to ensure operation is done in 2 steps.
  struct iovec iov[] = {
      {
          .iov_base = addr,
          .iov_len = kPageSize,
      },
      {
          .iov_base = addr,
          .iov_len = size,
      },
  };
  // Write should succeed for the first iovec and half of the second (=2 pages).
  EXPECT_THAT(pwritev(fd.get(), iov, ABSL_ARRAYSIZE(iov), 0),
              SyscallSucceedsWithValue(2 * kPageSize));
}

@github-actions github-actions bot added ambiguous The title of PR/issue doesn't match the format Bug fix A bug is fixed in this pull request test Unitest/User space test labels Nov 14, 2025
@mistcoversmyeyes mistcoversmyeyes force-pushed the bugfix/writev branch 5 times, most recently from a91605c to b5b3f24 Compare November 17, 2025 13:38
@mistcoversmyeyes mistcoversmyeyes marked this pull request as ready for review November 17, 2025 13:38
Copilot AI review requested due to automatic review settings November 17, 2025 13:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a kernel panic in the pwritev system call when it encounters partially accessible memory regions (e.g., when an iovec spans across PROT_READ and PROT_NONE pages). The fix introduces VMA-based validation to check user memory accessibility before copying data, allowing partial writes to succeed instead of crashing the process.

Key changes:

  • Added user_accessible_len() function to validate user memory accessibility using VMA permissions
  • Modified gather() to return Result<Vec<u8>, SystemError> and handle partial reads gracefully
  • Updated syscall handlers to propagate errors from gather()

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 13 comments.

File Description
kernel/src/syscall/user_access.rs Adds user_accessible_len() function to compute contiguous accessible memory length using VMA validation
kernel/src/filesystem/vfs/iov.rs Modifies gather() to check memory accessibility and support partial reads, returning errors appropriately
kernel/src/filesystem/vfs/syscall/sys_writev.rs Updates to handle the new error-returning signature of gather()
kernel/src/filesystem/vfs/syscall/sys_pwritev.rs Updates to handle the new error-returning signature of gather()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mistcoversmyeyes mistcoversmyeyes force-pushed the bugfix/writev branch 4 times, most recently from 189fa17 to 666921b Compare November 19, 2025 15:19
@mistcoversmyeyes mistcoversmyeyes changed the title fix(mm): fix writetests WriteTest.PartialWriteSIGSEGV fix(mm): fix SIG derivative when writing partial readable buffer Nov 20, 2025
@mistcoversmyeyes mistcoversmyeyes changed the title fix(mm): fix SIG derivative when writing partial readable buffer fix(syscall,vfs): fix SIG derivative when writing partial readable buffer Nov 20, 2025
@github-actions github-actions bot removed the Bug fix A bug is fixed in this pull request label Nov 20, 2025
- 引入 user_accessible_len() 以测量从给定地址 `address` 开始,能够被拷贝的最长连续字节长度(注:使用vma进行校验)
- 使 IoVecs::gather 返回 Result 并仅聚合可以被读取的 `buf` 部分(注意,一旦碰到不可访问的 iov,后面的iov都会被抛弃)
- 在 writev/pwritev 中传播新的 Result 以支持 gVisor 下的部分写入
- 添加了对于 IoVecs 发起的分散写入的处理方式目前处理方式的注释。
- 添加了实现分散写入依赖于文件系统对于 IoVecs 写入支持的注释。
@fslongjin fslongjin merged commit 3f18af7 into DragonOS-Community:master Nov 21, 2025
10 of 11 checks passed
@mistcoversmyeyes mistcoversmyeyes deleted the bugfix/writev branch December 2, 2025 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ambiguous The title of PR/issue doesn't match the format test Unitest/User space test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants