edit tool adds UTF-8 BOM (EF BB BF) to files on Windows #3389

@sandynz

Description

Describe the bug

On Windows, Copilot CLI's edit tool prepends a UTF-8 Byte Order Mark (\xef\xbb\xbf)
to the output file after a modification, even when the original file had no BOM.

Affected version

v1.0.49

Steps to reproduce the behavior

  1. Create a UTF-8 file without BOM (e.g., via PowerShell with explicit UTF-8 no-BOM write)
  2. Use the edit tool to replace any old_str with a new_str
  3. Read the output file's first 3 bytes:
    $b = [IO.File]::ReadAllBytes("test.md")
    "{0:X2} {1:X2} {2:X2}" -f $b[0], $b[1], $b[2]
    # Expected: first 3 bytes of actual content (e.g., "23 20 74" for "# t")
    # Actual:   EF BB BF  ← UTF-8 BOM
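
For anyone reproducing this without PowerShell, the same first-bytes check can be sketched in Python. The helper names `has_utf8_bom` and `first_bytes_hex` and the sample file content are illustrative assumptions, not part of the CLI or of the original report.

```python
import codecs
import os
import tempfile

def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8

def first_bytes_hex(path, n=3):
    """Format the first n bytes as uppercase hex, like the PowerShell check."""
    with open(path, "rb") as f:
        return " ".join(f"{b:02X}" for b in f.read(n))

# Step 1 equivalent: write a BOM-free UTF-8 file with CRLF line endings.
# The content here is made up for the demo.
demo_path = os.path.join(tempfile.mkdtemp(), "test.md")
with open(demo_path, "wb") as f:
    f.write("# title\r\nbody\r\n".encode("utf-8"))

print(first_bytes_hex(demo_path), has_utf8_bom(demo_path))  # 23 20 74 False
```

Running the two helpers again after the edit tool has touched the file would show `EF BB BF True` if the bug is present.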

Expected behavior

The edit succeeds and the output file stays BOM-free: no EF BB BF prefix is added when the original file had none.

Additional context

Evidence (v1.0.49, Windows 11 / PowerShell 7, 2026-05-19)

Byte-level measurement before and after edit tool call:

Before edit: CRLF=3  LF-only=0  bytes=35  BOM=false
After edit:  CRLF=3  LF-only=0  bytes=42  BOM=true
Hex header:  EF BB BF 6C 69 6E 65 20 6F 6E 65 0D 0A ...

The edit tool correctly normalizes line endings for old_str matching (LF old_str
matched CRLF file content successfully), but the output file gains a BOM and retains
CRLF — so the normalization only applies to matching, not to the written output.

Impact

  • Python open(encoding='utf-8') reads the BOM as the \ufeff character prepended to
    the file content, breaking JSON parsers, YAML parsers, and string-prefix matching.
  • git diff displays BOM bytes as binary noise, cluttering diffs.
  • .gitattributes eol=lf does not strip BOM — git treats it as opaque file content.
    Without a dedicated pre-commit hook, BOM is committed to the repository permanently.
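
The first impact item can be demonstrated in a few lines. The JSON payload below is a made-up example; the behavior shown is that of Python's standard `utf-8` and `utf-8-sig` codecs and the `json` module.

```python
import codecs
import json

# Bytes as the edit tool would leave them: BOM followed by the real content.
raw = codecs.BOM_UTF8 + b'{"key": "value"}'

# Plain utf-8 keeps the BOM as a leading U+FEFF character.
text = raw.decode("utf-8")
print(repr(text[0]))          # '\ufeff'
print(text.startswith("{"))   # False

try:
    json.loads(text)
except json.JSONDecodeError as exc:
    print("json.loads failed:", exc)

# utf-8-sig strips the BOM, so parsing succeeds.
clean = raw.decode("utf-8-sig")
print(json.loads(clean))      # {'key': 'value'}
```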

Requested change

Use new UTF8Encoding(encoderShouldEmitUTF8Identifier: false) (or equivalent) when
writing files on Windows, so that the output is BOM-free UTF-8 — the format expected by
virtually all modern tooling and consistent with behavior on macOS/Linux.
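
As a language-neutral illustration of the requested behavior, here is a minimal Python sketch of an edit that never introduces a BOM. It is not the CLI's implementation: `apply_edit` is a hypothetical function, it skips the LF/CRLF matching normalization described earlier, and it takes the slightly more conservative route of keeping a BOM only when the original bytes already had one.

```python
import codecs

def apply_edit(data: bytes, old_str: str, new_str: str) -> bytes:
    """Replace the first old_str with new_str without adding a BOM."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if had_bom else data
    text = body.decode("utf-8")
    if old_str not in text:
        raise ValueError("old_str not found")
    # Python's "utf-8" codec never emits a BOM on encode.
    edited = text.replace(old_str, new_str, 1).encode("utf-8")
    return codecs.BOM_UTF8 + edited if had_bom else edited

original = b"line one\r\nline two\r\n"
edited = apply_edit(original, "one", "1")
print(edited.startswith(codecs.BOM_UTF8))  # False
print(edited)                              # b'line 1\r\nline two\r\n'
```

Working on bytes end to end also leaves the CRLF line endings exactly as they were in the input.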

Workaround (current)

After using the edit tool, re-write the file with Python to strip BOM:

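path = "test.md"  # the file just modified by the edit tool
# utf-8-sig drops a leading BOM if present; text mode reads CRLF as \n
with open(path, encoding='utf-8-sig') as f:
    content = f.read()
# Write BOM-free UTF-8; newline='\n' means the output uses LF line endings
with open(path, 'w', encoding='utf-8', newline='\n') as f:
    f.write(content)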
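
Because the workaround above goes through text mode with newline='\n', it also rewrites CRLF as LF. Where the original line endings should be kept, a byte-level variant might look like the sketch below. `strip_bom_in_place` and the demo file are hypothetical names chosen for illustration.

```python
import codecs
import os
import tempfile

def strip_bom_in_place(path) -> bool:
    """Remove a leading UTF-8 BOM, leaving all other bytes untouched.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True

# Demo on a temporary file that mimics the post-edit state.
demo = os.path.join(tempfile.mkdtemp(), "test.md")
with open(demo, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"line one\r\nline two\r\n")

changed = strip_bom_in_place(demo)
with open(demo, "rb") as f:
    result = f.read()
print(changed, result)  # True b'line one\r\nline two\r\n'
```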

Labels

  • area:platform-windows (Windows-specific: PowerShell, cmd, Git Bash, WSL, Windows Terminal)
  • area:tools (Built-in tools: file editing, shell, search, LSP, git, and tool call behavior)
