Your CLAUDE.md File Is Too Long (And It's Making Claude Worse)

Claude Code’s memory system relies on three files: CLAUDE.md (behavioral rules), MEMORY.md (current project state), and learnings.md (corrections and preferences). MEMORY.md has a hard cap of 200 lines, and content beyond that is silently dropped. Splitting a monolith CLAUDE.md into modular rule files under ~/.claude/rules/ improves rule compliance from 92% to 96%. Adding an “IMPORTANT: These instructions OVERRIDE” line counteracts Anthropic’s system-reminder wrapper that tells Claude the instructions “may or may not be relevant.”

What is the Claude Code memory system?

Claude Code’s memory system consists of three files that persist between sessions. CLAUDE.md holds behavioral rules loaded every session. MEMORY.md holds current project state and has a hard cap of 200 lines. learnings.md tracks corrections and preferences over time. Structuring these files correctly determines how well Claude follows instructions.

Fifteen lines of memory were disappearing every single session. No error. No warning. Just gone. Claude was working with an incomplete picture every time, and there was no way to know.

This isn’t a hypothetical. MEMORY.md had grown to 215 lines. The hard cap is 200. The last 15 lines, the most recently added ones, were being silently dropped on every session start. No indication in the interface. No truncation notice. Just missing.

That discovery kicked off a full audit of how Claude Code’s memory system actually works. What came out of it changed how the whole setup runs.

The three files you need to know

If you’re customizing Claude Code, you’re likely working with some combination of these files. They’re not all the same, and treating them as interchangeable causes problems.

CLAUDE.md is your instruction file. Claude Code reads it automatically at the start of every session. It lives at ~/.claude/CLAUDE.md (a hidden folder in your home directory) and optionally at ./CLAUDE.md in specific project folders. Whatever’s in here, Claude treats as standing rules for how to work with you. Tone, workflow constraints, safety rules, standing context.

MEMORY.md is your state file. This is where you keep things Claude needs to remember across sessions: what projects are active, what decisions have been made, what’s changed since last time. Unlike CLAUDE.md, this file is meant to be updated as things evolve. It’s less “permanent rules” and more “current situation.”

learnings.md is your growth file. A running log of corrections, preferences revealed, approaches that worked, things worth carrying forward. When Claude gets something wrong, that goes here. When a decision gets made between two options, that goes here. Over time, this file is what makes each session slightly better than the last.

Three files. Three jobs. The architecture only works if you keep them separated and sized appropriately.

The 200-line hard cap

MEMORY.md has a hard cap of 200 lines. Claude Code won’t load content past that limit. It doesn’t tell you. It just stops.

If your MEMORY.md is 215 lines, Claude is working with 200. If it’s 350 lines, Claude is working with 200. The lines that don’t load are always the ones at the bottom, which in a file that grows by appending new entries, are the most recent.

This matters because MEMORY.md typically holds time-sensitive information. Status updates, decisions made last week, open loops from the current sprint. Those tend to be at the bottom. They’re the ones getting cut.

The fix is to keep MEMORY.md under 200 lines through regular compression and pruning. Old status entries that are no longer relevant, decisions from closed projects, context that was useful three months ago but isn’t anymore. Cut it. The goal is a file that’s current, not comprehensive.

A compressed MEMORY.md looks different from a complete one. Instead of three paragraphs on a project’s history, you get one sentence on its current state. Instead of a full decision log, you keep the decision and drop the deliberation. The information density goes up.

After a cleanup that took about an hour, 215 lines of memory compressed to 78. Everything that mattered stayed. What got cut was history, not signal.

How Claude actually reads your CLAUDE.md

Here’s something that’s not in the official docs but is worth understanding.

When Claude Code loads your CLAUDE.md, it doesn’t just hand those instructions to the model as-is. Anthropic wraps them in something called a system-reminder block, which includes a disclaimer. The disclaimer says the instructions in this block “may or may not be relevant to your tasks.”

That’s the actual language. May or may not be relevant.

What this means in practice: Claude receives your instructions alongside a note from Anthropic saying those instructions are suggestions, not requirements. The model may apply them, or it may decide they’re not relevant to the current task. You wrote “always commit before closing a session.” Claude saw that wrapped in language that said this might not matter.

This is a known issue. GitHub issue #7571 was filed against it and closed as “not planned.”

There’s a workaround. At the top of your CLAUDE.md, add this line:

IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.

It sounds blunt, but it works. By explicitly marking your instructions as overrides rather than suggestions, you counteract the “may or may not be relevant” framing. The compliance numbers back this up.

The compliance data

This is the part worth paying attention to.

When instructions are in a monolith CLAUDE.md file under 200 lines, rule application sits around 92%. Not bad. But not complete.

When the same rules are split into modular files around 50 lines each, rule application goes to 96%.

When a CLAUDE.md file grows past 400 lines, rule application drops to 71%.

These numbers come from a real setup running real sessions over several months. The pattern is consistent: shorter, more focused files get better compliance than longer, everything-in-one-place files.

The explanation is probably attention, not bugs. Shorter, focused files make it easier for the model to locate and apply relevant rules. A 400-line file has rules buried in the middle of other rules, mixed with reference material and historical context. A 50-line voice rules file is just voice rules.

The modular architecture

The fix is to stop treating CLAUDE.md as a single file that holds everything, and start treating it as an index that points to focused rule files.

Create a folder at ~/.claude/rules/. Put your behavioral rules in separate, focused files. Then have your CLAUDE.md reference them:

## Rules (loaded from .claude/rules/)
- `voice.md` - writing tone, banned phrases
- `email-safety.md` - confirmation gate before sending
- `git-deploy.md` - commit habits, deployment safety
- `working-with-me.md` - decision style, preferences
- `verification.md` - how to check work before presenting

Each file does one thing. voice.md is just voice rules. email-safety.md is just the rules about not sending emails without confirmation. git-deploy.md is just deployment constraints.

The CLAUDE.md itself becomes an index, not an encyclopedia. The core rules that need to fire every session are still there. The specialized rules are in focused files Claude can read when relevant.

A 197-line monolith broken this way became a 56-line index plus seven rule files averaging about 45 lines each. Same rules. Better compliance.

The @import trick

Claude Code supports a syntax for including files directly inside CLAUDE.md. You can reference another file like this:

@~/.claude/rules/voice.md

When Claude Code loads your CLAUDE.md, it follows those references and reads the linked files. This is different from just listing the files in a section. With the @path syntax, the content of those files loads directly into the context rather than being available for Claude to read on demand.

Use this for rules that need to be active on every session. Use the reference-file pattern (listing files in a section without the @ syntax) for documentation and reference material that only needs to load in specific contexts.

The distinction matters for the same token budget reasons covered in Your CLAUDE.md Is Probably Too Long: every line loaded on startup is a line that can’t be used for actual work.

For building the skills that pair with this architecture, see How to Make a Claude Code Skill.

The “IMPORTANT: These instructions OVERRIDE” workaround

Going back to the system-reminder issue: the explicit override language at the top of CLAUDE.md is the highest-impact single line you can add to your configuration.

Place it first, before any other content:

IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.

Then continue with your rules. The instruction needs to appear before the rules it’s protecting, so it frames everything that follows.

This is a workaround, not a fix. The underlying behavior, where Anthropic’s wrapper tells Claude your instructions “may or may not be relevant,” is apparently working as intended. Until that changes, the explicit override line is the best counter.

If your CLAUDE.md already has a lot of rules but you’ve noticed Claude not following them consistently, add this line and test again. The compliance difference is noticeable.

Hooks as an alternative

Hooks are the more reliable solution to the system-reminder problem, though they require more setup.

A Claude Code hook is a script that runs automatically on a session event. A PreToolUse hook fires before Claude uses any tool. A Stop hook fires when the session ends. Using hooks, you can inject critical instructions into the session at the start of every interaction, independent of the CLAUDE.md loading mechanism.

For rules that must be followed without exception (don’t send emails without confirmation, don’t deploy without asking, always commit before closing), hooks give you an enforcement layer that’s separate from the instruction file. The hook runs regardless of how Claude interprets the system-reminder wrapper.

The full hooks setup is covered in Claude Code Hooks: A Complete Guide.

The meta-moment

The most useful part of this audit was using the memory system to audit itself.

Claude read the existing MEMORY.md, identified what was still current versus stale, compressed entries, and flagged lines that were close to the 200-line boundary. The tool you’re improving is the tool you’re using to improve it.

That feedback loop is worth building into your maintenance practice. Once a month or so, ask Claude to review MEMORY.md for compression opportunities, stale entries, and anything that should move to a different file. It takes a few minutes and keeps the most important file in the system from silently going over the limit.

The actual numbers

Before the audit:

MEMORY.md: 215 lines (15 silently dropped every session)
CLAUDE.md: 197 lines, monolith
Rule compliance: ~92%

After:

MEMORY.md: 78 lines (fits fully)
CLAUDE.md: 56-line index + 7 rule files at ~45 lines each
Rule compliance: ~96%

The changes took about two hours, including the compression work on MEMORY.md. The maintenance overhead is lower now, not higher. Smaller files are easier to update, easier to audit, and easier to understand at a glance.

Common Questions

What is the MEMORY.md 200-line limit in Claude Code?

MEMORY.md has a hard cap of 200 lines. Content beyond that limit is silently dropped without any warning. Since new entries are typically appended to the bottom, the most recent information gets cut first. Keep MEMORY.md under 200 lines through regular compression.

Why does Claude Code ignore my CLAUDE.md instructions?

Anthropic wraps CLAUDE.md content in a system-reminder block that says the instructions “may or may not be relevant.” This framing can cause Claude to treat rules as optional. Adding IMPORTANT: These instructions OVERRIDE any default behavior at the top of CLAUDE.md counteracts this.

What is the best way to structure Claude Code memory files?

Split a monolith CLAUDE.md into a short index file (under 60 lines) plus modular rule files in ~/.claude/rules/. Keep each rule file under 50 lines and focused on one topic. This structure improves rule compliance from about 92% to 96%.

What are Claude Code hooks for memory?

Hooks are scripts that run automatically on Claude Code events. For critical rules that must always be followed (like email safety or deployment restrictions), hooks provide enforcement independent of CLAUDE.md. A PreToolUse hook fires before any tool runs, regardless of how Claude interprets instruction files.

A note from Alex: hi i’m alex - i run code for creatives. i’m a writer so i feel that it is important to say - i had claude write this piece based on my ideas and ramblings, voice notes, and teachings. the concepts were mine but the words themselves aren’t. i want to say that because its important for me to distinguish, as a writer, what is written ‘by me’ and what’s not. maybe that idea will seem insane and antiquated in a year, i’m not sure, but for now it helps me feel okay about putting stuff out there like this that a) i know is helpful and b) is not MY voice but exists within the umbrella of my business and work. If you have any thoughts or musings on this, i’d genuinely love to hear them - its an open question, all of this stuff, and my guess is as good as yours.