AGENTS.md - psych-merge Development Guide
Project Overview
psych-merge is a format-specific implementation of the *-merge gem family for YAML files. It provides intelligent YAML file merging using AST analysis via Ruby's standard library Psych parser.
Core Philosophy: Intelligent YAML merging that preserves structure, comments, anchors, and formatting while applying updates from templates.
Repository: https://github.com/kettle-rb/psych-merge
Current Version: 1.0.0
Required Ruby: >= 3.2.0 (currently developed against Ruby 4.0.1)
Architecture: Format-Specific Implementation
What psych-merge Provides
- Psych::Merge::SmartMerger - YAML-specific SmartMerger implementation
- Psych::Merge::FileAnalysis - YAML file analysis with mapping/sequence extraction
- Psych::Merge::NodeWrapper - Wrapper for Psych AST nodes (mappings, sequences, scalars)
- Psych::Merge::MappingEntry - Key-value pair representation
- Psych::Merge::MergeResult - YAML-specific merge result
- Psych::Merge::ConflictResolver - YAML conflict resolution
- Psych::Merge::FreezeNode - YAML freeze block support
- Psych::Merge::DebugLogger - Psych-specific debug logging
Key Dependencies
| Gem | Role |
|---|---|
| ast-merge (~> 4.0) | Base classes and shared infrastructure |
| tree_haver (~> 5.0) | Unified parser adapter (wraps Psych) |
| psych (stdlib) | Ruby's built-in YAML parser |
| version_gem (~> 1.1) | Version management |
Parser Backend
psych-merge uses Ruby's standard library Psych parser exclusively, via TreeHaver's :psych_backend:
| Backend | Parser | Platform | Notes |
|---|---|---|---|
| :psych_backend | Psych (stdlib) | All Ruby platforms | Built into Ruby, no external dependencies |
Project Structure
lib/psych/merge/
├── smart_merger.rb       # Main SmartMerger implementation
├── file_analysis.rb      # YAML file analysis (mappings, sequences)
├── node_wrapper.rb       # AST node wrapper for Psych nodes
├── mapping_entry.rb      # Key-value pair representation
├── merge_result.rb       # Merge result object
├── conflict_resolver.rb  # Conflict resolution
├── freeze_node.rb        # Freeze block support
├── debug_logger.rb       # Debug logging
└── version.rb
spec/psych/merge/
├── smart_merger_spec.rb
├── file_analysis_spec.rb
├── node_wrapper_spec.rb
├── mapping_entry_spec.rb
└── integration/
Development Workflows
Running Tests
# Full suite (required for coverage thresholds)
bundle exec rspec
# Single file (disable coverage threshold check)
K_SOUP_COV_MIN_HARD=false bundle exec rspec spec/psych/merge/smart_merger_spec.rb
Note: Always run commands in the project root (/home/pboling/src/kettle-rb/ast-merge/vendor/psych-merge). Allow direnv to load environment variables first by doing a plain cd before running commands.
Coverage Reports
cd /home/pboling/src/kettle-rb/ast-merge/vendor/psych-merge
bin/rake coverage && bin/kettle-soup-cover -d
Key ENV variables (set in .envrc, loaded via direnv allow):
- K_SOUP_COV_DO=true - Enable coverage
- K_SOUP_COV_MIN_LINE=100 - Line coverage threshold
- K_SOUP_COV_MIN_BRANCH=82 - Branch coverage threshold
- K_SOUP_COV_MIN_HARD=true - Fail if thresholds are not met
Code Quality
bundle exec rake reek
bundle exec rake rubocop_gradual
Project Conventions
API Conventions
SmartMerger API
- merge - Returns a String (the merged YAML content)
- merge_result - Returns a MergeResult object
- to_s on MergeResult returns the merged content as a string
YAML-Specific Features
Mapping Merging:
merger = Psych::Merge::SmartMerger.new(template_yaml, dest_yaml)
result = merger.merge
Freeze Blocks:
database:
  # psych-merge:freeze
  password: custom_secret # Don't override this
  # psych-merge:unfreeze
  host: localhost
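To make the marker convention concrete, the following is a minimal sketch that locates freeze regions by scanning source lines. It is not how Psych::Merge::FreezeNode is implemented: `freeze_ranges` and the two marker patterns are hypothetical names used only for illustration, and nested or unterminated markers are simply ignored.

```ruby
# Hypothetical marker patterns; the gem's own matching rules may differ.
FREEZE_MARKER = /\A\s*#\s*psych-merge:freeze\b/
UNFREEZE_MARKER = /\A\s*#\s*psych-merge:unfreeze\b/

# Returns 1-based line ranges enclosed by freeze/unfreeze comment pairs.
def freeze_ranges(source)
  ranges = []
  open_line = nil
  source.each_line.with_index(1) do |line, number|
    if line.match?(FREEZE_MARKER)
      open_line ||= number # ignore a second freeze marker inside an open region
    elsif line.match?(UNFREEZE_MARKER) && open_line
      ranges << (open_line..number)
      open_line = nil
    end
  end
  ranges
end

source = <<~YAML
  database:
    # psych-merge:freeze
    password: custom_secret # Don't override this
    # psych-merge:unfreeze
    host: localhost
YAML

frozen = freeze_ranges(source)
```

A merge that respects freeze blocks would keep any destination line that falls inside one of these ranges, whatever the template says.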
Anchor/Alias Support:
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults
  host: prod.example.com
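As a quick illustration of why anchors and aliases can be preserved, the sketch below uses only stdlib Psych (not the psych-merge API) to show that the AST keeps the anchor name on the anchored mapping and on the alias node, while loading to Ruby objects resolves the merge key and discards both.

```ruby
require "psych"

yaml = <<~YAML
  defaults: &defaults
    timeout: 30
    retries: 3
  production:
    <<: *defaults
    host: prod.example.com
YAML

# AST view: anchor names survive on the nodes themselves.
document = Psych.parse(yaml)
anchored = document.grep(Psych::Nodes::Mapping).select(&:anchor).map(&:anchor)
aliases = document.grep(Psych::Nodes::Alias).map(&:anchor)

# Object view: the merge key is expanded and the anchor/alias structure is gone.
config = Psych.safe_load(yaml, aliases: true)
production = config["production"]
```

Working at the AST level is what allows a merge to write `<<: *defaults` back out instead of the expanded keys.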
kettle-dev Tooling
This project uses kettle-dev for gem maintenance automation:
- Rakefile: Sourced from kettle-dev template
- CI Workflows: GitHub Actions and GitLab CI managed via kettle-dev
- Releases: Use kettle-release for the automated release process
Version Requirements
- Ruby >= 3.2.0 (gemspec), developed against Ruby 4.0.1 (.tool-versions)
- ast-merge >= 4.0.0 required
- tree_haver >= 5.0.3 required
- psych (Ruby stdlib, always available)
Testing Patterns
TreeHaver Dependency Tags
All spec files use TreeHaver RSpec dependency tags for conditional execution:
Available tags:
- :psych_backend - Requires the Psych backend (always available in Ruby)
- :yaml_parsing - Requires a YAML parser (always available)
CORRECT: Use a dependency tag on describe/context/it:
RSpec.describe Psych::Merge::SmartMerger, :psych_backend do
  # Standard pattern even though Psych is always available
end
it "parses YAML", :yaml_parsing do
  # Consistent with other *-merge gems
end
WRONG: Never use manual skip checks:
before do
  skip "Requires Psych" unless defined?(Psych) # DO NOT DO THIS
end
Shared Examples
psych-merge uses shared examples from ast-merge:
it_behaves_like "Ast::Merge::FileAnalyzable"
it_behaves_like "Ast::Merge::ConflictResolverBase"
it_behaves_like "a reproducible merge", "scenario_name", { preference: :template }
Critical Files
| File | Purpose |
|---|---|
| lib/psych/merge/smart_merger.rb | Main YAML SmartMerger implementation |
| lib/psych/merge/file_analysis.rb | YAML file analysis and mapping extraction |
| lib/psych/merge/node_wrapper.rb | Psych node wrapper with YAML-specific methods |
| lib/psych/merge/mapping_entry.rb | Key-value pair abstraction |
| lib/psych/merge/debug_logger.rb | Psych-specific debug logging |
| spec/spec_helper.rb | Test suite entry point |
| .envrc | Coverage thresholds and environment configuration |
Common Tasks
# Run all specs with coverage
bundle exec rake spec
# Generate coverage report
bundle exec rake coverage
# Check code quality
bundle exec rake reek
bundle exec rake rubocop_gradual
# Prepare and release
kettle-changelog && kettle-release
Integration Points
- ast-merge: Inherits base classes (SmartMergerBase, FileAnalyzable, etc.)
- tree_haver: Wraps the Psych parser in the unified TreeHaver interface
- psych: Ruby's standard library YAML parser (libyaml binding)
- RSpec: Full integration via ast/merge/rspec and tree_haver/rspec
- SimpleCov: Coverage tracked for lib/**/*.rb; spec directory excluded
Key Insights
- Psych is always available: It's part of Ruby stdlib, but we still use TreeHaver for consistency
- MappingEntry abstraction: YAML key-value pairs are wrapped for easier manipulation
- Anchor/alias preservation: Psych AST includes anchors and aliases; we preserve them during merge
- Comment tracking: Comments are associated with nodes via CommentTracker
- Freeze blocks use # psych-merge:freeze: Language-specific comment syntax
- Document vs Stream: Psych parses into a Stream → Document → Node hierarchy; we handle all levels
- Scalar quoting: Psych provides raw scalar values; quoting style is preserved in source
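The scalar quoting point can be checked with stdlib Psych alone. The sketch below is only an illustration (the variable names are made up, and it does not touch psych-merge classes): three scalars carry the same raw value, and the AST records a different style for each.

```ruby
require "psych"

yaml = <<~YAML
  plain: value
  single: 'value'
  double: "value"
YAML

# A mapping's children alternate key, value, key, value...
mapping = Psych.parse(yaml).root
pairs = mapping.children.each_slice(2).to_a

# Raw values are identical; only the recorded style differs.
values = pairs.map { |_key, value| value.value }
styles = pairs.to_h { |key, value| [key.value, value.style] }
```

Here `styles` maps each key to one of the `Psych::Nodes::Scalar` style constants (`PLAIN`, `SINGLE_QUOTED`, `DOUBLE_QUOTED`), which is the information a merge needs in order to leave the original quoting untouched.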
Common Pitfalls
- NEVER assume all YAML is valid: Use FileAnalysis#valid? to check parse success
- NEVER use manual skip checks: Use dependency tags (:psych_backend, :yaml_parsing)
- Do NOT forget nil checks: YAML allows null values; handle them explicitly
- Do NOT load vendor gems: They are not part of this project and do not exist in CI
- Use tmp/ for temporary files: Never use /tmp or other system directories
- Do NOT chain cd with &&: Run cd as a separate command so direnv loads ENV
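The first and third pitfalls can be reproduced with stdlib Psych. This is a rough sketch of the underlying behaviour, not the FileAnalysis#valid? implementation; `yaml_syntax_error` is a hypothetical helper.

```ruby
require "psych"

# Hypothetical helper: returns nil when the source parses, else the parser's message.
def yaml_syntax_error(source)
  Psych.parse_stream(source)
  nil
rescue Psych::SyntaxError => error
  error.message
end

valid_error = yaml_syntax_error("key: value\n")
broken_error = yaml_syntax_error("key: [unclosed\n")

# A key with no value loads as nil, so "key present" and "value present" differ.
config = Psych.safe_load("password:\nhost: localhost\n")
password_present = config.key?("password")
password_value = config["password"]
```

Checking `config["password"]` for truthiness alone would treat an explicit null the same as a missing key, which is usually not what a merge wants.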
YAML-Specific Notes
Node Types in Psych
Psych::Nodes::Stream # Top-level container
Psych::Nodes::Document # YAML document (can have multiple per stream)
Psych::Nodes::Mapping # Key-value pairs (hashes)
Psych::Nodes::Sequence # Arrays/lists
Psych::Nodes::Scalar # Strings, numbers, booleans
Psych::Nodes::Alias # Reference to an anchor
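To see these node types in practice, the sketch below parses a two-document stream with stdlib Psych and counts the node classes it contains. It uses only `Psych.parse_stream` and the `Enumerable` behaviour of Psych nodes; the variable names are illustrative.

```ruby
require "psych"

yaml = <<~YAML
  ---
  name: first
  tags:
    - a
    - b
  ---
  name: second
YAML

# The stream is the top-level container; each "---" starts a new document.
stream = Psych.parse_stream(yaml)
documents = stream.children
root_types = documents.map { |document| document.root.class }

# Psych nodes are Enumerable and visit every descendant, including self.
node_counts = stream.map { |node| node.class.name.split("::").last }.tally
```

Each document's root is a Mapping here, but a root can equally be a Sequence or a Scalar, so code that walks the hierarchy should not assume a mapping at the top.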
Merge Behavior
- Mappings: Matched by key name; deeply nested mappings are traversed
- Sequences: Can be merged or replaced based on preference
- Scalars: Leaf values; matched by context (parent key)
- Anchors: Preserved; aliases remain valid after merge
- Comments: Preserved when attached to mappings/sequences
- Freeze blocks: Protect customizations from template updates
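The key-matching rule for mappings can be sketched on plain Ruby hashes. This is a deliberately simplified stand-in, not the psych-merge algorithm: it works on loaded data rather than the AST, so comments, anchors, and formatting are lost, and sequences are always replaced wholesale. `deep_merge_by_key` and its `prefer:` option are hypothetical names.

```ruby
require "psych"

# Hypothetical helper: matches entries by key, recurses into nested mappings,
# and lets the preferred side win for scalars and sequences.
def deep_merge_by_key(template, destination, prefer: :destination)
  template.merge(destination) do |_key, template_value, destination_value|
    if template_value.is_a?(Hash) && destination_value.is_a?(Hash)
      deep_merge_by_key(template_value, destination_value, prefer: prefer)
    else
      prefer == :template ? template_value : destination_value
    end
  end
end

template = Psych.safe_load(<<~YAML)
  database:
    host: localhost
    pool: 5
  cache:
    - memory
YAML

destination = Psych.safe_load(<<~YAML)
  database:
    host: db.internal
  cache:
    - redis
YAML

merged = deep_merge_by_key(template, destination)
template_wins = deep_merge_by_key(template, destination, prefer: :template)
```

With the default preference, `database.pool` is added from the template while the destination keeps its own `host` and `cache`; with `prefer: :template`, the conflicting values come from the template instead.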
MappingEntry Structure
entry = Psych::Merge::MappingEntry.new(
  key: key_wrapper,     # NodeWrapper for key
  value: value_wrapper, # NodeWrapper for value
  lines: lines,
  comment_tracker: tracker
)
entry.key_name # String key name
entry.value_node # Access wrapped value node
entry.start_line # Line number in source
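For intuition about what such an entry captures, here is a small stand-in built on stdlib Psych. `PairSketch` and `pairs_for` are hypothetical and are not the gem's MappingEntry; the sketch assumes scalar keys and converts Psych's zero-based `start_line` to a 1-based line number, which may or may not match the gem's convention.

```ruby
require "psych"

# Hypothetical stand-in for a key-value entry; not Psych::Merge::MappingEntry.
PairSketch = Struct.new(:key_name, :value_node, :start_line, keyword_init: true)

# Pairs up a mapping's alternating key/value children (scalar keys assumed).
def pairs_for(mapping)
  mapping.children.each_slice(2).map do |key, value|
    PairSketch.new(
      key_name: key.value,
      value_node: value,
      start_line: key.start_line + 1 # Psych line numbers are zero-based
    )
  end
end

yaml = <<~YAML
  name: demo
  database:
    host: localhost
YAML

pairs = pairs_for(Psych.parse(yaml).root)
key_names = pairs.map(&:key_name)
start_lines = pairs.map(&:start_line)
```

Keeping the value as a node rather than a loaded object is what lets nested mappings be paired up and merged recursively.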