Use mruby to evaluate expressions in samtools!
git clone https://github.com/kojix2/samtools-mruby
cd samtools-mruby
make
samtools/samtools tanuki # check if it works
__,-─-、__
(〆-─-ヽ)
( ´・ω・` )
/ ,r‐‐‐、ヽ
し l x )J
_.'、 ヽ ノ.人
(_((__,ノU´U. (酒)
Tanuki in mruby (3.3.0)
Rake is required to build mruby.
If you are using conda to install Ruby, set the LD environment variable:
rake LD=/usr/bin/gcc MRUBY_CONFIG=$(pwd)/mruby_build_config.rb -f $(pwd)/mruby/Rakefile
The samtools-mruby project allows you to use mruby expressions to manipulate and analyze BAM files. The available variables include:
- endpos: Alignment end position (1-based)
- flags: Combined FLAG field
- paired, proper_pair, unmap, munmap, reverse, mreverse, read1, read2, secondary, qcfail, dup, supplementary
  - These can be used with or without the ? suffix, e.g., paired and paired? are equivalent.
- hclen: Number of hard clipped bases
- library: Library (LB header via RG)
- mapq: Mapping quality
- mpos: Synonym for pnext
- mrefid: Mate reference number (0 based)
- mrname: Synonym for rnext
- ncigar: Number of CIGAR operations
- pnext: Mate's alignment position (1-based)
- pos: Alignment position (1-based)
- qlen: Alignment length: no. query bases
- qname: Query name
- qual: Quality values (raw, 0-based)
- refid: Integer reference number (0 based)
- rlen: Alignment length: no. reference bases
- rname: Reference name
- rnext: Mate's reference name
- sclen: Number of soft clipped bases
- seq: Sequence
- tlen: Template length (insert size)
- tag: XX tag value
These variables enable detailed data manipulation and analysis.
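To make the relationship between flags and the individual flag variables concrete, here is a minimal sketch in plain Ruby (not run through samtools). The bit values are those of the SAM specification; the FLAG_BITS table and the decode_flags helper are hypothetical names used only for illustration and are not part of samtools-mruby.

```ruby
# SAM FLAG bits, keyed by the variable names listed above.
FLAG_BITS = {
  paired:        0x1,
  proper_pair:   0x2,
  unmap:         0x4,
  munmap:        0x8,
  reverse:       0x10,
  mreverse:      0x20,
  read1:         0x40,
  read2:         0x80,
  secondary:     0x100,
  qcfail:        0x200,
  dup:           0x400,
  supplementary: 0x800
}.freeze

# Hypothetical helper: names of all bits set in a combined FLAG value.
# In Ruby, & binds tighter than !=, so this reads as (flags & bit) != 0.
def decode_flags(flags)
  FLAG_BITS.select { |_name, bit| flags & bit != 0 }.keys
end

p decode_flags(99) # 99 = 0x1 + 0x2 + 0x20 + 0x40
p decode_flags(4)  # only the unmap bit is set
```

Under this reading, a predicate such as proper_pair? corresponds to testing bit 0x2 of flags.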
- Basic Usage: Output read name and sequence in green.
  samtools view -E 'puts qname.ljust(13) + seq.green' htslib/test/colons.bam
- Pattern Highlighting: Use regular expressions.
  samtools view -E 'puts qname.ljust(13) + seq.gsub(/CG/, &:red)' htslib/test/colons.bam
- Flag Methods: Access SAM flag information. Flag methods can be used with or without the ? suffix. For example, paired and paired? are equivalent.
  samtools view -E 'puts "#{qname} is paired" if paired?' example.bam
- Tag Access: Retrieve BAM tags.
  samtools view -E 'puts "NM:#{tag("NM")}" if tag("NM")' example.bam
- Custom Filtering: Use expressions for filtering.
  samtools view -E 'puts qname if proper_pair?' example.bam
  # Equivalent, testing the FLAG bit directly:
  # samtools view -E 'puts qname if flags & 0x2 != 0' example.bam
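The green and red methods in the highlighting examples color terminal output. How samtools-mruby implements them is not shown here; the sketch below is plain Ruby with hypothetical stand-ins that wrap text in standard ANSI escape sequences, just to show what the two expressions compute. The qname and seq values are made up.

```ruby
# Hypothetical stand-ins for the color helpers; not the project's actual code.
class String
  # Wrap the string in the ANSI code for red, then reset.
  def red
    "\e[31m#{self}\e[0m"
  end

  # Wrap the string in the ANSI code for green, then reset.
  def green
    "\e[32m#{self}\e[0m"
  end
end

qname = "read1"   # made-up query name
seq   = "ACGTCGA" # made-up sequence

# Whole sequence in green, after a name column padded to 13 characters.
puts qname.ljust(13) + seq.green

# Only the CG matches in red: gsub passes each match to String#red.
puts qname.ljust(13) + seq.gsub(/CG/, &:red)
```

Passing &:red turns the method name into a block, so every match of /CG/ is replaced by its colored version while the rest of the sequence is left unchanged.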
- Local variables: Defined inside an expression; they do not persist across records.
- Global variables: Defined with a $ prefix; they persist across records.
Example to count mapped reads:
samtools view -E '$count ||= 0; $count += 1 unless unmap?; END { puts $count }' example.bam
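The difference between the two kinds of variables can be imitated in plain Ruby. This is only an analogy, assuming the expression behaves like a method body that runs once per record; the process method and the record hashes are hypothetical and do not reflect how samtools-mruby evaluates expressions internally.

```ruby
# Hypothetical per-record handler standing in for the -E expression.
def process(rec)
  seen = 0                        # local: starts from scratch on every call
  seen += 1
  $count ||= 0                    # global: initialized once, then kept
  $count += 1 unless rec[:unmap]  # count only mapped records
  seen
end

# Made-up records; only the unmap flag matters here.
records = [{ unmap: false }, { unmap: true }, { unmap: false }]

$count = nil
p records.map { |rec| process(rec) } # the local never exceeds 1
p $count                             # the global accumulates across calls
```

The local counter is always 1 when the handler returns, while $count ends at 2, the number of mapped records.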
The samtools-mruby project integrates mruby into samtools, allowing for enhanced functionality through mruby expressions. This integration includes:
- Makefile Modifications: Added support for mruby by including MRBDIR, MRB_CPPFLAGS, and MRB_LDFLAGS. Updated object files list to include tanuki.o and sam_view_mruby.o.
- bamtk.c: Introduced a new tanuki command, which displays a tanuki using mruby.
- sam_view.c: Added support for evaluating mruby expressions with a new mruby_expr field in settings. Integrated mruby initialization and finalization.
- New Files:
  - sam_view_mruby.c: Implements methods for interacting with BAM records using mruby.
  - sam_view_mruby.h: Header file for sam_view_mruby.c.
  - tanuki.c: Contains the implementation for the tanuki command.
To see changes made to the original samtools repository:
git -C samtools diff origin/develop...origin/mruby
- The mruby and htslib directories are submodules.
- samtools is based on the mruby branch of the kojix2 repository.
- The tanuki subcommand distinguishes between standard and mruby-enhanced samtools.
Send pull requests to the mruby branch of my samtools repository.
- MIT License
- This tool was created with extensive use of code generators such as ChatGPT and Copilot.