The best grep tool in the world: ripgrep

19 June 2018   2 comments   Linux, Web development, MacOSX

https://github.com/BurntSushi/ripgrep

tl;dr: ripgrep (aka rg) is the best tool to grep today.

ripgrep is a tool for searching files. Its killer feature is that it's fast. Like, really, really fast. Faster than sift, git grep, ack, regular grep, etc.

If you don't believe me, either read this detailed blog post from its author or just jump straight to the conclusion:

[Image: Benchmark]

I used to use git grep whenever I was inside a git repo and sift for everything else. That alone was a huge step up from regular grep. Granted, almost all my git repos are small enough that regular git grep usually finishes faster than I can perceive. But with ripgrep I can just add --no-ignore-vcs and it also searches the files that .gitignore excludes. That's useful when you want to search your own source as well as the files in node_modules.
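
To make the --no-ignore-vcs behaviour concrete, here is a minimal sketch. The scratch project, its file names and the search term leftpad are all made up for illustration; only the rg flags are real. It passes `.` explicitly so that rg searches the directory even when stdin is not a terminal.

```shell
# Hypothetical scratch project; the file names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/node_modules/leftpad"
printf 'node_modules/\n' > "$demo/.gitignore"
printf 'const pad = require("leftpad");\n' > "$demo/src/app.js"
printf 'module.exports = function leftpad() {};\n' > "$demo/node_modules/leftpad/index.js"

if command -v rg >/dev/null 2>&1 && command -v git >/dev/null 2>&1; then
  # ripgrep only honours .gitignore inside a git repository.
  git init -q "$demo"
  # Default: node_modules/ is skipped because .gitignore lists it.
  default_hits=$(cd "$demo" && rg --files-with-matches leftpad . | sort)
  # With --no-ignore-vcs the ignored directory is searched as well.
  all_hits=$(cd "$demo" && rg --files-with-matches --no-ignore-vcs leftpad . | sort)
  printf 'default:\n%s\n\nwith --no-ignore-vcs:\n%s\n' "$default_hits" "$all_hits"
else
  echo "rg or git not found; install both to run this demo"
fi
```

The first search should list only the file under src, while the second should also list the one under node_modules.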

Installation is easy. I installed it with brew install ripgrep, and the best way to learn how to use it is rg --help. It has a lot of cool features that are well worth learning. It's written in Rust and so far I haven't had a single crash. The ability to search by file type takes some getting used to (tip: use rg --type-list), and remember that you can pipe rg output to another rg. For example, to find all lines that contain both query and string you can use rg query | rg string.
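
Here is a small sketch of both tips, chaining two rg calls and filtering by file type. The sample files and their contents are invented for the demo; `py` is one of the type names that rg --type-list prints.

```shell
# Hypothetical sample files; the contents are made up for the demo.
demo=$(mktemp -d)
printf 'query only\nquery and string together\nstring only\n' > "$demo/notes.txt"
printf '# query string in python\n' > "$demo/script.py"

if command -v rg >/dev/null 2>&1; then
  # The first rg keeps lines containing "query"; the second reads those
  # lines from stdin and keeps the ones that also contain "string".
  both=$(rg --no-filename query "$demo" | rg string | sort)
  # --type restricts the search to files of one known type.
  py_only=$(rg --files-with-matches --type py query "$demo")
  printf 'both words:\n%s\n\npython files only:\n%s\n' "$both" "$py_only"
else
  echo "rg not found; install ripgrep to run this demo"
fi
```

Chaining like this matches the two words in any order on the same line. The sort is only there because rg searches files in parallel, so the order of results across files is not guaranteed.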

Comments

ROGER DELOY PACK

Nooo...you can't say "to grep with..." noooo...

Georgi

>For both searching single files ..., no other tool obviously stands above ripgrep in either performance or correctness.

No, for searching literals (i.e. exact matching) Kazahana is superior. Why? Because the tool has the fastest memmem() function.
A quick example, on a laptop with an i7-3630QM:

```
Testfile: 13,113,340,782 OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt

Benchmarking literal (Exact Matching) "Now, Hercules and his little friends won't stand a chance" ...

F:\grep_vs_Kazahana>timer32.exe "Kazahana_Monad_GCC_472_SSE41_32bit.exe" "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt" 520123

Kernel Time = 3.656 = 66%
User Time = 1.796 = 32%
Process Time = 5.453 = 99% Virtual Memory = 512 MB
Global Time = 5.475 = 100% Physical Memory = 513 MB

F:\grep_vs_Kazahana>timer32.exe "Kazahana_Monad_GCC_730_SSE41_64bit.exe" "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt" 520123

Kernel Time = 3.718 = 75%
User Time = 1.203 = 24%
Process Time = 4.921 = 99% Virtual Memory = 511 MB
Global Time = 4.940 = 100% Physical Memory = 512 MB

F:\grep_vs_Kazahana>timer32.exe "Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_Trolldom_MONAD-Thread_IntelV15_SSE41_64bit.exe" "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt" 520123

Kernel Time = 3.671 = 74%
User Time = 1.265 = 25%
Process Time = 4.937 = 100% Virtual Memory = 511 MB
Global Time = 4.911 = 100% Physical Memory = 512 MB

F:\grep_vs_Kazahana>timer32.exe "Kazahana_Hexadecad_GCC_730_SSE41_64bit.exe" "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt" 520123

Kernel Time = 3.718 = 80%
User Time = 5.015 = 108%
Process Time = 8.734 = 188% Virtual Memory = 513 MB
Global Time = 4.627 = 100% Physical Memory = 514 MB

F:\grep_vs_Kazahana>timer32.exe "Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_Trolldom_HEXADECAD-Threads_IntelV15_SSE41_64bit.exe" "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt" 520123

Kernel Time = 7.671 = 137%
User Time = 30.140 = 539%
Process Time = 37.812 = 677% Virtual Memory = 515 MB
Global Time = 5.582 = 100% Physical Memory = 514 MB

F:\grep_vs_Kazahana>set LC_ALL=C

F:\grep_vs_Kazahana>timer32.exe grep.exe -F -c "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt"
1

Kernel Time = 3.375 = 19%
User Time = 14.375 = 80%
Process Time = 17.750 = 99% Virtual Memory = 2 MB
Global Time = 17.762 = 100% Physical Memory = 5 MB

F:\grep_vs_Kazahana>timer32.exe "ripgrep-11.0.1-x86_64-pc-windows-gnu.exe" -c "Now, Hercules and his little friends won't stand a chance" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt"
1

Kernel Time = 2.953 = 42%
User Time = 4.000 = 57%
Process Time = 6.953 = 100% Virtual Memory = 26 MB
Global Time = 6.948 = 100% Physical Memory = 4096 MB
```

