January 4, 2024

Sum values per group (of a text/data file) from the command line with Datamash

Install Datamash

We will use Datamash (copyleft-licensed) to sum values per group in a text (data) file. You might need to install it from the repository of your distribution; it was not preinstalled on the one I use.
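
As a rough sketch, the check below looks for datamash on your PATH and prints an install hint if it is missing. The apt command in the hint is an assumption for Debian-based distributions; use your own distribution's package manager otherwise.

```shell
# Check whether datamash is already available before trying to install it.
if command -v datamash >/dev/null 2>&1; then
    datamash_status="installed"
    echo "datamash found at: $(command -v datamash)"
else
    datamash_status="missing"
    # Assumption: a Debian-based distribution with apt.
    echo "datamash not found; try: sudo apt install datamash"
fi
```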

Sum values per group

Once you have Datamash installed, this is the content of the file we want to process:

$ cat example.txt
amount name
02 Bob
12 Bob
13 Alice
03 Bob
43 Alice

To sum all amounts by name, run:

$ datamash -WHs -g name sum amount < example.txt 

The above command tells Datamash to sum the values in the amount field, grouped by name. The options tell Datamash to treat whitespace as the field delimiter (-W), to read the first line as headers and print a header line in the output (-H), and to sort the input before grouping (-s).
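
If the short options are hard to remember, the same command can be spelled with long options. The sketch below recreates the sample file in a scratch directory (the workdir and summary variable names are only for illustration) and runs the long-option form.

```shell
# Recreate the sample file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
amount name
02 Bob
12 Bob
13 Alice
03 Bob
43 Alice
EOF

# Long-option spelling of: datamash -WHs -g name sum amount
#   --whitespace  fields are separated by runs of spaces and/or tabs
#   --headers     the first line holds column names; print a header line too
#   --sort        sort the input by the grouping column before grouping
summary=$(datamash --whitespace --headers --sort --group name sum amount \
    < "$workdir/example.txt") || echo "datamash failed or is not installed" >&2
printf '%s\n' "$summary"
```

The output fields are separated by tabs, so the result can be piped straight into another Datamash command or pasted into a spreadsheet.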

The command should print:

GroupBy(name)   sum(amount)
Alice   56
Bob     17

See $ man datamash for more examples.

Troubleshooting

Run $ datamash check < file.txt if the output is not what you expect. It verifies that every line has the same number of fields and reports the line and field counts.

$ datamash check < example.txt 
6 lines, 2 fields
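
To see what a failed check looks like, here is a small sketch using a hypothetical file, broken.txt, whose last line is missing a field. Datamash splits fields on tabs by default, so -W is passed to check because this file is separated by spaces.

```shell
# Hypothetical malformed file: the third line has no name field.
workdir=$(mktemp -d)
printf 'amount name\n02 Bob\n12\n' > "$workdir/broken.txt"

# check exits with a non-zero status when lines differ in field count.
if datamash -W check < "$workdir/broken.txt"; then
    check_status="consistent"
else
    check_status="inconsistent"
fi
echo "$check_status"
```

Datamash also prints a message naming the offending line, which is usually enough to find the bad row.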

Concluding note

You can also build pivot tables with Datamash by using the crosstab operation.
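
A minimal sketch of crosstab, assuming hypothetical tab-separated data with three fields (name, month, amount) and no header line. Names become rows, months become columns, and each cell holds the sum of the amounts.

```shell
# Hypothetical tab-separated data: name, month, amount.
sales=$(printf 'Bob\tJan\t2\nBob\tFeb\t12\nAlice\tJan\t13\nBob\tJan\t3\nAlice\tFeb\t43\n')

# Rows come from field 1, columns from field 2, cells are the sum of field 3.
# -s sorts the input first, as crosstab groups like the other operations.
pivot=$(printf '%s\n' "$sales" | datamash -s crosstab 1,2 sum 3) \
    || echo "datamash failed or is not installed" >&2
printf '%s\n' "$pivot"
```

Without an operation such as sum 3, crosstab counts how many lines fall into each cell.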

Datamash was written by Assaf Gordon.


This work © 2016-2024 by yctct is licensed under CC BY-ND 4.0.