Navigation Bar

Thursday, June 30, 2016

Grouping functions using AWK

The below script is an example of printing the sum of salary employee wise in a file and the number of records in the file for each employee.


emp_sal.txt
EmpName,MonthandYear,Salary
aaaaa,Jan-2016,25000
bbbbb,Jan-2016,30000
cccccccc,Jan-2016,16000
aaaaa,Feb-2016,25000
bbbbb,Feb-2016,30000
cccccccc,Feb-2016,16000

The file contains the names of employees in a firm with monthly salary details. If you want to print sum of salary employee wise and count of records for each employee below is a simple and effective awk statement which will do the task for you

awk -F"," '{arr[$1]=arr[$1]+$3;arr1[$1]++}END {for (a in arr) print a, arr[a],  arr1[a]}' emp_sal.txtr

For every record, "arr" array element sums up the salary component in column 3($3) employee wise (arr[$1]).
"arr1" element collects count of records employee wise (arr1[$1]. ++ increments array counter for each employee name matched.

Many a practical applications have a comma separated file with fields enclosed in double quote ".
In such cases the script has to be tweaked a bit as given below

"EmpName","MonthandYear","Salary"
"aaaaa","Jan-2016","25000"
"bbbbb","Jan-2016","30000"
"cccccccc","Jan-2016","16000"
"aaaaa","Feb-2016","25000"
"bbbbb","Feb-2016","30000"
"cccccccc","Feb-2016","16000"

awk -F"\",\"" 'NR>1{arr[$1]=arr[$1]+$3;arr1[$1]++}END {for (a in arr) print substr(a,2), arr[a],  arr1[a]}' emp_sal.txt
 

Quote for the day
“Whether You Think You Can Or Think You Can’t, You’re Right.”- Henry Ford

No comments:

Post a Comment