🤓 Let's talk about sorting in AWK 🤯
As you might know, arrays in AWK are backed by hashtables. This means traversing them (e.g. via `for (i in arr)`) does not visit the keys in any particular order.
💡How can we still ensure the output is sorted (in GNU AWK)?
A thread.. ⬇️
🤓 How to join two files? 🧑🎓
Say we have a list with ids and their names, and one with ids and their ages. How do we join them by id?
> awk 'ARGIND==1{ages[$1]=$2} ARGIND==2{print $2, ages[$1]}' ages.txt names.txt
anne 56
bob 23
chad 11
🤯❓ --> Thread ⬇️
#Linux #programming
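💡 `ARGIND` is a GNU AWK extension. A portable sketch of the same join uses the common `NR == FNR` idiom instead (a swapped-in alternative, not the command from the tweet). The function name `join_by_id` and the file contents are assumptions for illustration.

```shell
# usage: join_by_id ages.txt names.txt
join_by_id() {
  awk '
    NR == FNR { ages[$1] = $2; next }   # first file: remember the age for each id
    { print $2, ages[$1] }              # second file: print name plus stored age
  ' "$1" "$2"
}

dir=$(mktemp -d)
printf '1 56\n2 23\n3 11\n'     > "$dir/ages.txt"    # id age (made-up sample data)
printf '1 anne\n2 bob\n3 chad\n' > "$dir/names.txt"  # id name (made-up sample data)
join_by_id "$dir/ages.txt" "$dir/names.txt"
# anne 56
# bob 23
# chad 11
```

`NR == FNR` is only true while the first file is being read, because `NR` keeps counting across files while `FNR` restarts for each one. The idiom misfires if the first file is empty.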
📚 Weekend Read 📚
Many years ago, @WardCunningham wrote a beautiful AWK script to calculate the expenses of their ski trip.
https://t.co/N8K6BVDFyx
💡 This is AWK at its best.
👩🎨 How can we substitute strings in files? 👨🎨
💡 Use sub:
> awk '{sub("ch", "xx")} 1' file.txt
➡️ Replaces the string "ch" with "xx"
➡️ But only the first occurrence
➡️ gsub replaces all occurrences
🤔 But how does this work? ⬇️
#Linux #AWK #programming
💡If you are like me, you cannot remember yesterday's FPAT.
Put this into your ~/.bashrc, ~/.zshrc (etc..):
alias csvawk='awk -v FPAT="([^,]*)|(\"[^\"]+\")"'
After reloading:
> csvawk '{print $3}' file.txt
"Berlin, Germany"
You're welcome 😜
#Linux #AWK
👩🎨Parsing CSV with (GNU) AWK👨🎨
Simple CSVs can be handled with the field separator: `awk -F,`
🤔What about something like this?
"anna",8,"Berlin, Germany"
💡FPAT to the rescue!
> awk -v FPAT='([^,]*)|("[^"]+")' '{print $3}' file.txt
"Berlin, Germany"
#Linux #AWK #GNU
👩🎓 What if all data is on one line? 🤔
E.g. 'anna=10;bob=5;chad=7;danny=3'
Can we still use #AWK?
Of course! We just tell AWK that records are separated by ';' instead of a newline 🤓
💡awk -v RS=';' -F= '{print $1}'
➡️ RS is the record separator. By default this is \n
#Linux #codingisfun
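A runnable sketch of the idea: with `RS=';'` every `name=value` pair becomes its own record, and `-F=` splits it into `$1` and `$2`. The function names are made up for illustration.

```shell
names()      { awk -v RS=';' -F= '{ print $1 }'; }                  # one name per record
sum_values() { awk -v RS=';' -F= '{ sum += $2 } END { print sum }'; }

printf '%s' 'anna=10;bob=5;chad=7;danny=3' | names
# anna
# bob
# chad
# danny
printf '%s' 'anna=10;bob=5;chad=7;danny=3' | sum_values
# 25
```

If the input ends with a newline (for example from `echo`), that newline becomes part of the last record, so printing `$2` of the last record would also print it.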
🤓 Day 2 of #AdventOfCode screams for #AWK 😱
But how can we do it in practice?
💡 the field separator accepts a regex. This allows us to split e.g. "1-3 a: abcde" into fields
💡 the builtin function `gsub` returns the number of substitutions it made
#AdventOfCodeSpoilers #Linux
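A spelled-out sketch of how these two hints might combine. With `FS="[- :]"` the line "1-3 a: abcde" splits into `$1=1`, `$2=3`, `$3=a`, an empty `$4` (between ":" and the space) and `$5=abcde`. The function name `count_valid` and the variable names are made up; the three input lines are the puzzle's sample.

```shell
count_valid() {
  awk -F'[- :]' '
    { n = gsub($3, "", $5) }          # how often the letter occurs in the password
    n >= $1 && n <= $2 { valid++ }    # count lines where that number is within range
    END { print valid + 0 }           # + 0 prints 0 instead of an empty line
  '
}

printf '1-3 a: abcde\n1-3 b: cdefg\n2-9 c: ccccccccc\n' | count_valid
# 2
```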
😇 Enter trick 77: 'x=1{}x': works 📈
🤔Knowing this, we can rewrite the original solution to:
'BEGIN{FS="[- :]"}_=gsub($3,"",$5){}_>=$1&&_<=$2{x++}END{print x}'
This reduces the number of characters from 73 to 65. Can you come up with a shorter one? 🤓
🤔 Change a space-separated file to comma-separated:
> awk -v OFS=, '{$1=$1}1' file.txt
➡️OFS is the output field separator; here it is changed from a space (default) to a comma. The assignment $1=$1 makes AWK rebuild the line with the new separator.
> awk -v OFS=, '{$1=$1; print $0}' file.txt
➡️same, but more explicit.
#linux #cli #awk #programming
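A small sketch of why the `$1=$1` is needed: OFS is only applied when AWK rebuilds `$0`, which happens after a field is assigned. The function names and the sample line are made up for illustration.

```shell
without_rebuild() { awk -v OFS=, '1'; }               # $0 is printed untouched
with_rebuild()    { awk -v OFS=, '{ $1 = $1 } 1'; }   # assignment rebuilds $0 using OFS

echo 'anna 10 berlin' | without_rebuild   # anna 10 berlin
echo 'anna 10 berlin' | with_rebuild      # anna,10,berlin
```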
🤓 How to print unique lines by specific field in #AWK 🤓
> awk '!_[$3]++' file.txt
➡️ Prints all lines where field 3 was not seen before
🤔Magic? No! Read on ⬇️
#Linux #CLI #DevOps
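A longhand sketch of what the one-liner does: `_` is just an array name, `_[$3]++` yields the old count (0 the first time) and then increments it, and `!` turns that 0 into "true", which triggers the default print. The function name `uniq_by_field3`, the array name `seen` and the sample input are made up for illustration.

```shell
uniq_by_field3() {
  awk '{
    if (seen[$3] == 0) print   # unset entries compare equal to 0: first sighting
    seen[$3]++                 # remember that this value has been seen
  }'
}

printf 'a 1 x\nb 2 y\nc 3 x\n' | uniq_by_field3
# a 1 x
# b 2 y
```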
🤓 You can write any number of patterns and blocks.
🤓 Every pattern-block combo is run on each line
> awk '$1 == "anna" { print $2 } $2 == 5 { print $1 }' file.txt
➡️ Prints field 2 where field 1 is "anna" and also prints field 1 where field 2 is 5
#linux #awk #devops
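A quick sketch with made-up input, showing that a line matching both patterns runs both blocks, in the order they were written. The function name `both_rules` is hypothetical.

```shell
both_rules() { awk '$1 == "anna" { print $2 } $2 == 5 { print $1 }'; }

printf 'anna 5\nbob 5\nanna 7\n' | both_rules
# 5      (line 1 matches the first rule)
# anna   (line 1 also matches the second rule)
# bob    (line 2 matches only the second rule)
# 7      (line 3 matches only the first rule)
```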
🤓 AWK has a lot of defaults, which makes it hard to read unless you know them.
> awk '{print $0}' file.txt
➡️ no pattern: print every line
> awk '1' file.txt
➡️if there is only a pattern, the default is to print
#AWK #Linux #shell #magic #cli
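A small sketch of these defaults side by side. The first three spellings print every line; the last one shows a pattern without a block doing real work. The function names and the sample input are made up for illustration.

```shell
print_explicit() { awk '{ print $0 }'; }   # no pattern: the block runs on every line
print_bare()     { awk '{ print }'; }      # print without arguments prints $0
print_short()    { awk '1'; }              # pattern only: the default block is print
skip_header()    { awk 'NR > 1'; }         # pattern only: prints all but the first line

printf 'name age\nanna 10\nbob 5\n' | skip_header
# anna 10
# bob 5
```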