During a recent incident response, I had to take a very large text file (70,000 lines, about 25,000 printed pages), query it, and redact information before passing the results to the incident response team. With some command-line processing I was able to redact the personal identifiers (MAC addresses and usernames) except for the ones in question.
I have included the process below. While it can certainly be improved, it is a prime example of the kinds of tasks that may be required on short notice during an incident response.
It is incredibly important to test and retest both your code and the resulting output, to ensure that you do not alter the original logs and that you do not fail to redact appropriately (a failure that may violate several federal compliance requirements). Also make sure you provide a copy of your code and process to the incident response team for independent review.
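One way to check the first of those points is to hash the original log before and after processing. The following is only a minimal sketch and is separate from the process below: it assumes GNU coreutils (`sha256sum`) and GNU sed, and it builds a small hypothetical sample log in a temporary directory instead of touching the real Preservation.txt.

```shell
#!/bin/bash
# Sketch: confirm that redaction only touches derived files, never the original log.
# The sample line and the temporary directory are hypothetical stand-ins.
workdir=$(mktemp -d)
printf '%s\n' '2019-02-06 1200 User[jane.doe] MAC[11:22:33:44:55:66]' > "$workdir/Preservation.txt"

# Hash the original before any processing.
before=$(sha256sum "$workdir/Preservation.txt" | awk '{print $1}')

# Query into a separate file and redact only that copy.
grep 1200 "$workdir/Preservation.txt" > "$workdir/search1200.txt"
sed -i -E 's/User\[[^]]*\]/User[*******]/g' "$workdir/search1200.txt"

# Hash the original again; the two values must match.
after=$(sha256sum "$workdir/Preservation.txt" | awk '{print $1}')
if [ "$before" = "$after" ]; then
    integrity="unchanged"
else
    integrity="MODIFIED"
fi
echo "Preservation.txt is $integrity"
```

Keeping the recorded hash alongside the code you hand over gives the reviewers a simple way to confirm they are looking at the same source file.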
#!/bin/bash
# Code to query and redact file called Preservation.txt
# Search the log for lines containing the string 1200 on two specific dates and write them to a file
grep 1200 Preservation.txt | grep -e 2019-02-06 -e 2019-02-07 > search1200.txt
# Search the log for lines containing the string 2400 on any date matching 2019-04- and write them to a file
grep 2400 Preservation.txt | grep 2019-04- > search2400.txt
# Search the log for lines containing the string 3100 on two specific dates, or on any date matching 2019-04, and write them to a file
grep 3100 Preservation.txt | grep -e 2019-02-06 -e 2019-02-07 -e 2019-04 > search3100.txt
# Replace the first.lastname of interest in all three files with a placeholder that the redaction will ignore
# (the dot is escaped so that it matches only a literal ".")
sed -i -E 's/User\[first\.lastname\]/xUser/g' search*
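# Redact all usernames in the three files
# sed has no lazy quantifiers, so .*? is still greedy and would swallow everything up to the last ] on the line;
# [^]]* stops at the first closing bracket instead
sed -i -E 's/User\[[^]]*\]/User[*******]/g' search*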
# Restore the first.lastname of interest in all three files
sed -i -E 's/xUser/User\[first.lastname\]/g' search*
# Replace the three MAC addresses of interest in all three files with placeholders that the redaction will ignore
sed -i -E 's/MAC\[AA:AA:AA:AA:AA:AA\]/xMAC/g' search*
sed -i -E 's/MAC\[BB:BB:BB:BB:BB:BB\]/x2MAC/g' search*
sed -i -E 's/MAC\[CC:CC:CC:CC:CC:CC\]/x3MAC/g' search*
# Redact the first four octets of every MAC address, leaving only the last two octets (four hex digits) visible
sed -i -E 's/MAC\[[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+/MAC[*:*:*:*/g' search*
# Restore the three MAC addresses of interest in all three files
sed -i -E 's/xMAC/MAC\[AA:AA:AA:AA:AA:AA\]/g' search*
sed -i -E 's/x2MAC/MAC\[BB:BB:BB:BB:BB:BB\]/g' search*
sed -i -E 's/x3MAC/MAC\[CC:CC:CC:CC:CC:CC\]/g' search*
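Before handing the files over, it is worth checking mechanically that no unexpected identifier survived the redaction. The following is a minimal sketch, assuming every username appears as `User[...]` and every MAC address as `MAC[...]`, as in the commands above. The sample file contents are hypothetical and include one deliberately unredacted username so the check has something to find.

```shell
#!/bin/bash
# Sketch: list any identifier that is neither masked nor on the allow-list.
# The file contents below are hypothetical sample data.
workdir=$(mktemp -d)
cat > "$workdir/search1200.txt" <<'EOF'
2019-02-06 1200 User[*******] MAC[*:*:*:*:55:66]
2019-02-06 1200 User[first.lastname] MAC[AA:AA:AA:AA:AA:AA]
2019-02-07 1200 User[john.smith] MAC[*:*:*:*:77:88]
EOF

# Pull out every User[...] token, then drop the masked form and the user of interest.
leaked_users=$(grep -ohE 'User\[[^]]*\]' "$workdir"/search*.txt \
    | grep -vFx -e 'User[*******]' -e 'User[first.lastname]' \
    | sort -u)

# Pull out every MAC[...] token, then drop masked addresses and the three of interest.
leaked_macs=$(grep -ohE 'MAC\[[^]]*\]' "$workdir"/search*.txt \
    | grep -vE '^MAC\[\*:\*:\*:\*:[0-9a-fA-F]+:[0-9a-fA-F]+\]$' \
    | grep -vFx -e 'MAC[AA:AA:AA:AA:AA:AA]' -e 'MAC[BB:BB:BB:BB:BB:BB]' -e 'MAC[CC:CC:CC:CC:CC:CC]' \
    | sort -u)

# Anything printed here still needs redaction.
echo "Unredacted usernames: ${leaked_users:-none}"
echo "Unredacted MAC addresses: ${leaked_macs:-none}"
```

Both lists should come back empty on the real output. A check like this does not replace manual review, since it only catches identifiers that follow the assumed `User[...]` and `MAC[...]` formats.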