
Linux: Merging multiple CSV files into one based on filename

Question Detail

I am trying to develop a bash script which merges multiple .csv files into one. This script does the job for me:

awk '
    FNR == 1 && NR != 1 {next} # skip all headers except the first one
    /^,*$/ {next}              # skip all empty CSV rows
    !seen[$0]++                # print uniq records
' *.csv > demo_merged.csv

Except that I'm trying to filter the files by filename. I have hundreds of files named in the following format: demo_YYYYMMDD.csv

demo_20220301.csv
demo_20220306.csv
demo_20220211.csv
demo_20220321.csv
demo_20220101.csv
demo_20220331.csv
demo_20220301_1.csv
demo_20220301_2.csv

Here I need to select all the files for March 2022 (i.e. only those files whose names match demo_202203DD.csv) and merge them into one .csv file named demo.20220331_merged.csv (the date should be the last day of March).

Basically, the output file should contain only the contents of the following files:

demo_20220301.csv
demo_20220306.csv
demo_20220321.csv
demo_20220331.csv

What should I add in the script for this filter?
Thanks in advance!

Question Answer

Instead of *.csv, just use demo_202203??.csv. Each ? matches exactly one character, so that glob pattern gives you all the files for March except the ones with a _1 or _2 suffix.
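To make the suggestion concrete, here is a minimal sketch of the question's awk script with the glob swapped in. The merge_month function and its three arguments (directory, YYYYMM, output file) are hypothetical names added for illustration and are not part of the original post. It assumes at least one file matches the pattern; otherwise the shell passes the unexpanded pattern to awk, which fails to open it.

```shell
#!/usr/bin/env bash
# merge_month DIR YYYYMM OUTFILE
# Hypothetical helper (not from the original post): merges every
# DIR/demo_YYYYMMDD.csv for the given month into OUTFILE.
merge_month() {
    local dir=$1 month=$2 out=$3
    # The two "?" match exactly two characters (the day), so names such as
    # demo_20220301_1.csv do not match and are left out.
    awk '
        FNR == 1 && NR != 1 {next} # skip all headers except the first one
        /^,*$/ {next}              # skip all empty CSV rows
        !seen[$0]++                # print unique records
    ' "$dir"/demo_"$month"??.csv > "$out"
}
```

For the case in the question, this could be called from the directory that holds the files as: merge_month . 202203 demo.20220331_merged.csv. The output name does not match demo_202203??.csv, so rerunning the command will not feed the merged file back into itself.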
