Handling special characters in a bash script

Question Detail

I’m not familiar with bash scripting. Maybe this is a silly question, but I couldn’t find the answer. I’m working on a bash script that mimics the behavior of the command ls -sh but actually uses du -sh to get file and folder sizes, and sorts the output. Pretty much like du -sh * | sort -h, with colors.

#!/usr/bin/bash

if [ "$#" = "0" ]
then
    du -sh *|awk -f /path/to/color-ls.awk|sort -h
else
    du -sh $@|awk -f /path/to/color-ls.awk|sort -h
fi

where color-ls.awk is:

# color-ls.awk
size=$1;
name=$2;
for (i=3; i<=NF; i++)
{
    tmp=(name " " $i);
    name=tmp
}
# filename=($0 ~ /'/)? ("\"" name "\""):("'" name "'")
filename=("'" name "'")
printf $1 " "
cmd=("ls -d " filename " --color")
system(cmd)

an awk script that uses ls --color to color the output of du -sh

My script works fine with most file names, even ones containing spaces, but it has some problems involving special characters that I didn’t know how to fix.

1. When run without arguments:

It interprets the single quotes in any file name that contains them, which causes an error:

sh: 1: Syntax error: Unterminated quoted string

2. When run with arguments:

The same problem as without arguments. And it’s interpreting a file name with spaces as two names.

Example: when used on a folder named VirtualBox VMs, or when given * as an argument in my home directory, here is its output:

du: cannot access 'VirtualBox': No such file or directory
du: cannot access 'VMs': No such file or directory

3. What I want:

I want the script to skip special characters and pass them as they are to du

4. What I tried:

I tried adding double quotes before and after each file name

parse(){
    for arg in $@
    do
        printf "\"$arg\"\n"
    done
}

but it didn’t seem to work. du doesn’t accept quotes appended to the file name.

du: cannot access '"VirtualBox': No such file or directory
du: cannot access 'VMs"': No such file or directory

Also, replacing quotes with \' doesn’t help either. Maybe I’m just doing it wrong.

# du -sh $(printf "file'name\n" |sed "s/'/\\\'/g")
du: cannot access 'file\'\''name': No such file or directory
# ls file\'name 
"file'name"

Same goes for spaces

du: cannot access 'VirtualBox\': No such file or directory
du: cannot access 'VMs': No such file or directory

5. Extra:

I’m trying to make the script work as normal ls -sh would, but with sorted output and more accurate results for folders. However, this script behaves like ls -sh -d when arguments are supplied to it, so lh Desktop shows the size of Desktop instead of the sizes of the individual files and folders inside Desktop. I believe this can be fixed with a loop that checks whether each argument is a file or a folder, executes du -sh accordingly, and then sorts.

#!/usr/bin/bash

if [ "$#" = "0" ]
then
    du -sh *|awk -f /path/to/color-ls.awk|sort -h
else
    for i in $@
    do
        if [[ -d "$i" ]]; then
            du -sh $i/* |awk -f /path/to/color-ls.awk
        else
            du -sh "$i" |awk -f /path/to/color-ls.awk
        fi
    done|sort -h
fi

I’m hoping to find the optimal way to do it.

Thanks in advance.

Question Answer

Please do not post so much in one question. Please one problem per question. One script per question, etc.

Make sure to check your scripts with shellcheck. It will catch your mistakes. See https://mywiki.wooledge.org/Quotes .

  1. When run without arguments:

filename=("'" name "'") inside the awk script is an invalid way to pass anything containing ' quotes to the system() call, so you are getting the unterminated ' error, as expected, because there will be 3 ' characters. Fix the awk script, or better, rewrite it in Bash; there is no need for awk. Maybe rewrite it all in Python or Perl.

Moreover, tmp=(name " " $i); deletes tabs and multiple spaces from filenames. It’s all meant to work with only nice filenames.

The script will break on newlines in filenames anyway.
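The failure can be reproduced without awk. The following is a minimal sketch, assuming a sample name file'name and two hypothetical helper functions (string_built and as_argument are not part of the original script): it pastes the name between single quotes in a command string, the way the awk script does, and then passes the same name as a plain argument.

```shell
#!/usr/bin/env bash
name="file'name"   # sample name with an embedded single quote (assumption)

# Mimic the awk script: wrap the name in single quotes inside a command
# string and let sh parse it. The string then holds three quote characters.
string_built() {
    sh -c "printf '%s\n' '$name'" 2>/dev/null
}

# Pass the name as its own argument; no shell re-parses its contents.
as_argument() {
    printf '%s\n' "$name"
}

if string_built; then
    echo "string-built: ok"
else
    echo "string-built: syntax error"
fi
as_argument
```

The first call fails because sh sees an unterminated quote, while the second prints the name unchanged.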

  2. When run with arguments:

An unquoted $@ undergoes word splitting and filename expansion (topics you should research). Word splitting splits the expanded text into words on whitespace, so a name like VirtualBox VMs becomes two arguments. Use "$@". Quote the expansions.
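The difference is easy to see by counting arguments. This is a small sketch, assuming the default IFS; count_args and demo_splitting are hypothetical helpers used only for illustration.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports how many arguments it got.
count_args() {
    echo "$#"
}

demo_splitting() {
    # Simulate a script called with one argument that contains a space.
    set -- "VirtualBox VMs"
    local unquoted quoted
    unquoted=$(count_args $@)      # word splitting produces two words
    quoted=$(count_args "$@")      # the single argument is preserved
    echo "unquoted: $unquoted, quoted: $quoted"
}

demo_splitting
```

With the unquoted expansion, du would receive VirtualBox and VMs as two separate names, which matches the errors in the question.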

  3. What I want:

You’ll be doing that with "$@"

  4. What I tried:

The variable content is irrelevant. You have to change the way you use the variable, not its content. That is, put quotes around the use of the variable, not inside the content.
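To illustrate, here is a sketch that stores quote characters inside the value, as the parse function in the question does, and compares it with quoting the expansion. show_args is a hypothetical helper that prints each argument in angle brackets; the default IFS is assumed.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper: prints every argument as <arg>.
show_args() {
    printf '<%s>' "$@"
    echo
}

dir="VirtualBox VMs"
wrapped="\"$dir\""     # quotes stored in the value, as in the question

show_args $wrapped     # quotes are plain data: <"VirtualBox><VMs">
show_args "$dir"       # quoting the expansion: <VirtualBox VMs>
```

The first line of output shows the same two broken names that du complained about.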

  5. Extra:

You did not quote the expansion. Use "$i" not $i. It’s "$i"/*. An unquoted $i undergoes word splitting.
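A quick way to see what "$i"/* does differently is to try both forms on a directory whose name contains a space. This is a sketch under a few assumptions: it works in a throwaway directory from mktemp, nullglob and failglob are off, and demo_glob is a hypothetical name.

```shell
#!/usr/bin/env bash
demo_glob() (
    # Runs in a subshell so the cd does not affect the caller.
    demo_dir=$(mktemp -d) || exit 1
    cd "$demo_dir" || exit 1
    mkdir "My Dir"
    touch "My Dir/a file" "My Dir/b"
    i="My Dir"

    # Quoted variable, unquoted glob: the name stays whole, * still expands.
    printf '<%s>' "$i"/*; echo

    # Unquoted variable: split into "My" and "Dir/*", and the glob matches nothing.
    printf '<%s>' $i/*; echo

    cd / && rm -rf "$demo_dir"
)

demo_glob
```

Only the quoted form hands the two files inside the directory to the command.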


Finally, after all that, your script might look like this, with GNU tools:

if (($# == 0)); then
   set -- *
fi
du -hs0 "$@" |
sort -zh |
sed -z 's/\t/\x00/' |
while IFS= read -r -d '' size && IFS= read -r -d '' file; do
   printf "%s " "$size";
   ls -d "$file"
done
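The paired read in that loop can be tried on its own. In the sketch below, printf stands in for the du, sort and sed stages and emits NUL-terminated size and name fields; the sample sizes and names are made up, and pairs is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Read alternating NUL-terminated fields: first a size, then a file name.
pairs() {
    local size file
    while IFS= read -r -d '' size && IFS= read -r -d '' file; do
        printf '%s -> [%s]\n' "$size" "$file"
    done
}

# Stand-in for the du | sort | sed pipeline (sample data, not real sizes).
printf '%s\0' "4.0K" "file'name" "8.0K" "VirtualBox  VMs" | pairs
```

Because NUL cannot appear in a file name, quotes and repeated spaces pass through the loop untouched.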

Also see How can I find and safely handle file names containing newlines, spaces or both? https://mywiki.wooledge.org/BashFAQ/001 .

Also, you can chain any statements:

if stuff; then
   stuff1
else
   stuff2
fi | 
sort -h |
awk -f yourscript

And also don’t repeat yourself – use bash arrays:

args=()
if stuff; then
  args=(*)
else
  args=("$@")
fi
du -hs "${args[@]}" | stuff...

And so that sort has less work to do, I would put it right after du, not after parsing.

Since you didn’t include shopt -s nullglob, it’s likely that Desktop/* didn’t expand to any file which is odd unless there really are no files there, you have enabled nullglob in interactive mode, and du -sh doesn’t actually display the sizes of the files in Desktop.

It’s also likely that you’re calling the script from where Desktop/ doesn’t exist.

You can add a debug statement which prints $PWD. You can also try running the script with bash -x.

In your script I suggest enabling nullglob and then modifying it so du -sh isn’t called if target directory contains no files.

Something like:

set -- "$i"/*; [[ $# -gt 0 ]] && du -sh -- "$@" ...
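As a fuller sketch of that idea, the function below enables nullglob, loads the directory contents into the positional parameters, and only calls du when something matched. The name measure_contents and the "(empty)" message are hypothetical; the body runs in a subshell so nullglob does not leak into the caller.

```shell
#!/usr/bin/env bash
# measure_contents DIR: run du -sh on each entry in DIR, or report if none.
measure_contents() (
    shopt -s nullglob          # an unmatched glob expands to nothing
    set -- "$1"/*              # positional parameters now hold the entries
    if (( $# > 0 )); then
        du -sh -- "$@"
    else
        echo "(empty)"
    fi
)

# Example call, assuming a Desktop directory exists in the current directory:
# measure_contents Desktop | sort -h
```

Note that * does not match hidden files by default, so a directory containing only dotfiles is also reported as empty here.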

Also, $@ should be quoted when being expanded.

for i in "$@"; do

This can be simplified to for i; do, but we will modify the positional parameters inside the loop so we expand "$@" instead.
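The word list of for i in "$@" is expanded once, before the first iteration, so changing the positional parameters inside the loop does not disturb the iteration. A small sketch, with walk as a hypothetical function name:

```shell
#!/usr/bin/env bash
walk() {
    local i
    for i in "$@"; do
        set -- "inner $i"      # overwrite the positional parameters
        printf '<%s>' "$i"     # the loop still visits the original arguments
    done
    echo
}

walk a "b c"
```

Both original arguments are printed even though set -- replaced them during the first pass.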

You can also choose to store the expanded files inside an array as well.
