How do I remove the file suffix and path portion from a path string in Bash?

Question Detail

Given a string file path such as /foo/fizzbuzz.bar, how would I use bash to extract just the fizzbuzz portion of said string?

Question Answer

Here’s how to do it with the # and % operators in Bash.

$ x="/foo/fizzbuzz.bar"
$ y=${x%.bar}
$ echo ${y##*/}
fizzbuzz

${x%.bar} could also be ${x%.*} to remove everything from the last dot onward, or ${x%%.*} to remove everything from the first dot onward.

Example:

$ x="/foo/fizzbuzz.bar.quux"
$ y=${x%.*}
$ echo $y
/foo/fizzbuzz.bar
$ y=${x%%.*}
$ echo $y
/foo/fizzbuzz

Documentation can be found in the Bash manual. Look for the ${parameter%word} and ${parameter%%word} entries, which remove a matching trailing portion of the value.
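
One thing the examples above do not show is what happens when a directory name contains a dot. The sketch below uses a made-up path (/foo.d/... is an assumption, not from the question) to illustrate why it is usually safer to strip the directory first and the suffix second:

```shell
#!/usr/bin/env bash
# Hypothetical path: the directory name itself contains a dot.
x="/foo.d/fizzbuzz.bar.quux"

# Stripping the suffix first with %% cuts at the first dot in the whole string.
wrong=${x%%.*}          # /foo
# Stripping the directory first confines the dot matching to the file name.
file=${x##*/}           # fizzbuzz.bar.quux
short=${file%%.*}       # fizzbuzz
long=${file%.*}         # fizzbuzz.bar

printf '%s\n' "$wrong" "$file" "$short" "$long"
```

Whether you want the short or the long form depends on whether you treat .bar.quux as one compound extension or only .quux as the extension.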

Look at the basename command:

NAME="$(basename /foo/fizzbuzz.bar .bar)"

This instructs basename to remove the suffix .bar, which results in NAME=fizzbuzz.
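
As a rough illustration of how the suffix argument behaves, here is a small sketch. The .txt suffix and the /foo/.bar path are made-up inputs chosen only to show the non-matching cases:

```shell
#!/usr/bin/env bash
# The suffix is removed only when it matches the end of the name.
a=$(basename /foo/fizzbuzz.bar .bar)   # fizzbuzz
b=$(basename /foo/fizzbuzz.bar .txt)   # fizzbuzz.bar (no match, nothing removed)
# A suffix identical to the whole name is left alone.
c=$(basename /foo/.bar .bar)           # .bar

printf '%s\n' "$a" "$b" "$c"
```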

Pure bash, done in two separate operations:

  1. Remove the path from a path-string:

    path=/foo/bar/bim/baz/file.gif
    
    file=${path##*/}  
    #$file is now 'file.gif'
    
  2. Remove the extension from a path-string:

    base=${file%.*}
    #${base} is now 'file'.
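
The two steps can be wrapped in a function. This is only a sketch: strip_name is a hypothetical helper name, and the special handling of names like .bashrc is one possible choice rather than something the answer above prescribes.

```shell
#!/usr/bin/env bash
# strip_name is a hypothetical helper, not a Bash builtin.
strip_name() {
    local file=${1##*/}      # drop everything up to the last slash
    local base=${file%.*}    # drop the last dot and what follows it
    # A leading-dot name such as .bashrc would become empty; keep it whole.
    [[ -z $base ]] && base=$file
    printf '%s\n' "$base"
}

strip_name /foo/bar/bim/baz/file.gif   # file
strip_name /home/user/.bashrc          # .bashrc
strip_name /usr/bin/env                # env
```

Names without any dot pass through unchanged, because ${file%.*} removes nothing when the pattern does not match.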
    

Using basename, I did the following to achieve this:

for file in *; do
    ext=${file##*.}
    # Prepend the dot: ${file##*.} yields the extension without it
    fname=$(basename "$file" ".$ext")

    # Do things with $fname
done

This requires no a priori knowledge of the file extension and works even when the filename has dots in its name (in front of its extension). It does require the basename program, but that is specified by POSIX and is part of GNU coreutils, so it should ship with any distro.

The basename and dirname commands are what you’re after:

mystring=/foo/fizzbuzz.bar
echo basename: $(basename "${mystring}")
echo basename + remove .bar: $(basename "${mystring}" .bar)
echo dirname: $(dirname "${mystring}")

This produces the output:

basename: fizzbuzz.bar
basename + remove .bar: fizzbuzz
dirname: /foo
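
For comparison, here is a sketch of the closest parameter-expansion counterparts, which avoid spawning a process. They are not drop-in replacements; the plain variable is a made-up input that shows one case where the results differ.

```shell
#!/usr/bin/env bash
mystring=/foo/fizzbuzz.bar

dir=${mystring%/*}     # like dirname: remove the last slash and what follows
name=${mystring##*/}   # like basename: remove everything through the last slash

printf 'dirname-like:  %s\n' "$dir"
printf 'basename-like: %s\n' "$name"

# The expansions are not full replacements: with no slash in the input,
# dirname prints "." while the expansion returns the input unchanged.
plain=fizzbuzz.bar
printf 'expansion: %s, dirname: %s\n' "${plain%/*}" "$(dirname "$plain")"
```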

Pure bash way:

~$ x="/foo/bar/fizzbuzz.bar.quux.zoom"; 
~$ y=${x/\/*\//}; 
~$ echo ${y/.*/}; 
fizzbuzz

This functionality is explained in man bash under “Parameter Expansion”. Non-Bash ways abound: awk, perl, sed and so on.

EDIT: Works with dots in file suffixes and doesn’t need to know the suffix (extension), but doesn’t work with dots in the name itself.
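
A further caveat, sketched below: the ${x/pattern/} form is not anchored to the start of the string, so it behaves differently on a relative path. The rel value is a made-up input for illustration.

```shell
#!/usr/bin/env bash
abs="/foo/bar/fizzbuzz.bar.quux.zoom"
rel="foo/bar/fizzbuzz.bar.quux.zoom"    # hypothetical relative path

# Absolute path: /*/ matches from the first slash through the last one.
a=${abs/\/*\//}        # fizzbuzz.bar.quux.zoom
# Relative path: the match starts at the first slash, so "foo" is left behind.
b=${rel/\/*\//}        # foofizzbuzz.bar.quux.zoom
# ##*/ is anchored at the start of the string and handles both cases.
c=${rel##*/}           # fizzbuzz.bar.quux.zoom

printf '%s\n' "$a" "$b" "$c" "${c/.*/}"
```

The last value printed is fizzbuzz, since ${c/.*/} deletes everything from the first dot to the end of the string.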

Using basename assumes that you know what the file extension is, doesn’t it?

And I believe that the various regular expression suggestions don’t cope with a filename containing more than one “.”.

The following seems to cope with double dots. Oh, and with filenames that contain a “/” themselves (just for kicks).

To paraphrase Pascal, “Sorry this script is so long. I didn’t have time to make it shorter.”


  #!/usr/bin/perl
  $fullname = $ARGV[0];
  ($path,$name) = $fullname =~ /^(.*[^\\]\/)*(.*)$/;
  ($basename,$extension) = $name =~ /^(.*)(\.[^.]*)$/;
  print $basename . "\n";
 

In addition to the POSIX-conformant syntax used in other answers,

basename string [suffix]

as in

basename /foo/fizzbuzz.bar .bar

GNU basename supports another syntax:

basename -s .bar /foo/fizzbuzz.bar

with the same result. The difference and advantage is that -s implies -a, which supports multiple arguments:

$ basename -s .bar /foo/fizzbuzz.bar /baz/foobar.bar
fizzbuzz
foobar

This can even be made filename-safe by separating the output with NUL bytes using the -z option, for example for these files containing blanks, newlines and glob characters (quoted by ls):

$ ls has*
'has'$'\n''newline.bar'  'has space.bar'  'has*.bar'

Reading into an array:

$ readarray -d $'\0' arr < <(basename -zs .bar has*)
$ declare -p arr
declare -a arr=([0]=$'has\nnewline' [1]="has space" [2]="has*")

readarray -d requires Bash 4.4 or newer. For older versions, we have to loop:

while IFS= read -r -d '' fname; do arr+=("$fname"); done < <(basename -zs .bar has*)

perl -pe 's/\..*$//;s{^.*/}{}'

If you can’t use basename as suggested in other posts, you can always use sed. Here is an (ugly) example. It isn’t the greatest, but it works by extracting the wanted string and replacing the input with the wanted string.

echo '/foo/fizzbuzz.bar' | sed 's|.*\/\([^\.]*\)\(\..*\)$|\1|g'

Which will get you the output

fizzbuzz

Beware of the suggested perl solution: it removes anything after the first dot.

$ echo some.file.with.dots | perl -pe 's/\..*$//;s{^.*/}{}'
some

If you want to do it with perl, this works:

$ echo some.file.with.dots | perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'
some.file.with

But if you are using Bash, the solutions with y=${x%.*} (or basename "$x" .ext if you know the extension) are much simpler.

The basename command does that: it removes the path. It will also remove the suffix if one is given and it matches the end of the file name, but you need to know the suffix in order to pass it to the command. Otherwise you can use mv and figure out what the new name should be some other way.

Combining the top-rated answer with the second-top-rated answer to get the filename without the full path:

$ x="/foo/fizzbuzz.bar.quux"
$ y=$(basename "${x%%.*}")
$ echo "$y"
fizzbuzz

You can use

mv *<PATTERN>.jar "$(basename *<PATTERN>.jar <PATTERN>.jar).jar"

For example, I wanted to remove -SNAPSHOT from my file name, so I used the command below:

 mv *-SNAPSHOT.jar "$(basename *-SNAPSHOT.jar -SNAPSHOT.jar).jar"
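
The one-line mv form relies on the glob matching exactly one file. When several files may match, a loop with suffix removal is one possible alternative. The sketch below runs in a scratch directory, and the two jar names are made up for the demonstration:

```shell
#!/usr/bin/env bash
# Demo in a scratch directory; the file names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch app-1.0-SNAPSHOT.jar lib-2.3-SNAPSHOT.jar

for f in *-SNAPSHOT.jar; do
    [[ -e $f ]] || continue               # glob matched nothing
    mv -- "$f" "${f%-SNAPSHOT.jar}.jar"   # strip the suffix, re-add .jar
done

ls
```

Here ${f%-SNAPSHOT.jar} does the same job as basename with a suffix argument, but per file and without a subprocess.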
