
How to check if a string contains a substring in Bash

Question Detail

I have a string in Bash:
string="My string"

How can I test if it contains another string?
if [ $string ?? 'foo' ]; then
    echo "It's there!"
fi

Where ?? is my unknown operator. Do I use echo and grep?
if echo "$string" | grep 'foo'; then
    echo "It's there!"
fi

That looks a bit clumsy.

Question Answer

You can use Marcus’s answer (* wildcards) outside a case statement, too, if you use double brackets:

string='My long string'
if [[ $string == *"My long"* ]]; then
    echo "It's there!"
fi

Note that spaces in the needle string need to be placed between double quotes, and the * wildcards should be outside. Also note that a simple comparison operator is used (i.e. ==), not the regex operator =~.
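
The quoting rule matters most when the needle comes from a variable. The following is a minimal sketch, assuming Bash; the helper name contains_literal and the sample strings are made up for illustration. Quoting the variable makes [[ ]] treat it literally, while an unquoted variable is interpreted as a glob pattern.

```shell
#!/usr/bin/env bash
# contains_literal HAYSTACK NEEDLE (hypothetical helper name).
# Succeeds when NEEDLE occurs literally in HAYSTACK.
contains_literal() {
    [[ $1 == *"$2"* ]]
}

needle='a*c'

# Quoted needle: the * is an ordinary character, so 'abc' does not contain it.
contains_literal 'abc' "$needle" && echo "literal: match" || echo "literal: no match"

# Unquoted needle: the * inside $needle acts as a wildcard, so 'abc' matches.
[[ 'abc' == *$needle* ]] && echo "glob: match" || echo "glob: no match"
```

If the needle is user input, the quoted form is usually the safer default.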
……………………………………………………
If you prefer the regex approach:
string='My string'

if [[ $string =~ "My" ]]; then
    echo "It's there!"
fi
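
One detail worth knowing about the regex form: in Bash 3.2 and later, a quoted right-hand side of =~ is matched as a literal string, and only unquoted text is interpreted as a regular expression. A small sketch, with helper names invented for this illustration:

```shell
#!/usr/bin/env bash
# matches_regex and matches_literal are hypothetical helper names.
matches_regex()   { [[ $1 =~ $2 ]]; }     # unquoted: $2 is a regular expression
matches_literal() { [[ $1 =~ "$2" ]]; }   # quoted: $2 is matched character by character

# 'M.' as a regex means "M followed by any character", which matches "My".
matches_regex 'My string' 'M.' && echo "regex: match"

# As a literal, 'M.' requires an actual dot after the M, which is not there.
matches_literal 'My string' 'M.' || echo "literal: no match"
```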

……………………………………………………
I am not sure about using an if statement, but you can get a similar effect with a case statement:

case "$string" in
    *foo*)
        # Do stuff
        ;;
esac

……………………………………………………
stringContain variants (compatible or case independent)

As most of these Stack Overflow answers are about Bash, I’ve posted a case-independent Bash function at the very bottom of this post…

Anyway, there is my

Compatible answer

Since many answers already use Bash-specific features, here is a way that also works under less featureful shells, such as BusyBox:

[ -z "${string##*$reqsubstr*}" ]

In practice, this could give:

string='echo "My string"'
for reqsubstr in 'o "M' 'alt' 'str'; do
    if [ -z "${string##*$reqsubstr*}" ]; then
        echo "String '$string' contain substring: '$reqsubstr'."
    else
        echo "String '$string' don't contain substring: '$reqsubstr'."
    fi
done

This was tested under Bash, Dash, KornShell (ksh) and ash (BusyBox), and the result is always:

String 'echo "My string"' contain substring: 'o "M'.
String 'echo "My string"' don't contain substring: 'alt'.
String 'echo "My string"' contain substring: 'str'.

Into one function

As asked by @EeroAaltonen here is a version of the same demo, tested under the same shells:

myfunc() {
    reqsubstr="$1"
    shift
    string="$*"    # join the remaining arguments with spaces
    if [ -z "${string##*$reqsubstr*}" ]; then
        echo "String '$string' contain substring: '$reqsubstr'."
    else
        echo "String '$string' don't contain substring: '$reqsubstr'."
    fi
}

Then:

$ myfunc 'o "M' 'echo "My String"'
String 'echo "My String"' contain substring: 'o "M'.

$ myfunc 'alt' 'echo "My String"'
String 'echo "My String"' don't contain substring: 'alt'.

Notice: you have to escape or double enclose quotes and/or double quotes:

$ myfunc 'o "M' echo "My String"
String 'echo My String' don't contain substring: 'o "M'.

$ myfunc 'o "M' echo \"My String\"
String 'echo "My String"' contain substring: 'o "M'.

Simple function

This was tested under BusyBox, Dash, and, of course Bash:

stringContain() { [ -z "${2##*$1*}" ]; }

Then now:

$ if stringContain 'o "M3' 'echo "My String"'; then echo yes; else echo no; fi
no
$ if stringContain 'o "M' 'echo "My String"'; then echo yes; else echo no; fi
yes

… Or if the submitted string could be empty, as pointed out by @Sjlver, the function would become:

stringContain() { [ -z "${2##*$1*}" ] && [ -z "$1" -o -n "$2" ]; }

or as suggested by Adrian Günter’s comment, avoiding -o switches:

stringContain() { [ -z "${2##*$1*}" ] && { [ -z "$1" ] || [ -n "$2" ]; }; }

Final (simple) function:

And inverting the tests to make them potentially quicker:

stringContain() { [ -z "$1" ] || { [ -z "${2##*$1*}" ] && [ -n "$2" ]; }; }

With empty strings:

$ if stringContain '' ''; then echo yes; else echo no; fi
yes
$ if stringContain 'o "M' ''; then echo yes; else echo no; fi
no

Case independent (Bash only!)

To test strings without regard to case, simply convert each string to lower case:

stringContain() {
    local _lc=${2,,}
    [ -z "$1" ] || { [ -z "${_lc##*${1,,}*}" ] && [ -n "$2" ]; }
}

Check:

stringContain 'o "M3' 'echo "my string"' && echo yes || echo no
no
stringContain 'o "My' 'echo "my string"' && echo yes || echo no
yes
if stringContain '' ''; then echo yes; else echo no; fi
yes
if stringContain 'o "M' ''; then echo yes; else echo no; fi
no
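
The ${var,,} expansion needs Bash 4 or later. For older Bash versions or other POSIX shells, a comparable sketch can lower-case both strings with tr before comparing. The name stringContainNoCase is made up here; the argument order (needle first) follows stringContain. Unlike stringContain, this version quotes the needle, so it is matched literally rather than as a pattern.

```shell
# stringContainNoCase NEEDLE HAYSTACK (hypothetical name, not from the answer).
# Lower-cases both arguments with tr, then does a literal substring test.
stringContainNoCase() {
    _needle=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    _hay=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case $_hay in
        *"$_needle"*) return 0 ;;   # needle found (an empty needle always matches)
        *)            return 1 ;;
    esac
}

if stringContainNoCase 'o "My' 'echo "my string"'; then echo yes; else echo no; fi
```

The price is two extra processes per call, so it is better kept out of tight loops.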

……………………………………………………
You should remember that shell scripting is less of a language and more of a collection of commands. Instinctively you think that this “language” requires you to follow an if with a [ or a [[. Both of those are just commands that return an exit status indicating success or failure (just like every other command). For that reason I’d use grep, and not the [ command.

Just do:

if grep -q foo <<<"$string"; then
    echo "It's there"
fi

Now that you are thinking of if as testing the exit status of the command that follows it (complete with semi-colon), why not reconsider the source of the string you are testing?

## Instead of this
filetype="$(file -b "$1")"
if grep -q "tar archive" <<<"$filetype"; then
#...

## Simply do this
if file -b "$1" | grep -q "tar archive"; then
#...

The -q option makes grep not output anything, as we only want the return code. <<< makes the shell expand the next word and use it as the input to the command, a one-line version of the << here document (I'm not sure whether this is standard or a Bashism).
……………………………………………………
The accepted answer is best, but since there's more than one way to do it, here's another solution:

if [ "$string" != "${string/foo/}" ]; then
    echo "It's there!"
fi

${var/search/replace} is $var with the first instance of search replaced by replace, if it is found (it doesn't change $var). If you try to replace foo by nothing, and the string has changed, then obviously foo was found.
……………………………………………………
So there are lots of useful solutions to the question - but which is fastest / uses the fewest resources?

Repeated tests using this frame:

/usr/bin/time bash -c 'a=two;b=onetwothree; x=100000; while [ $x -gt 0 ]; do TEST ; x=$(($x-1)); done'

Replacing TEST each time:

[[ $b =~ $a ]]           2.92 user 0.06 system 0:02.99 elapsed 99% CPU

[ "${b/$a//}" = "$b" ]   3.16 user 0.07 system 0:03.25 elapsed 99% CPU

[[ $b == *$a* ]]         1.85 user 0.04 system 0:01.90 elapsed 99% CPU

case $b in *$a):;;esac   1.80 user 0.02 system 0:01.83 elapsed 99% CPU

doContain $a $b          4.27 user 0.11 system 0:04.41 elapsed 99% CPU

(doContain was in F. Houri's answer)

And for giggles:

echo $b|grep -q $a       12.68 user 30.86 system 3:42.40 elapsed 19% CPU !ouch!

So the simple substitution option predictably wins whether in an extended test or a case. The case is portable.

Piping out to 100000 greps is predictably painful! The old rule about using external utilities without need holds true.
……………………………………………………
Bash 4+ examples. Note: not using quotes will cause issues when words contain spaces, etc. Always quote in Bash, IMO.

Here are some examples Bash 4+:

Example 1, check for 'yes' in string (case insensitive):

if [[ "${str,,}" == *"yes"* ]] ;then

Example 2, check for 'yes' in string (case insensitive):

if [[ "$(echo "$str" | tr '[:upper:]' '[:lower:]')" == *"yes"* ]] ;then

Example 3, check for 'yes' in string (case sensitive):

if [[ "${str}" == *"yes"* ]] ;then

Example 4, check for 'yes' in string (case sensitive):

if [[ "${str}" =~ "yes" ]] ;then

Example 5, exact match (case sensitive):

if [[ "${str}" == "yes" ]] ;then

Example 6, exact match (case insensitive):

if [[ "${str,,}" == "yes" ]] ;then

Example 7, exact match:

if [ "$a" = "$b" ] ;then

Example 8, wildcard match .ext (case insensitive):

if echo "$a" | egrep -iq "\.(mp[3-4]|txt|css|jpg|png)" ; then

Example 9, use grep on a string case sensitive:

if echo "SomeString" | grep -q "String"; then

Example 10, use grep on a string case insensitive:

if echo "SomeString" | grep -iq "string"; then

Example 11, use grep on a string case insensitive w/ wildcard:

if echo "SomeString" | grep -iq "Some.*ing"; then

Enjoy.
……………………………………………………
This also works:

if printf -- '%s' "$haystack" | egrep -q -- "$needle"
then
    printf "Found needle in haystack"
fi

And the negative test is:

if ! printf -- '%s' "$haystack" | egrep -q -- "$needle"
then
    echo "Did not find needle in haystack"
fi

I suppose this style is a bit more classic -- less dependent upon features of Bash shell.

The -- argument is pure POSIX paranoia, used to protect against input strings similar to options, such as --abc or -a.

Note: In a tight loop this code will be much slower than using internal Bash shell features, as one (or two) separate processes will be created and connected via pipes.
……………………………………………………
As Paul mentioned in his performance comparison:

if echo "abcdefg" | grep -q "bcdef"; then
    echo "String contains is true."
else
    echo "String contains is not true."
fi

This is POSIX compliant like the 'case "$string" in' answer provided by Marcus, but it is slightly easier to read than the case statement answer. Also note that this will be much much slower than using a case statement. As Paul pointed out, don't use it in a loop.
……………………………………………………
How about this:

text=" bmnmn
if [[ "$text" =~ "" ]]; then
    echo "matched"
else
    echo "not matched"
fi

……………………………………………………
[[ $string == *foo* ]] && echo "It's there" || echo "Couldn't find"

……………………………………………………
This Stack Overflow answer was the only one to trap space and dash characters:

# For null cmd arguments checking
to_check=' -t'
space_n_dash_chars=' -'
[[ $to_check == *"$space_n_dash_chars"* ]] && echo found

……………………………………………………
The accepted answer is correct, but it is hard to read and understand.
For problems related to searching, you should always use the $needle in a $haystack idiom.
Since its suggested edit queue is full, I am posting this:
haystack='There are needles here.'
if [[ "$haystack" == *"needle"* ]]; then
    echo "It's there!"
fi

……………………………………………………
One is:

[ "$(expr "$mystring" : ".*${search}.*")" -ne 0 ] && echo 'yes' || echo 'no'

……………………………………………………
Since the POSIX/BusyBox question is closed without providing the right answer (IMHO), I’ll post an answer here.

The shortest possible answer is:

[ ${_string_##*$_substring_*} ] || echo Substring found!

or

[ "${_string_##*$_substring_*}" ] || echo 'Substring found!'

Note that the double hash is obligatory with some shells (ash). The above evaluates [ stringvalue ] when the substring is not found, which succeeds. When the substring is found, the expansion is empty and it evaluates [ ], which returns exit status 1 because the whole string has been removed (due to the *).
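
To make that behavior concrete, here is a small POSIX sh sketch. The wrapper name has_substring is an assumption made for illustration and is not part of the answer. As with the one-liners, an empty string is reported as containing anything, and the substring is treated as a pattern.

```shell
# has_substring STRING SUBSTRING (hypothetical wrapper around the test above).
has_substring() {
    # The expansion is empty when SUBSTRING occurs in STRING, so [ ] fails;
    # the leading ! turns that failure into success ("found").
    ! [ "${1##*$2*}" ]
}

has_substring 'My string' 'str' && echo 'Substring found!'
has_substring 'My string' 'alt' || echo 'Substring not found'
```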

The shortest more common syntax:

[ -z "${_string_##*$_substring_*}" ] && echo 'Substring found!'

or

[ -n "${_string_##*$_substring_*}" ] || echo 'Substring found!'

Another one:

[ "${_string_##*$_substring_*}" != "$_string_" ] && echo 'Substring found!'

or

[ "${_string_##*$_substring_*}" = "$_string_" ] || echo 'Substring found!'

Note the single equal sign!
……………………………………………………
My .bash_profile file and how I used grep:

If the PATH environment variable already includes my two bin directories, don’t append them:

# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

U=~/.local.bin:~/bin

if ! echo "$PATH" | grep -q "home"; then
    export PATH=$PATH:${U}
fi
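
Searching PATH for the word "home" also matches any unrelated entry under a home directory. A stricter sketch checks each directory as a whole colon-delimited entry. The helper name path_contains and the two directory names are assumptions for illustration, not taken from the profile above.

```shell
# path_contains DIR (hypothetical helper): succeeds when DIR is one of the
# colon-separated entries of PATH. Wrapping both sides in colons prevents
# partial matches such as /usr matching /usr/bin.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed directory names; adjust to the directories you actually use.
for dir in "$HOME/.local/bin" "$HOME/bin"; do
    path_contains "$dir" || PATH="$PATH:$dir"
done
export PATH
```

Because it uses only case, this works in any POSIX shell and starts no external process.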

……………………………………………………
Extension of the question answered in "How do you tell if a string contains another string in POSIX sh?":

This solution works with special characters:

# contains(string, substring)
#
# Returns 0 if the specified string contains the specified substring,
# otherwise returns 1.
contains() {
    string="$1"
    substring="$2"

    if echo "$string" | $(type -p ggrep grep | head -1) -F -- "$substring" >/dev/null; then
        return 0 # $substring is in $string
    else
        return 1 # $substring is not in $string
    fi
}

contains "abcd" "e" || echo "abcd does not contain e"
contains "abcd" "ab" && echo "abcd contains ab"
contains "abcd" "bc" && echo "abcd contains bc"
contains "abcd" "cd" && echo "abcd contains cd"
contains "abcd" "abcd" && echo "abcd contains abcd"
contains "" "" && echo "empty string contains empty string"
contains "a" "" && echo "a contains empty string"
contains "" "a" || echo "empty string does not contain a"
contains "abcd efgh" "cd ef" && echo "abcd efgh contains cd ef"
contains "abcd efgh" " " && echo "abcd efgh contains a space"

contains "abcd [efg] hij" "[efg]" && echo "abcd [efg] hij contains [efg]"
contains "abcd [efg] hij" "[effg]" || echo "abcd [efg] hij does not contain [effg]"

contains "abcd *efg* hij" "*efg*" && echo "abcd *efg* hij contains *efg*"
contains "abcd *efg* hij" "d *efg* h" && echo "abcd *efg* hij contains d *efg* h"
contains "abcd *efg* hij" "*effg*" || echo "abcd *efg* hij does not contain *effg*"

……………………………………………………
grep -q is useful for this purpose.

The same using awk:

string="unix-bash 2389"
character="@"
printf '%s' "$string" | awk -v c="$character" '{ if (gsub(c, "")) { print "Found" } else { print "Not Found" } }'

Output:

Not Found

string="unix-bash 2389"
character="-"
printf '%s' "$string" | awk -v c="$character" '{ if (gsub(c, "")) { print "Found" } else { print "Not Found" } }'

Output:

Found

Original source: http://unstableme.blogspot.com/2008/06/bash-search-letter-in-string-awk.html
……………………………………………………
I like sed.

substr="foo"
nonsub="$(echo "$string" | sed "s/$substr//")"
hassub=0 ; [ "$string" != "$nonsub" ] && hassub=1

Edit, Logic:

Use sed to remove instance of substring from string
If new string differs from old string, substring exists

……………………………………………………
I find I need this functionality quite frequently, so I use a home-made shell function in my .bashrc, which lets me reuse it as often as I need to under an easy-to-remember name:

function stringinstring()
{
    case "$2" in
        *"$1"*)
            return 0
            ;;
    esac
    return 1
}

To test if $string1 (say, abc) is contained in $string2 (say, 123abcABC), I just need to run stringinstring "$string1" "$string2" and check the return value, for example:

stringinstring "$str1" "$str2" && echo YES || echo NO

……………………………………………………
A generic needle/haystack example using variables follows:

#!/bin/bash

needle="a_needle"
haystack="a_needle another_needle a_third_needle"
if [[ $haystack == *"$needle"* ]]; then
    echo "needle found"
else
    echo "needle NOT found"
fi

……………………………………………………
case $string in (*foo*)
# Do stuff
esac

This is the same answer as https://stackoverflow.com/a/229585/11267590, but in a simpler style that is also POSIX compliant.
……………………………………………………
Exact word match:

string='My long string'
exactSearch='long'

if grep -E -q "\b${exactSearch}\b" <<<"${string}" >/dev/null 2>&1
then
    echo "It's there"
fi

……………………………………………………
Try oobash.

It is an OO-style string library for Bash 4. It has support for German umlauts. It is written in Bash.

Many functions are available: -base64Decode, -base64Encode, -capitalize, -center, -charAt, -concat, -contains, -count, -endsWith, -equals, -equalsIgnoreCase, -reverse, -hashCode, -indexOf, -isAlnum, -isAlpha, -isAscii, -isDigit, -isEmpty, -isHexDigit, -isLowerCase, -isSpace, -isPrintable, -isUpperCase, -isVisible, -lastIndexOf, -length, -matches, -replaceAll, -replaceFirst, -startsWith, -substring, -swapCase, -toLowerCase, -toString, -toUpperCase, -trim, and -zfill.

Look at the contains example:

[Desktop]$ String a testXccc
[Desktop]$ a.contains tX
true
[Desktop]$ a.contains XtX
false

oobash is available at Sourceforge.net.
……………………………………………………
I use this function (one dependency not included but obvious). It passes the tests shown below. If the function returns a value > 0 then the string was found. You could just as easily return 1 or 0 instead.

function str_instr {
    # Return position of `str` within `string`.
    # >>> str_instr "str" "string"
    # str: String to search for.
    # string: String to search.
    typeset str string x
    # Behavior here is not the same in bash vs ksh unless we escape special characters.
    str="$(str_escape_special_characters "${1}")"
    string="${2}"
    x="${string%%$str*}"
    if [[ "${x}" != "${string}" ]]; then
        echo "${#x} + 1" | bc -l
    else
        echo 0
    fi
}

function test_str_instr {
    str_instr "(" "'[email protected] (dev,web)'" | assert_eq 11
    str_instr ")" "'[email protected] (dev,web)'" | assert_eq 19
    str_instr "[" "'[email protected] [dev,web]'" | assert_eq 11
    str_instr "]" "'[email protected] [dev,web]'" | assert_eq 19
    str_instr "a" "abc" | assert_eq 1
    str_instr "z" "abc" | assert_eq 0
    str_instr "Eggs" "Green Eggs And Ham" | assert_eq 7
    str_instr "a" "" | assert_eq 0
    str_instr "" "" | assert_eq 0
    str_instr " " "Green Eggs" | assert_eq 6
    str_instr " " " Green " | assert_eq 1
}
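
Since str_escape_special_characters is not shown, here is a self-contained sketch of the same idea that sidesteps escaping by quoting the needle inside the expansion, which makes the shell treat it literally. The name str_index is made up for this illustration, and it uses shell arithmetic instead of bc.

```shell
# str_index NEEDLE HAYSTACK (hypothetical name): prints the 1-based position
# of the first occurrence of NEEDLE in HAYSTACK, or 0 when it is absent.
str_index() {
    _prefix=${2%%"$1"*}            # text before the first literal match
    if [ "$_prefix" != "$2" ]; then
        echo $(( ${#_prefix} + 1 ))
    else
        echo 0
    fi
}

str_index "Eggs" "Green Eggs And Ham"
str_index "z" "abc"
```

The first call prints 7 and the second prints 0, in line with the corresponding test cases above.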

……………………………………………………
You can use a logical && to be more compact:
#!/bin/bash

# NO MATCH EXAMPLE
string="test"
[[ "$string" == *"foo"* ]] && {
    echo "YES"
}

# MATCH EXAMPLE
string="tefoost"
[[ "$string" == *"foo"* ]] && {
    echo "YES"
}

……………………………………………………
msg="message"

function check {
    echo "$msg" | egrep '[abc]' 1> /dev/null

    if [ $? -ne 1 ];
    then
        echo "found"
    else
        echo "not found"
    fi
}

check

This will find any occurrence of a or b or c.
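
The same character-class test can be done without starting grep, using a bracket expression in a case pattern. This is a sketch with a hypothetical helper name, contains_any_of. It assumes the character list contains no characters that are special inside brackets, such as ], ^, ! or -.

```shell
# contains_any_of STRING CHARS (hypothetical helper): succeeds when STRING
# contains at least one of the characters listed in CHARS.
contains_any_of() {
    case $1 in
        *[$2]*) return 0 ;;
        *)      return 1 ;;
    esac
}

msg="message"
if contains_any_of "$msg" "abc"; then
    echo "found"
else
    echo "not found"
fi
```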
……………………………………………………
With jq:
string='My long string'
echo $string | jq -Rr 'select(contains("long"))|"It is there"'

The hardest thing in jq is to print the single quote:
echo $string | jq --arg quote "'" -Rr 'select(contains("long"))|"It\($quote)s there"'

Using jq just to check the condition:
if jq -Re 'select(contains("long"))|halt' <<< $string; then
    echo "It's there!"
fi
