How to check if a string contains a substring in Bash
Question Detail
I have a string in Bash:
string="My string"
How can I test if it contains another string?
if [ $string ?? 'foo' ]; then
    echo "It's there!"
fi
Where ?? is my unknown operator. Do I use echo and grep?
if echo "$string" | grep 'foo'; then
    echo "It's there!"
fi
That looks a bit clumsy.
Question Answer
You can use Marcus’s answer (* wildcards) outside a case statement, too, if you use double brackets:
string='My long string'
if [[ $string == *"My long"* ]]; then
    echo "It's there!"
fi
Note that spaces in the needle string need to be placed between double quotes, and the * wildcards should be outside. Also note that a simple comparison operator is used (i.e. ==), not the regex operator =~.
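The quoting rule above matters most when the needle comes from a variable. Here is a minimal Bash sketch, assuming a hypothetical helper name contains_substring (it is not part of any answer here): the quoted "$needle" is matched literally, and the unquoted * on either side are the only wildcards.

```shell
#!/usr/bin/env bash
# contains_substring HAYSTACK NEEDLE (hypothetical helper name; Bash only).
contains_substring() {
    local haystack=$1 needle=$2
    # "$needle" is quoted, so glob characters inside it are literal;
    # the surrounding unquoted * match any prefix and suffix.
    [[ $haystack == *"$needle"* ]]
}

contains_substring 'My long string' 'My long' && echo "It's there!"
# Quoted, the ? is a literal question mark, so this does not match "long".
contains_substring 'My long string' 'l?ng' || echo "Not there."
```

Without the quotes around $needle, the second call would succeed, because ? would match any single character.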
……………………………………………………
If you prefer the regex approach:
string='My string'
if [[ $string =~ "My" ]]; then
    echo "It's there!"
fi
……………………………………………………
I am not sure about using an if statement, but you can get a similar effect with a case statement:
case "$string" in
    *foo*)
        # Do stuff
        ;;
esac
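One thing case offers over a single test is several patterns and a fallback branch in one construct. A small POSIX sh sketch, assuming a hypothetical function name describe and the example needles foo and bar:

```shell
# describe STRING: report whether STRING contains foo or bar.
# (describe is a hypothetical name used only for this illustration.)
describe() {
    case $1 in
        *foo*|*bar*) echo "contains foo or bar" ;;   # either pattern matches
        *)           echo "contains neither" ;;      # fallback branch
    esac
}

describe 'My food'      # prints: contains foo or bar
describe 'My string'    # prints: contains neither
```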
……………………………………………………
stringContain variants (compatible or case independent)
Since most of the answers here are Bash-specific, I've posted a case-independent Bash function at the very bottom of this post…
Anyway, here is my
Compatible answer
As there are already a lot of answers using Bash-specific features, here is a way that also works under less featureful shells, like BusyBox:
[ -z "${string##*$reqsubstr*}" ]
In practice, this could give:
string='echo "My string"'
for reqsubstr in 'o "M' 'alt' 'str'; do
    if [ -z "${string##*$reqsubstr*}" ]; then
        echo "String '$string' contain substring: '$reqsubstr'."
    else
        echo "String '$string' don't contain substring: '$reqsubstr'."
    fi
done
This was tested under Bash, Dash, KornShell (ksh) and ash (BusyBox), and the result is always:
String 'echo "My string"' contain substring: 'o "M'.
String 'echo "My string"' don't contain substring: 'alt'.
String 'echo "My string"' contain substring: 'str'.
Into one function
As asked by @EeroAaltonen here is a version of the same demo, tested under the same shells:
myfunc() {
    reqsubstr="$1"
    shift
    # Join all remaining arguments into one string, separated by spaces.
    string="$*"
    if [ -z "${string##*$reqsubstr*}" ]; then
        echo "String '$string' contain substring: '$reqsubstr'."
    else
        echo "String '$string' don't contain substring: '$reqsubstr'."
    fi
}
Then:
$ myfunc 'o "M' 'echo "My String"'
String 'echo "My String"' contain substring: 'o "M'.
$ myfunc 'alt' 'echo "My String"'
String 'echo "My String"' don't contain substring: 'alt'.
Note: you have to escape or doubly enclose the single and/or double quotes:
$ myfunc 'o "M' echo "My String"
String 'echo My String' don't contain substring: 'o "M'.
$ myfunc 'o "M' echo \"My String\"
String 'echo "My String"' contain substring: 'o "M'.
Simple function
This was tested under BusyBox, Dash, and, of course Bash:
stringContain() { [ -z "${2##*$1*}" ]; }
Then:
$ if stringContain 'o "M3' 'echo "My String"';then echo yes;else echo no;fi
no
$ if stringContain 'o "M' 'echo "My String"';then echo yes;else echo no;fi
yes
… Or if the submitted string could be empty, as pointed out by @Sjlver, the function would become:
stringContain() { [ -z "${2##*$1*}" ] && [ -z "$1" -o -n "$2" ]; }
or as suggested by Adrian Günter's comment, avoiding -o switches:
stringContain() { [ -z "${2##*$1*}" ] && { [ -z "$1" ] || [ -n "$2" ];};}
Final (simple) function:
And inverting the tests to make them potentially quicker:
stringContain() { [ -z "$1" ] || { [ -z "${2##*$1*}" ] && [ -n "$2" ];};}
With empty strings:
$ if stringContain '' ''; then echo yes; else echo no; fi
yes
$ if stringContain 'o "M' ''; then echo yes; else echo no; fi
no
Case independent (Bash only!)
For testing strings without care of case, simply convert each string to lower case:
stringContain() {
    local _lc=${2,,}
    [ -z "$1" ] || { [ -z "${_lc##*${1,,}*}" ] && [ -n "$2" ] ;} ;}
Check:
stringContain 'o "M3' 'echo "my string"' && echo yes || echo no
no
stringContain 'o "My' 'echo "my string"' && echo yes || echo no
yes
if stringContain '' ''; then echo yes; else echo no; fi
yes
if stringContain 'o "M' ''; then echo yes; else echo no; fi
no
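The stringContain functions above expand the needle unquoted inside the pattern, so glob characters in the needle (*, ?, [) act as wildcards. Below is a sketch of a literal-match variant, assuming a hypothetical name string_contains_literal and the same argument order (needle first). It relies only on POSIX parameter expansion, where quoting inside the pattern makes the quoted part literal.

```shell
# string_contains_literal NEEDLE HAYSTACK (hypothetical name).
string_contains_literal() {
    # An empty needle is contained in every string.
    [ -z "$1" ] && return 0
    # "$1" is quoted inside the pattern, so its glob characters are literal.
    # If the needle occurs, ## strips the whole haystack and the result is empty.
    [ -n "$2" ] && [ -z "${2##*"$1"*}" ]
}

if string_contains_literal '[efg]' 'abcd [efg] hij'; then echo yes; else echo no; fi   # yes
if string_contains_literal '*' 'abcd'; then echo yes; else echo no; fi                 # no
```

With the unquoted form, the second call would report yes, since a bare * matches any haystack.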
……………………………………………………
You should remember that shell scripting is less of a language and more of a collection of commands. Instinctively you think that this “language” requires you to follow an if with a [ or a [[. Both of those are just commands that return an exit status indicating success or failure (just like every other command). For that reason I’d use grep, and not the [ command.
Just do:
if grep -q foo <<<"$string"; then
echo "It's there"
fi
Now that you are thinking of if as testing the exit status of the command that follows it (complete with semi-colon), why not reconsider the source of the string you are testing?
## Instead of this
filetype="$(file -b "$1")"
if grep -q "tar archive" <<<"$filetype"; then
#...
## Simply do this
if file -b "$1" | grep -q "tar archive"; then
#...
The -q option makes grep not output anything, as we only want the return code. <<< makes the shell expand the next word and use it as the input to the command, a one-line version of the << here document. Here-strings are not part of POSIX sh; Bash, ksh93, and zsh support them.
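One caveat with grep is that it treats the needle as a regular expression, so a needle such as 3.5 also matches 3x5. For a plain substring test, a sketch along these lines may be closer to what is wanted. It assumes a hypothetical helper name has_fixed_substring and feeds grep through a pipe from printf, which also works in shells that lack <<<.

```shell
# has_fixed_substring HAYSTACK NEEDLE (hypothetical helper name).
has_fixed_substring() {
    # -q: print nothing, only set the exit status.
    # -F: treat the needle as a fixed string, not a regular expression.
    # --: end of options, so a needle starting with "-" is not taken as an option.
    printf '%s\n' "$1" | grep -qF -- "$2"
}

if has_fixed_substring 'price: 3.50' '3.5'; then echo "It's there"; fi
if ! has_fixed_substring 'price: 3x50' '3.5'; then echo "Not there"; fi
```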
............................................................
The accepted answer is best, but since there's more than one way to do it, here's another solution:
if [ "$string" != "${string/foo/}" ]; then
echo "It's there!"
fi
${var/search/replace} is $var with the first instance of search replaced by replace, if it is found (it doesn't change $var). If you try to replace foo by nothing, and the string has changed, then obviously foo was found.
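When the needle is held in a variable, the same replace-and-compare idea can be wrapped in a function. This is a Bash-only sketch with a hypothetical name, contains_by_replace. The needle is quoted inside the expansion so that it is matched literally rather than as a glob pattern.

```shell
#!/usr/bin/env bash
# contains_by_replace HAYSTACK NEEDLE (hypothetical helper name; Bash only).
contains_by_replace() {
    local haystack=$1 needle=$2
    # Deleting the first occurrence of the needle changes the string
    # only if the needle was present. An empty needle is rejected here.
    [ -n "$needle" ] && [ "$haystack" != "${haystack/"$needle"/}" ]
}

contains_by_replace 'My long string' 'long' && echo "It's there!"
contains_by_replace 'My long string' 'l*g' || echo "Not there."
```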
............................................................
So there are lots of useful solutions to the question - but which is fastest / uses the fewest resources?
Repeated tests using this frame:
/usr/bin/time bash -c 'a=two;b=onetwothree; x=100000; while [ $x -gt 0 ]; do TEST ; x=$(($x-1)); done'
Replacing TEST each time:
[[ $b =~ $a ]]            2.92 user 0.06 system 0:02.99 elapsed 99% CPU
[ "${b/$a//}" = "$b" ]    3.16 user 0.07 system 0:03.25 elapsed 99% CPU
[[ $b == *$a* ]]          1.85 user 0.04 system 0:01.90 elapsed 99% CPU
case $b in *$a):;;esac    1.80 user 0.02 system 0:01.83 elapsed 99% CPU
doContain $a $b           4.27 user 0.11 system 0:04.41 elapsed 99% CPU
(doContain was in F. Houri's answer)
And for giggles:
echo $b|grep -q $a 12.68 user 30.86 system 3:42.40 elapsed 19% CPU !ouch!
So the simple pattern-matching option predictably wins, whether in an extended test or a case. The case is portable.
Piping out to 100000 greps is predictably painful! The old rule about using external utilities without need holds true.
............................................................
Bash 4+ examples. Note: not using quotes will cause issues when words contain spaces, etc. Always quote in Bash, IMO.
Here are some Bash 4+ examples:
Example 1, check for 'yes' in string (case insensitive):
if [[ "${str,,}" == *"yes"* ]] ;then
Example 2, check for 'yes' in string (case insensitive):
if [[ "$(echo "$str" | tr '[:upper:]' '[:lower:]')" == *"yes"* ]] ;then
Example 3, check for 'yes' in string (case sensitive):
if [[ "${str}" == *"yes"* ]] ;then
Example 4, check for 'yes' in string (case sensitive):
if [[ "${str}" =~ "yes" ]] ;then
Example 5, exact match (case sensitive):
if [[ "${str}" == "yes" ]] ;then
Example 6, exact match (case insensitive):
if [[ "${str,,}" == "yes" ]] ;then
Example 7, exact match:
if [ "$a" = "$b" ] ;then
Example 8, wildcard match .ext (case insensitive):
if echo "$a" | grep -Eiq "\.(mp[3-4]|txt|css|jpg|png)" ; then
Example 9, use grep on a string case sensitive:
if echo "SomeString" | grep -q "String"; then
Example 10, use grep on a string case insensitive:
if echo "SomeString" | grep -iq "string"; then
Example 11, use grep on a string case insensitive w/ wildcard:
if echo "SomeString" | grep -iq "Some.*ing"; then
Enjoy.
............................................................
This also works:
if printf -- '%s' "$haystack" | grep -Eq -- "$needle"
then
    printf 'Found needle in haystack\n'
fi
And the negative test is:
if ! printf -- '%s' "$haystack" | grep -Eq -- "$needle"
then
    echo "Did not find needle in haystack"
fi
I suppose this style is a bit more classic -- less dependent upon features of Bash shell.
The -- argument is pure POSIX paranoia, used to protect against input strings that look like options, such as --abc or -a.
Note: In a tight loop this code will be much slower than using internal Bash shell features, as one (or two) separate processes will be created and connected via pipes.
............................................................
As Paul mentioned in his performance comparison:
if echo "abcdefg" | grep -q "bcdef"; then
echo "String contains is true."
else
echo "String contains is not true."
fi
This is POSIX compliant, like the 'case "$string" in' answer provided by Marcus, and it is slightly easier to read than the case statement. Note, however, that it is much slower than a case statement. As Paul pointed out, don't use it in a loop.
............................................................
How about this:
text="
if [[ "$text" =~ "
    echo "matched"
else
    echo "not matched"
fi
……………………………………………………
[[ $string == *foo* ]] && echo "It's there" || echo "Couldn't find"
……………………………………………………
This Stack Overflow answer was the only one to trap space and dash characters:
# For null cmd arguments checking
to_check=' -t'
space_n_dash_chars=' -'
[[ $to_check == *"$space_n_dash_chars"* ]] && echo found
……………………………………………………
The accepted answer is correct, but it is hard to read and understand.
For problems related to searching, you should always use the $needle in a $haystack idiom.
Since the accepted answer's suggested edit queue is full, I am posting this:
haystack='There are needles here.'
if [[ "$haystack" == *"needle"* ]]; then
    echo "It's there!"
fi
……………………………………………………
One is:
[ "$(expr "$mystring" : ".*${search}.*")" -ne 0 ] && echo 'yes' || echo 'no'
……………………………………………………
Since the POSIX/BusyBox question is closed without providing the right answer (IMHO), I’ll post an answer here.
The shortest possible answer is:
[ ${_string_##*$_substring_*} ] || echo Substring found!
or
[ "${_string_##*$_substring_*}" ] || echo 'Substring found!'
Note that the double hash is obligatory with some shells (ash). The above evaluates [ stringvalue ] when the substring is not found, which succeeds. When the substring is found, the whole string is stripped (due to the * on both sides), the result is empty, and it evaluates [ ], which returns exit status 1.
The shortest more common syntax:
[ -z "${_string_##*$_substring_*}" ] && echo 'Substring found!'
or
[ -n "${_string_##*$_substring_*}" ] || echo 'Substring found!'
Another one:
[ "${_string_##*$_substring_*}" != "$_string_" ] && echo 'Substring found!'
or
[ "${_string_##*$_substring_*}" = "$_string_" ] || echo 'Substring found!'
Note the single equal sign!
……………………………………………………
My .bash_profile file and how I used grep:
If the PATH environment variable includes my two bin directories, don’t append them,
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
U=~/.local.bin:~/bin
if ! echo "$PATH" | grep -q "home"; then
export PATH=$PATH:${U}
fi
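Matching on the word "home" succeeds for any PATH entry under a home directory, so the directories may never be appended. A stricter check compares whole PATH entries. The sketch below assumes a hypothetical helper name path_contains, and the two directory names are placeholders, not the ones from the profile above.

```shell
# path_contains DIR: succeeds when DIR is a complete entry of $PATH.
# (path_contains is a hypothetical helper name.)
path_contains() {
    # Wrapping both sides in colons matches whole entries only,
    # so /usr/bin does not match /usr/bin/extra.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Placeholder directories, assumed for illustration.
for dir in "$HOME/.local/bin" "$HOME/bin"; do
    path_contains "$dir" || PATH=$PATH:$dir
done
export PATH
```

Running the loop twice leaves PATH unchanged the second time, which is the point of the check.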
……………………………………………………
Extension of the question answered here How do you tell if a string contains another string in POSIX sh?:
This solution works with special characters:
# contains(string, substring)
#
# Returns 0 if the specified string contains the specified substring,
# otherwise returns 1.
contains() {
    string="$1"
    substring="$2"
    if echo "$string" | $(type -p ggrep grep | head -1) -F -- "$substring" >/dev/null; then
        return 0 # $substring is in $string
    else
        return 1 # $substring is not in $string
    fi
}
contains "abcd" "e" || echo "abcd does not contain e"
contains "abcd" "ab" && echo "abcd contains ab"
contains "abcd" "bc" && echo "abcd contains bc"
contains "abcd" "cd" && echo "abcd contains cd"
contains "abcd" "abcd" && echo "abcd contains abcd"
contains "" "" && echo "empty string contains empty string"
contains "a" "" && echo "a contains empty string"
contains "" "a" || echo "empty string does not contain a"
contains "abcd efgh" "cd ef" && echo "abcd efgh contains cd ef"
contains "abcd efgh" " " && echo "abcd efgh contains a space"
contains "abcd [efg] hij" "[efg]" && echo "abcd [efg] hij contains [efg]"
contains "abcd [efg] hij" "[effg]" || echo "abcd [efg] hij does not contain [effg]"
contains "abcd *efg* hij" "*efg*" && echo "abcd *efg* hij contains *efg*"
contains "abcd *efg* hij" "d *efg* h" && echo "abcd *efg* hij contains d *efg* h"
contains "abcd *efg* hij" "*effg*" || echo "abcd *efg* hij does not contain *effg*"
……………………………………………………
grep -q is useful for this purpose.
The same using awk:
string="unix-bash 2389"
character="@"
printf '%s' "$string" | awk -v c="$character" '{ if (gsub(c, "")) { print "Found" } else { print "Not Found" } }'
Output:
Not Found
string="unix-bash 2389"
character="-"
printf '%s' "$string" | awk -v c="$character" '{ if (gsub(c, "")) { print "Found" } else { print "Not Found" } }'
Output:
Found
Original source: http://unstableme.blogspot.com/2008/06/bash-search-letter-in-string-awk.html
……………………………………………………
I like sed.
substr="foo"
nonsub="$(echo "$string" | sed "s/$substr//")"
hassub=0 ; [ "$string" != "$nonsub" ] && hassub=1
Edit, Logic:
Use sed to remove instance of substring from string
If new string differs from old string, substring exists
……………………………………………………
I need this functionality quite frequently, so I use a home-made shell function in my .bashrc, which lets me reuse it as often as I need to under an easy-to-remember name:
function stringinstring()
{
    case "$2" in
        *"$1"*)
            return 0
            ;;
    esac
    return 1
}
To test if $string1 (say, abc) is contained in $string2 (say, 123abcABC), I just need to run stringinstring "$string1" "$string2" and check the return value, for example:
stringinstring "$str1" "$str2" && echo YES || echo NO
……………………………………………………
A generic needle-in-haystack example, using variables:
#!/bin/bash
needle="a_needle"
haystack="a_needle another_needle a_third_needle"
if [[ $haystack == *"$needle"* ]]; then
    echo "needle found"
else
    echo "needle NOT found"
fi
……………………………………………………
case $string in (*foo*)
# Do stuff
esac
This is the same answer as https://stackoverflow.com/a/229585/11267590, but in a simpler style that is also POSIX compliant.
……………………………………………………
Exact word match:
string='My long string'
exactSearch='long'
if grep -E -q "\b${exactSearch}\b" <<<"${string}" >/dev/null 2>&1
then
    echo "It's there"
fi
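The \b word-boundary escape is not part of POSIX regular expressions, so the grep form above depends on the grep implementation. If the words are separated by single spaces, which is an assumption of this sketch, the same whole-word test can be done with case and no external process. The helper name contains_word is hypothetical.

```shell
# contains_word STRING WORD (hypothetical helper name).
# Assumes words in STRING are separated by single spaces, without punctuation.
contains_word() {
    # Pad both sides with a space so the first and last words are delimited too.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

contains_word 'My long string' 'long' && echo "It's there"
contains_word 'My longer string' 'long' || echo "Not there"
```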
……………………………………………………
Try oobash.
It is an OO-style string library for Bash 4. It has support for German umlauts. It is written in Bash.
Many functions are available: -base64Decode, -base64Encode, -capitalize, -center, -charAt, -concat, -contains, -count, -endsWith, -equals, -equalsIgnoreCase, -reverse, -hashCode, -indexOf, -isAlnum, -isAlpha, -isAscii, -isDigit, -isEmpty, -isHexDigit, -isLowerCase, -isSpace, -isPrintable, -isUpperCase, -isVisible, -lastIndexOf, -length, -matches, -replaceAll, -replaceFirst, -startsWith, -substring, -swapCase, -toLowerCase, -toString, -toUpperCase, -trim, and -zfill.
Look at the contains example:
[Desktop]$ String a testXccc
[Desktop]$ a.contains tX
true
[Desktop]$ a.contains XtX
false
oobash is available at Sourceforge.net.
……………………………………………………
I use this function (one dependency not included but obvious). It passes the tests shown below. If the function returns a value > 0 then the string was found. You could just as easily return 1 or 0 instead.
function str_instr {
    # Return position of `str` within `string`.
    # >>> str_instr "str" "string"
    # str: String to search for.
    # string: String to search.
    typeset str string x
    # Behavior here is not the same in bash vs ksh unless we escape special characters.
    str="$(str_escape_special_characters "${1}")"
    string="${2}"
    x="${string%%$str*}"
    if [[ "${x}" != "${string}" ]]; then
        echo "${#x} + 1" | bc -l
    else
        echo 0
    fi
}
function test_str_instr {
    str_instr "(" "'foo@host (dev,web)'" | assert_eq 11
    str_instr ")" "'foo@host (dev,web)'" | assert_eq 19
    str_instr "[" "'foo@host [dev,web]'" | assert_eq 11
    str_instr "]" "'foo@host [dev,web]'" | assert_eq 19
    str_instr "a" "abc" | assert_eq 1
    str_instr "z" "abc" | assert_eq 0
    str_instr "Eggs" "Green Eggs And Ham" | assert_eq 7
    str_instr "a" "" | assert_eq 0
    str_instr "" "" | assert_eq 0
    str_instr " " "Green Eggs" | assert_eq 6
    str_instr " " " Green " | assert_eq 1
}
……………………………………………………
You can use a logical && to be more compact:
#!/bin/bash
# NO MATCH EXAMPLE
string="test"
[[ "$string" == *"foo"* ]] && {
    echo "YES"
}
# MATCH EXAMPLE
string="tefoost"
[[ "$string" == *"foo"* ]] && {
    echo "YES"
}
……………………………………………………
msg="message"
function check {
    # Test grep's exit status directly; -q suppresses the output.
    if echo "$msg" | grep -q '[abc]'; then
        echo "found"
    else
        echo "not found"
    fi
}
check
This will find any occurrence of a, b, or c.
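The same "any of these characters" test can be written without starting grep at all, using a bracket expression in a case pattern. This is a POSIX sh sketch with a hypothetical function name, has_abc.

```shell
# has_abc STRING: succeeds when STRING contains a, b, or c.
# (has_abc is a hypothetical helper name.)
has_abc() {
    case $1 in
        *[abc]*) return 0 ;;   # bracket expression: any one of a, b, c
        *)       return 1 ;;
    esac
}

if has_abc "message"; then echo "found"; else echo "not found"; fi   # found
if has_abc "hello"; then echo "found"; else echo "not found"; fi     # not found
```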
……………………………………………………
With jq:
string='My long string'
echo "$string" | jq -Rr 'select(contains("long"))|"It is there"'
The hardest thing in jq is to print the single quote:
echo "$string" | jq --arg quote "'" -Rr 'select(contains("long"))|"It\($quote)s there"'
Using jq just to check the condition:
if jq -Re 'select(contains("long"))|halt' <<< "$string"; then
echo "It's there!"
fi