About python : run-selenium-with-crontab-python

Question Detail

I have a Python script that starts Chrome via Selenium with the following line.

ff = webdriver.Chrome('/home/user01/webScraping/CollectAndGo/chromedriver')

The python script is called from a shell script.

python /home/user01/webScraping/CollectAndGo/cgcom.py > /home/user01/webScraping/CollectAndGo/cgcom.log 2>&1

When I run the script from the terminal, or execute the .sh file directly, it works perfectly, but when I schedule it as a crontab job it fails with the following error.

raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: u'unknown error: Chrome failed to start: exited abnormally\n (Driver info: chromedriver=2.9.248304,platform=Linux 3.5.0-36-generic x86_64)'

The error is related to the first line of code of this question. Does anybody know why this could be happening?

Question Answer

The most evident problem with trying to launch a browser from cron is that, even if you have X running on your machine, the DISPLAY environment variable is not set for processes started from your crontab, so launching a browser from there will fail.

Solutions range from the trivial to the super sophisticated. A trivial solution would be to accept that your script won’t run if there is no X running and manually set DISPLAY to :0, which is the default display number for the default X server that Ubuntu starts.

For instance, if I put this command in the command column of a crontab line, Chrome starts without issue:

DISPLAY=:0 google-chrome

The complete line in a user-specific crontab file would be something like:

0 * * * * DISPLAY=:0 google-chrome

If you want to run a python script that starts chrome through selenium, the line would instead look like:

0 * * * * DISPLAY=:0 python my_script.py

The command string is sent as-is to the shell, so in the last example the string DISPLAY=:0 python my_script.py is passed to the shell unchanged. It is common shell syntax to interpret a variable assignment placed immediately before a command as setting an environment variable for that command. (This is certainly the case for dash and bash, one of which is likely to be the default shell in most installations.) So the shell sets the environment variable DISPLAY to the value :0 and then runs python my_script.py. Since python inherits its environment from the shell that started it, the variable DISPLAY is :0 for it too.
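The inheritance described above can be checked with a small sketch. It starts a child Python process with and without DISPLAY in its environment and reports what the child sees. The helper name child_display is hypothetical, and removing DISPLAY from the copied environment is only an approximation of cron's sparse environment.

```python
import os
import subprocess
import sys


def child_display(env_overrides):
    """Run a child Python process and return the DISPLAY value it sees."""
    env = dict(os.environ)
    env.pop("DISPLAY", None)   # approximate cron: no DISPLAY unless we add it
    env.update(env_overrides)  # what a "DISPLAY=:0 command" prefix would add
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('DISPLAY', '<unset>'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_display({}))                 # child started without DISPLAY
    print(child_display({"DISPLAY": ":0"}))  # child started with DISPLAY=:0
```

The same mechanism applies one level further down: the browser started by the driver inherits the variable from the Python process.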

Setting DISPLAY=:0 like I show above sets the variable only for the command that follows. It is also possible to set DISPLAY to :0 for all commands executed by the crontab. For instance in the following user-specific crontab:

DISPLAY=:0

30 * * * * google-chrome
0 * * * * python my_script.py

the line DISPLAY=:0 sets the environment variable DISPLAY for the execution of both google-chrome and python my_script.py.
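As an alternative to editing the crontab, the script itself could fall back to :0 when DISPLAY is missing. This is only a sketch under the same assumption as above (an X server is actually running on display :0); the function name ensure_display is hypothetical and not part of Selenium.

```python
import os


def ensure_display(environ=os.environ, default=":0"):
    """Set DISPLAY to `default` only if it is missing; return the value in effect."""
    # setdefault leaves an existing DISPLAY untouched, so interactive runs
    # from a terminal keep whatever display they already use.
    return environ.setdefault("DISPLAY", default)


if __name__ == "__main__":
    print("DISPLAY in effect:", ensure_display())
```

The call has to happen before the browser is started, because child processes copy the environment at the moment they are launched.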
……………………………………………………
On macOS Catalina, only this command worked for me:

* * * * * export DISPLAY=:0 && export PATH=$PATH:/usr/local/bin && /usr/bin/python3 ~/Documents/Scripts/my_script.py

……………………………………………………
Use pyvirtualdisplay and Xvfb to manage your window session for you (originally from this answer)
Background:
In my case, the accepted answer did not work.
Solution:

Install PyVirtualDisplay and Xvfb

pip3 install pyvirtualdisplay
sudo apt-get install xvfb

Wrap the browser session in a virtual display in your .py script

from selenium import webdriver
from pyvirtualdisplay import Display
import time

# Display creates a virtual frame buffer (Xvfb) and manages it for you
with Display(visible=False, size=(1200, 1500)):
    # The browser must be started inside the with block so it uses the virtual display
    driver = webdriver.Firefox()
    driver.get("https://website-target.com")

    time.sleep(1)

    print(driver.current_url)  # check connection

    time.sleep(1)

    print(driver.current_url)

    # quit() closes the browser and also stops the driver process
    driver.quit()

……………………………………………………
Selenium web drivers need an X session to run a script. Cron jobs normally run without an X session, so set one in your cron entry, as follows:
* 11 * * * export DISPLAY=:0; your script.py
……………………………………………………
Crontab is likely running as a user that doesn’t have permission to access the chromedriver directory/file.
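One way to test this hypothesis is to run a small check from the same crontab entry and compare its output with a run from the terminal. The sketch below uses only the standard library; the function name describe_access is hypothetical, and the chromedriver path is the one from the question.

```python
import getpass
import os


def describe_access(path):
    """Report whether the current user can see, read and execute `path`."""
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "executable": os.access(path, os.X_OK),
    }


if __name__ == "__main__":
    # Path taken from the question; adjust it to your own setup.
    driver_path = "/home/user01/webScraping/CollectAndGo/chromedriver"
    print("running as:", getpass.getuser())
    print(describe_access(driver_path))
```

If the cron run reports a different user, or False for "executable", permissions are the likely cause rather than DISPLAY.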

Take a look at the answers here on how to run crontab as a specific user.
