[yocto] bitbake git fetcher aborts during do_unpack with UnicodeDecodeError
Klauer, Daniel
Daniel.Klauer at gin.de
Thu Jan 28 08:06:36 PST 2016
Hello,
we're using Yocto (jethro) with some custom recipes that retrieve source code from Git and use AUTOREV, for example:
SRC_URI = "git://url/project.git;protocol=ssh"
SRCREV = "${AUTOREV}"
Building the image with bitbake works on one machine, but fails on another with an error like this (full error attached):
File: '.../poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 812, function: runfetchcmd
     0808:
     0809:    for var in exportvars:
     0810:        val = d.getVar(var, True)
     0811:        if val:
 *** 0812:            cmd = 'export ' + var + '=\"%s\"; %s' % (val, cmd)
     0813:
     0814:    logger.debug(1, "Running %s", cmd)
     0815:
     0816:    success = False
Exception: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17: ordinal not in range(128)
It appears that bitbake's git fetcher prepends shell export commands for certain environment variables (HOME, PATH, and others, see [1]) to every shell command it runs via runfetchcmd(). In our case, at least one of these variables apparently sometimes contains non-ASCII bytes (e.g. UTF-8 user names).
This by itself is probably OK, but these byte strings are added to the cmd variable, which is sometimes a unicode string, so Python implicitly decodes the byte strings as ASCII and fails on the first non-ASCII byte. cmd can be unicode because FetchMethod.latest_revision() caches the HEAD revision as a string in an SQL database (bb.persist_data.SQLTable) using the Python sqlite3 module, which by default returns unicode strings when querying text (see [2]). This unicode string holding the HEAD revision then trickles down to runfetchcmd(), where it (sometimes) triggers the UnicodeDecodeError.
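A minimal sketch of that failure mode, for illustration only. It is written to run on Python 3, where bytes and text never mix implicitly, so the ASCII decode that Python 2 performs behind the scenes when a byte string is combined with a unicode string is spelled out explicitly. The helper name prepend_export and the sample values are hypothetical and are not bitbake code.

```python
# -*- coding: utf-8 -*-
# Sketch only: mimics what Python 2 does implicitly when a byte string
# (environment value) is formatted together with a unicode string (command).


def prepend_export(var, val_bytes, cmd_text):
    """Hypothetical stand-in for the export-prepending step in runfetchcmd().

    val_bytes: environment value as raw bytes (a Python 2 'str').
    cmd_text:  command as text (a Python 2 'unicode', e.g. from sqlite3).
    """
    # Python 2 coerces the byte string with the 'ascii' codec at this point;
    # the explicit decode reproduces that coercion.
    val_text = val_bytes.decode("ascii")
    return 'export %s="%s"; %s' % (var, val_text, cmd_text)


# A pure-ASCII value goes through without problems.
ok = prepend_export("HOME", b"/home/user", u"git rev-parse HEAD")

# A UTF-8 encoded value containing U+00FC (bytes 0xc3 0xbc) cannot be
# decoded as ASCII and raises UnicodeDecodeError.
try:
    prepend_export("SOCKS5_USER", u"\u00fc".encode("utf-8"), u"git rev-parse HEAD")
    error = None
except UnicodeDecodeError as exc:
    error = exc
```

The byte reported in the exception (0xc3) is the first byte of the UTF-8 encoding of the non-ASCII character, which matches the byte named in the error message above.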
Reproducing the issue seems to be as simple as:
1. $ git clone -b jethro git://git.yoctoproject.org/poky.git
2. $ cd poky
3. $ mkdir meta/recipes-support/test
4. create recipe meta/recipes-support/test/testgit.bb:
       # just a test recipe
       LICENSE = "CLOSED"
       SRC_URI = "git://github.com/schacon/simplegit.git;protocol=https"
       SRCREV = "${AUTOREV}"
5. $ source oe-init-build-env
6. $ SOCKS5_USER=ü bitbake testgit
That is, set one of the environment variables handled by runfetchcmd() to a value containing non-ASCII UTF-8 bytes, and build a recipe that uses Git and AUTOREV.
Now I'm wondering how best to solve this problem. I don't have much experience with bitbake, or even Python for that matter. The Python sqlite3 module documentation suggests setting text_factory = str to get byte strings instead of unicode strings. That seems to solve the problem here, but I have no idea whether it's the right solution:
--- a/bitbake/lib/bb/persist_data.py
+++ b/bitbake/lib/bb/persist_data.py
@@ -201,6 +201,7 @@ class PersistData(object):
 def connect(database):
     connection = sqlite3.connect(database, timeout=5, isolation_level=None)
     connection.execute("pragma synchronous = off;")
+    connection.text_factory = str
     return connection
 
 def persist(domain, d):
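
The effect of text_factory can be checked in isolation with a small sketch like the following. It uses an in-memory database rather than bitbake's cache, and the table and column names are made up for the example. It sets text_factory = bytes instead of str so that it behaves the same on Python 2 (where bytes is an alias for str, as in the patch) and on Python 3.

```python
import sqlite3

# In-memory database as a stand-in for bitbake's persistent cache;
# the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revs (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO revs VALUES (?, ?)", (u"head", u"0123abcd"))

# Default text_factory: TEXT columns come back as text strings
# (unicode on Python 2, str on Python 3).
default_value = conn.execute("SELECT value FROM revs").fetchone()[0]

# With text_factory = bytes, TEXT columns come back as byte strings.
# On Python 2, bytes is an alias for str, so this matches the patch.
conn.text_factory = bytes
bytes_value = conn.execute("SELECT value FROM revs").fetchone()[0]

conn.close()
```

This only shows what the setting does to query results. Whether returning byte strings is safe for every other user of the persistent store is a separate question that the sketch does not answer.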
Best regards,
Daniel Klauer
[1] http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/bb/fetch2/__init__.py?h=jethro&id=2fb7ee2628e23d7efc9b041bb9daae7c4a8de541#n789
[2] https://docs.python.org/2/library/sqlite3.html#sqlite-and-python-types
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: full-bitbake-error.txt
URL: <http://lists.yoctoproject.org/pipermail/yocto/attachments/20160128/b615b4bf/attachment.txt>