[yocto] bitbake git fetcher aborts during do_unpack with UnicodeDecodeError

Klauer, Daniel Daniel.Klauer at gin.de
Thu Jan 28 08:06:36 PST 2016


Hello,

We're using Yocto (jethro) with some custom recipes that retrieve source code from Git and use AUTOREV, for example:

SRC_URI = "git://url/project.git;protocol=ssh"
SRCREV = "${AUTOREV}"

Building the image with bitbake works on one machine, but fails on another with an error like this (full error attached):

File: '.../poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 812, function: runfetchcmd
     0808:
     0809:    for var in exportvars:
     0810:        val = d.getVar(var, True)
     0811:        if val:
 *** 0812:            cmd = 'export ' + var + '=\"%s\"; %s' % (val, cmd)
     0813:
     0814:    logger.debug(1, "Running %s", cmd)
     0815:
     0816:    success = False
Exception: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17: ordinal not in range(128)

It appears that bitbake's git fetcher prepends shell export commands for certain environment variables (HOME, PATH, and others; see [1]) to every shell command it runs via runfetchcmd(). In our case, at least one of these variables sometimes contains non-ASCII bytes (e.g. a UTF-8 encoded user name).

This by itself is probably OK, but these byte strings are concatenated with the cmd variable, which is sometimes a unicode string. Python 2 then implicitly decodes the byte strings using the ASCII codec, which fails on any non-ASCII byte. cmd becomes unicode because FetchMethod.latest_revision() caches the HEAD revision as a string in an SQL database (bb.persist_data.SQLTable) using the Python sqlite3 module, which by default returns unicode strings when querying text (see [2]). The unicode string holding the HEAD revision then trickles down to runfetchcmd(), where it (sometimes) triggers the UnicodeDecodeError.
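
To illustrate the mechanism, here is a minimal sketch written for Python 3, where the ASCII decode has to be spelled out explicitly (Python 2 performs it implicitly when a byte string meets a unicode string). The table name, the key, the revision value and the "git checkout" command are made up for the example and are not taken from bitbake:

```python
import sqlite3

# In-memory stand-in for the persistent revision cache.
# Table name, key and revision value are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revs (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO revs VALUES (?, ?)", ("head", "0123abc"))
(rev,) = conn.execute(
    "SELECT value FROM revs WHERE key = ?", ("head",)
).fetchone()

# By default sqlite3 hands TEXT columns back as unicode strings.
rev_is_unicode = isinstance(rev, str)

# An environment value holding UTF-8 bytes, as a Python 2 byte string would.
val = "\u00fc".encode("utf-8")  # the two bytes 0xc3 0xbc
prefix = b'export SOCKS5_USER="' + val + b'"; '

# Python 2 would run this ASCII decode implicitly when concatenating the
# byte string with the unicode revision; here it is written out by hand.
try:
    cmd = prefix.decode("ascii") + "git checkout " + rev  # hypothetical command
    error = None
except UnicodeDecodeError as exc:
    error = exc
```

In this sketch the decode fails on the first byte of the encoded "ü" (0xc3), which is the same byte value reported in the exception above.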


Reproducing the issue seems to be as simple as:

    1. $ git clone -b jethro git://git.yoctoproject.org/poky.git
    2. $ cd poky
    3. $ mkdir meta/recipes-support/test
    4. create recipe meta/recipes-support/test/testgit.bb:
        # just a test recipe
        LICENSE = "CLOSED"
        SRC_URI = "git://github.com/schacon/simplegit.git;protocol=https"
        SRCREV = "${AUTOREV}"
    5. $ source oe-init-build-env
    6. $ SOCKS5_USER=ü bitbake testgit

That is, set one of the environment variables handled by runfetchcmd() to a value containing non-ASCII UTF-8 bytes, and build a recipe that uses Git and AUTOREV.


Now I'm wondering how best to solve this problem. I don't have much experience with bitbake, or even with Python for that matter. The Python sqlite3 module documentation suggests setting text_factory = str to get byte strings instead of unicode strings. That seems to solve the problem here, but I have no idea whether it's the right solution:

--- a/bitbake/lib/bb/persist_data.py
+++ b/bitbake/lib/bb/persist_data.py
@@ -201,6 +201,7 @@ class PersistData(object):
 def connect(database):
     connection = sqlite3.connect(database, timeout=5, isolation_level=None)
     connection.execute("pragma synchronous = off;")
+    connection.text_factory = str
     return connection
 
 def persist(domain, d):
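
The effect of text_factory can be checked in isolation with a small sketch like the following. It is written for Python 3, so it uses bytes where the patch uses str (str is the byte string type in Python 2). The text_factory parameter on connect() and the roundtrip() helper, with its table and value, are additions for this example only:

```python
import sqlite3

def connect(database, text_factory=None):
    # Mirrors the connect() function from the patch; the optional
    # text_factory argument is an addition for this sketch.
    connection = sqlite3.connect(database, timeout=5, isolation_level=None)
    connection.execute("pragma synchronous = off;")
    if text_factory is not None:
        connection.text_factory = text_factory
    return connection

def roundtrip(connection):
    # Hypothetical helper: store one TEXT value and read it back.
    connection.execute("CREATE TABLE t (value TEXT)")
    connection.execute("INSERT INTO t VALUES (?)", ("0123abc",))
    return connection.execute("SELECT value FROM t").fetchone()[0]

# Default behaviour: TEXT comes back as a unicode string.
default_value = roundtrip(connect(":memory:"))

# With text_factory set to the byte string type, TEXT comes back as bytes.
bytes_value = roundtrip(connect(":memory:", text_factory=bytes))
```

With byte strings coming back from the cache, the concatenation in runfetchcmd() would stay within byte strings under Python 2, so no implicit ASCII decode takes place. Whether other users of bb.persist_data rely on getting unicode strings is not something this sketch checks.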


Best regards,
Daniel Klauer

[1] http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/bb/fetch2/__init__.py?h=jethro&id=2fb7ee2628e23d7efc9b041bb9daae7c4a8de541#n789
[2] https://docs.python.org/2/library/sqlite3.html#sqlite-and-python-types
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: full-bitbake-error.txt
URL: <http://lists.yoctoproject.org/pipermail/yocto/attachments/20160128/b615b4bf/attachment.txt>

