Sunday 23 November 2014

Debian on Nexus 7: Booting Debian

In this post, I'm going to attempt to boot the Nexus 7 into a Debian environment. It won't be able to do very much, but it should at least boot up OK.

The User Data Partition


As discussed in an earlier post, I've decided to use the tablet's user data partition (of 11GB or so in size) to hold the root filesystem. From Android (with adb shell), I can work out exactly what partition this is on the tablet's internal storage:
shell@flo:/data $ mount
...
/dev/block/platform/msm_sdcc.1/by-name/system /system ext4 ...
/dev/block/platform/msm_sdcc.1/by-name/cache /cache ext4 ...
/dev/block/platform/msm_sdcc.1/by-name/userdata /data ext4 ...
/dev/block/platform/msm_sdcc.1/by-name/persist /persist ext4 ...
...
shell@flo:/data $ ls -l /dev/block/platform/msm_sdcc.1/by-name/userdata
lrwxrwxrwx root     root              2014-09-01 21:45 userdata -> /dev/block/mmcblk0p30
So the user data partition is mmcblk0p30, under /dev/block in Android (this corresponds to /dev/mmcblk0p30 in my minimal ramdisk).
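The mapping between the two naming schemes can be sketched as a small helper. This is only an illustration: the function name ramdisk_node is my own, and it assumes that Android keeps its block device nodes directly under /dev/block whilst the ramdisk keeps them directly under /dev. A by-name link has to be resolved first (as with the ls -l above).

```shell
# Map an Android block device path to the flat /dev layout of the ramdisk.
# Assumption: Android uses /dev/block/<node>, the ramdisk uses /dev/<node>.
ramdisk_node() {
    case "$1" in
        /dev/block/*/*)
            # Nested paths (such as by-name links) must be resolved first
            echo "resolve the link first: $1" >&2
            return 1 ;;
        /dev/block/*)
            echo "/dev/${1#/dev/block/}" ;;
        *)
            echo "$1" ;;
    esac
}

ramdisk_node /dev/block/mmcblk0p30
```

For the user data partition, this prints /dev/mmcblk0p30.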

I already know that I need to use the make_ext4fs tool to build a sparse image file to package the root filesystem for flashing onto the Nexus 7 using fastboot. I fetched the AOSP sources for this tool as follows (using the commit points where the build worked for me):
$ cd ~/Android/Nexus7-Debian
$ mkdir -p platform/{system,external}
$ cd platform/system
$ git clone https://android.googlesource.com/platform/system/core
$ git clone https://android.googlesource.com/platform/system/extras
$ cd core
$ git reset --hard 39ab11d
$ cd ../extras
$ git reset --hard 5fb7c3e
$ cd ../../external
$ git clone https://android.googlesource.com/platform/external/libselinux
$ cd libselinux
$ git reset --hard f76c30b
and (after examining how the Android.mk files did things) built the tool with the following:
$ sudo apt-get install zlib1g-dev
$ cd ~/Android/Nexus7-Debian/platform
$ gcc -o make_ext4fs -Iexternal/libselinux/include -Isystem/core/include -Isystem/core/libsparse/include -DHOST \
    system/extras/ext4_utils/{make_ext4fs.c,ext4fixup.c,ext4_utils.c,allocate.c,contents.c,extent.c,indirect.c,uuid.c,sha1.c,wipe.c,crc16.c,ext4_sb.c,make_ext4fs_main.c} \
    external/libselinux/src/{callbacks.c,check_context.c,freecon.c,init.c,label.c,label_file.c,label_android_property.c} \
    system/core/libsparse/{backed_block.c,output_file.c,sparse.c,sparse_crc32.c,sparse_err.c,sparse_read.c} -lz
$ ls
external  make_ext4fs  system
I then ensured that the make_ext4fs binary was copied to a suitable location on my path. Next, I need to try creating a test sparse image file to see if I can really flash the user data partition.

I'll quickly create a dummy root filesystem (with nothing particularly useful inside it):
$ mkdir /tmp/fstest
$ touch /tmp/fstest/IT_WORKS
and package it into a sparse image file:
$ cd ~/Android/Nexus7-Debian/out
$ make_ext4fs -s -l 11770M fstest.img /tmp/fstest
$ ls
boot.img  fstest.img
The -s argument specifies that a sparse image should be produced, and the -l argument specifies the image size. The meaning of these arguments was deduced from platform/system/extras/ext4_utils/mkuserimg.sh in the AOSP. The value of 11,770MB was found by experimentation, comparing the result with the user data image file from the KTU84P Android factory image.
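For reference, the size passed to -l can be converted into bytes with a little shell arithmetic. This is a sketch which assumes that make_ext4fs treats the M suffix as a multiple of 1024 x 1024 bytes (I believe it does, but I haven't verified this against every revision of the tool). The helper name mib_to_bytes is my own, and the shell needs 64-bit arithmetic.

```shell
# Convert a size given in MiB into bytes.
# Assumption: the "M" suffix means 1024 * 1024 bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 11770
```

Under that assumption, 11770M comes to 12341739520 bytes.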

So, now I have my test user data image (alongside my minimal ramdisk boot image), I can try this out on the tablet. I can use fastboot to flash the user data image in the usual way:
$ fastboot flash userdata fstest.img
erasing 'userdata'...
OKAY [  1.314s]
sending 'userdata' (136890 KB)...
OKAY [  4.292s]
writing 'userdata'...
OKAY [  5.797s]
finished. total time: 11.404s
And I'll boot and login to the minimal ramdisk as usual, and try mounting the user data partition:
$ fastboot boot boot.img
downloading 'boot.img'...
OKAY [  0.256s]
booting...
OKAY [  0.026s]
finished. total time: 0.282s
$ telnet 192.168.1.1
Trying 192.168.1.1...
Connected to 192.168.1.1.
Escape character is '^]'.

nexus-7-initrd login: root
~ # mount /dev/mmcblk0p30 /mnt
~ # ls /mnt
IT_WORKS    lost+found
~ #
Looks like it worked! Checking the partition sizes:
~ # df -h
Filesystem                Size      Used Available Use% Mounted on
none                    894.7M         0    894.7M   0% /dev
none                    903.9M         0    903.9M   0% /tmp
none                    903.9M      4.0K    903.9M   0% /var
/dev/mmcblk0p30          11.3G    129.4M     11.2G   1% /mnt
It appears I only have 11.3GB for this image (as opposed to 11.9GB shown by Android for its user data partition), but this should be good enough. I'm now ready to create a Debian root filesystem.

Creating a Debian Filesystem


I'll go for a Debian 7 ("wheezy") filesystem, which I will build using multistrap (using the information from this page). This requires the following packages to be installed:
$ sudo apt-get install multistrap binfmt-support qemu qemu-user-static
I did find that I was hit by this bug, which I had to work around by removing the reference to $forceyes on line 989 of /usr/sbin/multistrap before I could continue. I first created a multistrap configuration:
$ cd ~/Android/Nexus7-Debian
$ mkdir debian
$ cd debian
$ gvim multistrap.conf
which I populated with the following:
[General]
noauth=true
unpack=true
debootstrap=Wheezy
aptsources=Wheezy
arch=armel
directory=./rootfs

[Wheezy]
packages=udev netbase net-tools ifupdown iputils-ping isc-dhcp-server isc-dhcp-client inetutils-inetd telnetd apt module-init-tools procps unzip sudo wget dialog libncurses5-dev vim
source=http://ftp.uk.debian.org/debian/
keyring=debian-archive-keyring
components=main contrib non-free
suite=wheezy
This initial set of packages should be enough to replicate the functionality that I have with the minimal ramdisk (plus a couple of other things, like vim). I'll most likely need to add more packages to this initial set in time, to get enough functionality to allow the tablet to fetch subsequent packages itself over some network. Any dependency requirements that these packages have are automatically fulfilled. With my configuration ready, I kicked off multistrap with:
$ sudo multistrap -f ./multistrap.conf
...
Multistrap system installed successfully in /home/simon/Android/Nexus7-Debian/debian/rootfs/.
Now I have to configure the system appropriately before I can boot it with the tablet. To do this, I can enter a chroot shell and use qemu to perform transparent user emulation for any ARM binaries that I need to invoke:
$ sudo cp /usr/bin/qemu-arm-static rootfs/usr/bin
$ sudo mount -t proc proc rootfs/proc
$ sudo chroot ./rootfs /bin/bash
I have no name!@my-pc:/#
I can now perform the necessary post-install package setup routines:
I have no name!@my-pc:/# /var/lib/dpkg/info/dash.preinst install
I have no name!@my-pc:/# dpkg --configure -a
Some packages may complain about not being able to start things (because of the chroot environment), but this should be OK. Now to lay down the configuration. I first edited /etc/inittab and commented out the following lines (disabling getty invocations for virtual consoles which are not being used):
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
3:23:respawn:/sbin/getty 38400 tty3
4:23:respawn:/sbin/getty 38400 tty4
5:23:respawn:/sbin/getty 38400 tty5
6:23:respawn:/sbin/getty 38400 tty6
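Rather than editing the file by hand, the same change could be scripted. The following is a minimal sketch, assuming GNU sed (for the -i option) and that the getty lines look exactly like the ones above; the function name disable_vc_gettys is my own.

```shell
# Comment out the getty entries for tty1..tty6 in an inittab file.
# Usage: disable_vc_gettys <path-to-inittab>
# Assumption: GNU sed, and getty lines in the stock wheezy format.
disable_vc_gettys() {
    sed -i 's|^\([1-6]:[0-9]*:respawn:/sbin/getty .* tty[1-6]\)$|#\1|' "$1"
}
```

From within the chroot shell, this would be invoked as disable_vc_gettys /etc/inittab. Lines that don't match the pattern are left untouched.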
Next, I created /etc/fstab with the following contents:
# file system     mount point     type    options                 dump    pass
proc              /proc           proc    nodev,noexec,nosuid     0       0
sysfs             /sys            sysfs   nodev,noexec,nosuid     0       0
/dev/mmcblk0p30   /               ext4    errors=remount-ro       0       1
and created /etc/hostname with an appropriate hostname:
nexus-7
I then set a static IP address for the USB RNDIS network interface by adding the following to /etc/network/interfaces:
auto usb0
iface usb0 inet static
address 192.168.1.1
netmask 255.255.255.0
and configured the DHCP server to use this interface by setting the INTERFACES variable in /etc/default/isc-dhcp-server to:
INTERFACES="usb0"
I also set an appropriate DHCP configuration by adding the following to /etc/dhcp/dhcpd.conf (as well as commenting out the enabled option lines):
subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.20 192.168.1.254;
}
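It's worth checking that the tablet's static address doesn't fall inside the DHCP lease range. The snippet below is a rough sketch of that check. It only compares the last octet, so it assumes that all of the addresses are on the same /24 network (as they are here). The helper names last_octet and in_pool are my own.

```shell
# Print the last octet of a dotted-quad IPv4 address.
last_octet() {
    echo "${1##*.}"
}

# Succeed if address $1 lies inside the pool $2..$3.
# Assumption: all three addresses share the same /24 network.
in_pool() {
    host=$(last_octet "$1")
    [ "$host" -ge "$(last_octet "$2")" ] && [ "$host" -le "$(last_octet "$3")" ]
}

if in_pool 192.168.1.1 192.168.1.20 192.168.1.254; then
    echo "static address collides with the DHCP pool"
else
    echo "static address is outside the DHCP pool"
fi
echo "pool size: $(( $(last_octet 192.168.1.254) - $(last_octet 192.168.1.20) + 1 ))"
```

With the values above, the static address is outside the pool, and the pool holds 235 addresses.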
To use the telnet daemon with inetd, I created /etc/inetd.conf with the following contents:
telnet stream tcp nowait root /usr/sbin/in.telnetd telnetd
All that remains is to set an appropriate root password, and add a normal user:
I have no name!@my-pc:/# passwd root
I have no name!@my-pc:/# adduser simon
I can also reduce the size of the initial image by 100MB or so, by clearing apt's cache:
I have no name!@my-pc:/# apt-get clean
The system should now be ready, so I can leave the chroot environment and tidy up. On my system, syslogd appeared to have been started during the post-install process, so I had to stop that too (and clean up its /dev/xconsole FIFO pipe).
I have no name!@my-pc:/# pkill syslogd
I have no name!@my-pc:/# rm -f /dev/xconsole
I have no name!@my-pc:/# exit
exit
$ sudo umount rootfs/proc
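Before packaging the filesystem, it may be worth confirming that nothing is still mounted underneath it. The following is a sketch of one way to do that, by filtering the kernel's mount table. The function name mounts_under is my own, and it assumes the usual /proc/mounts column layout (device, mount point, type, and so on).

```shell
# Print the mount points at or below directory $1, reading
# /proc/mounts-style lines from standard input.
mounts_under() {
    awk -v dir="${1%/}" '$2 == dir || index($2, dir "/") == 1 { print $2 }'
}

# Check the rootfs directory below the current directory
if [ -r /proc/mounts ]; then
    mounts_under "$PWD/rootfs" < /proc/mounts
fi
```

If this prints nothing, then there are no stray mounts left behind by the chroot session.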
I can now package this up into a sparse image, and deploy this to the tablet:
$ cd ~/Android/Nexus7-Debian/out
$ sudo bash -c "PATH=$PATH make_ext4fs -s -l 11770M debian-rootfs.img ../debian/rootfs"
$ fastboot flash userdata debian-rootfs.img

Reconfiguring the Ramdisk


I also need to make a minor change to my minimal ramdisk image, to allow it to boot Debian. This is accomplished by having it run a shell script as init, rather than using BusyBox's own init implementation. This shell script performs the necessary steps to mount the user data partition and transfer the boot process over to the init executable contained there. The script can also perform some tablet-specific tasks, such as enabling USB RNDIS.
$ cd ~/Android/Nexus7-Debian/initrd/rootfs
$ sudo rm init
$ sudo gvim init
I created the following shell script to act as init (after seeing the example on this page):
#!/bin/sh

# Mount core filesystems
mount -t proc none /proc
mount -t sysfs none /sys  

# Blink the white LED to show that we're booting
echo 255 > /sys/class/leds/white/brightness
echo 1 > /sys/class/leds/white/device/blink

# Tickle the Android composite gadget driver into RNDIS mode
echo 0 > /sys/class/android_usb/android0/enable
echo 18d1 > /sys/class/android_usb/android0/idVendor
echo 2d04 > /sys/class/android_usb/android0/idProduct
echo 239 > /sys/class/android_usb/android0/bDeviceClass
echo 2 > /sys/class/android_usb/android0/bDeviceSubClass
echo 1 > /sys/class/android_usb/android0/bDeviceProtocol

echo LGE > /sys/class/android_usb/android0/f_rndis/manufacturer
echo 18D1 > /sys/class/android_usb/android0/f_rndis/vendorID
echo 1 > /sys/class/android_usb/android0/f_rndis/wceis

echo rndis > /sys/class/android_usb/android0/functions
echo 1 > /sys/class/android_usb/android0/enable

# Mount the real rootfs
mount /dev/mmcblk0p30 /mnt

# Stop blinking the white LED to show that we're ready
echo 255 > /sys/class/leds/white/brightness
echo 0 > /sys/class/leds/white/device/blink

# Unmount core filesystems
umount /proc
umount /sys

# Switch to the real rootfs
exec switch_root /mnt /sbin/init
To keep this shell script minimal, I've avoided using devtmpfs. So the dev tree in the minimal ramdisk needs to contain the device nodes necessary to support this process. The only node that is missing is the block device node for the tablet's user data partition. I therefore created this additional node, alongside making the shell script executable:
$ sudo chmod 755 init
$ sudo mknod -m 600 dev/mmcblk0p30 b 179 30
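The major and minor numbers given to mknod can be cross-checked against the kernel's own partition table. This is a sketch of a helper which could be run from the earlier BusyBox-based ramdisk (where /proc is mounted), assuming that awk was left enabled in the BusyBox build. The function name part_devnum is my own, and it assumes the usual /proc/partitions column layout (major, minor, number of blocks, name).

```shell
# Print "major minor" for the partition named $1, reading
# /proc/partitions-style lines from standard input.
part_devnum() {
    awk -v name="$1" '$4 == name { print $1, $2 }'
}

# On the tablet, this would be invoked as:
#   part_devnum mmcblk0p30 < /proc/partitions
```

The two numbers printed should match those passed to mknod.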
I then rebuilt the ramdisk in the usual way:
$ sudo bash -c "find . | cpio -o -H newc | gzip > ../initrd-rootfs.cpio.gz"
$ cd ../../out
$ mkbootimg --base 0 --pagesize 2048 --kernel_offset 0x80208000 --ramdisk_offset 0x82200000 \
      --second_offset 0x81100000 --tags_offset 0x80200100 \
      --cmdline 'console=ttyHSL0,115200,n8 msm_rtb.filter=0x3F ehci-hcd.park=3' \
      --kernel ../kernel/msm/arch/arm/boot/zImage \
      --ramdisk ../initrd/initrd-rootfs.cpio.gz -o boot.img

Attempting the Boot


I can now try booting the tablet into Debian:
$ fastboot boot boot.img
downloading 'boot.img'...
OKAY [  0.256s]
booting...
OKAY [  0.026s]
finished. total time: 0.282s
$ telnet 192.168.1.1
Trying 192.168.1.1...
Connected to 192.168.1.1.
Escape character is '^]'.
Debian GNU/Linux 7
nexus-7 login: simon
Password: 
Linux nexus-7 3.4.0-g03485a6-dirty #2 SMP PREEMPT Tue Nov 4 21:32:26 GMT 2014 armv7l

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
simon@nexus-7:~$ mount
/dev/mmcblk0p30 on / type ext4 (rw,relatime,data=ordered)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=185120k,mode=755)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,relatime,size=10240k,mode=755)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=370220k)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
Hurrah!! Debian is now running, from the user data partition! There's still a little way to go before it can be self-sufficient, but this is definitely awesome!

Next Time ...


... I'll see if I can get WiFi networking to work.

Monday 10 November 2014

Debian on Nexus 7: Experimenting with the Screen

The minimal userspace ramdisk that I created in my previous post is a useful environment for a little experimentation with the Nexus 7. In this post, I'm going to see if I can get the tablet's screen to show something.

The Linux Framebuffer


In earlier versions of Android, it was possible to display a simple boot logo whilst the OS was booting. This boot logo was placed in the file /initlogo.rle within Android's ramdisk, and it was the job of Android's init to read and display that logo. The code that displayed the logo appears to interact with a framebuffer device (/dev/graphics/fb0) in a perfectly standard way. So, it seems that Android still retains the Linux framebuffer, even though this method of displaying a boot logo has since been removed.

I can use the minimal userspace ramdisk to investigate this further. From the kernel's log:
~ # dmesg | grep -i framebuffer
[    0.603515] FrameBuffer[0] 1200x1920 size=28016640 bytes is registered successfully!
[    0.608612] FrameBuffer[1] 1920x1080 size=0 bytes is registered successfully!
and from BusyBox's fbset tool (which I retained in my BusyBox build configuration):
~ # fbset

mode "1200x1920-0"
        # D: 0.001 MHz, H: 0.001 kHz, V: 0.000 Hz
        geometry 1200 1920 1200 5760 32
        timings 1000000000 60 48 6 3 32 5
        accel false
        rgba 8/24,8/16,8/8,8/0
endmode
There is certainly framebuffer support here! Exploring sysfs, I find:
~ # ls /sys/class/graphics/fb0
bits_per_pixel    msm_fb_fps_level  stride
blank             msm_fb_type       subsystem
console           name              uevent
cursor            pan               virtual_size
dev               power             vsync_event
mode              rotate
modes             state
~ # cat /sys/class/graphics/fb0/name
msmfb44_90701
The 1200x1920 resolution matches the Nexus 7's own screen resolution, so fb0 probably drives the tablet's screen. The 1920x1080 full HD framebuffer might drive the display connected to the tablet's HDMI port. For now, I'll focus on fb0. As indicated above, the kernel driver providing framebuffer support is msm_fb (located under drivers/video/msm in the Android kernel tree I'm using), and this is enabled by CONFIG_FB_MSM in the kernel configuration.
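The numbers reported for fb0 can be tied together with a little arithmetic. The sketch below uses only the values from the dmesg and fbset output above. It assumes that the whole of the reported size is taken up by the rows of the virtual framebuffer; the stride attribute listed under sysfs should give the authoritative figure.

```shell
# Values taken from the dmesg and fbset output for fb0
fb_size=28016640      # bytes reported by the kernel
xres=1200             # visible width in pixels
yres=1920             # visible height in pixels
yres_virtual=5760     # virtual height from the "geometry" line
bpp=32                # bits per pixel

# Number of full screens which fit into the virtual framebuffer
buffers=$(( yres_virtual / yres ))

# Bytes per row, assuming the rows fill the whole framebuffer
stride=$(( fb_size / yres_virtual ))
stride_px=$(( stride * 8 / bpp ))

echo "buffers: $buffers"
echo "stride: $stride bytes ($stride_px pixels per line)"
echo "padding: $(( stride_px - xres )) pixels per line"
```

This gives 3 buffers and a stride of 4864 bytes (1216 pixels), which suggests that each line is padded by 16 pixels beyond the visible 1200. Anything that writes directly to the framebuffer would need to take that padding into account.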

The Framebuffer Console


If msm_fb acts as a normal framebuffer driver, then I should be able to get it to work with Linux's framebuffer console driver (fbcon). This is CONFIG_FRAMEBUFFER_CONSOLE in the kernel configuration. The first time I tried this, I included the framebuffer console driver as a built-in. Unfortunately, that appeared to cause the kernel to hang whilst booting (it didn't seem to be an outright panic, as I'd expect the tablet to reboot if that occurred). So, I had to include the console driver as a module instead.

I revisited my kernel tree in ~/Android/Nexus7-Debian/kernel/msm, re-entered the menuconfig configuration tool and set CONFIG_FRAMEBUFFER_CONSOLE=m to include the framebuffer console driver as a module. When I previously built the kernel, I also set CONFIG_ANDROID=n. If I hadn't, then I'd at the very least want to set CONFIG_ANDROID_RAM_CONSOLE=n, to avoid any confusion with consoles.

I then rebuilt the kernel, and installed the new module into the ramdisk filesystem, as follows:
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make -j8
$ sudo bash -c "PATH=$PATH ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- \
      make modules_install INSTALL_MOD_PATH=../../initrd/rootfs/"
I also rebuilt the ramdisk, and the boot image:
$ cd ../../initrd/rootfs
$ sudo bash -c "find . | cpio -o -H newc | gzip > ../initrd-rootfs.cpio.gz"
$ cd ../../out
$ mkbootimg --base 0 --pagesize 2048 --kernel_offset 0x80208000 --ramdisk_offset 0x82200000 \
      --second_offset 0x81100000 --tags_offset 0x80200100 \
      --cmdline 'console=ttyHSL0,115200,n8 msm_rtb.filter=0x3F ehci-hcd.park=3' \
      --kernel ../kernel/msm/arch/arm/boot/zImage \
      --ramdisk ../initrd/initrd-rootfs.cpio.gz -o boot.img
and deployed it to the tablet:
$ fastboot boot boot.img

Starting the Console


Next, I logged in with telnet and tried to load the framebuffer console driver:
$ telnet 192.168.1.1
Trying 192.168.1.1...
Connected to 192.168.1.1.
Escape character is '^]'.

nexus-7-initrd login: root
~ # modprobe fbcon
Success! Oh, wait a minute. Those are kernel stack traces! I can also read them from the telnet session with dmesg (after getting out of the modprobe with CTRL+C):
[   64.896728] BUG: scheduling while atomic: modprobe/172/0x00000002
[   64.896728] Modules linked in: fbcon(+) bitblit softcursor font
[   64.896728] [] (unwind_backtrace+0x0/0x11c) from [] (__schedule+0x78/0x628)
[   64.896759] [] (__schedule+0x78/0x628) from [] (schedule_timeout+0x294/0x34c)
[   64.896759] [] (schedule_timeout+0x294/0x34c) from [] (wait_for_common+0xd4/0x184)
[   64.896789] [] (wait_for_common+0xd4/0x184) from [] (msm_fb_pan_idle+0x48/0xa8)
[   64.896789] [] (msm_fb_pan_idle+0x48/0xa8) from [] (msm_fb_pan_display_ex+0x1a4/0x1d0)
[   64.896789] [] (msm_fb_pan_display_ex+0x1a4/0x1d0) from [] (msm_fb_pan_display+0x40/0x48)
[   64.896820] [] (msm_fb_pan_display+0x40/0x48) from [] (msm_fb_imageblit.part.18+0x5c/0x68)
[   64.896820] [] (msm_fb_imageblit.part.18+0x5c/0x68) from [] (bit_putcs+0x300/0x360 [bitblit])
[   64.896850] [] (bit_putcs+0x300/0x360 [bitblit]) from [] (fbcon_putcs+0xd4/0x118 [fbcon])
[   64.896850] [] (fbcon_putcs+0xd4/0x118 [fbcon]) from [] (fbcon_redraw.isra.9+0xdc/0x174 [fbcon])
[   64.896881] [] (fbcon_redraw.isra.9+0xdc/0x174 [fbcon]) from [] (fbcon_scroll+0x67c/0xc84 [fbcon])
[   64.896881] [] (fbcon_scroll+0x67c/0xc84 [fbcon]) from [] (scrup+0xb8/0xd4)
[   64.896911] [] (scrup+0xb8/0xd4) from [] (lf+0x28/0x68)
[   64.896911] [] (lf+0x28/0x68) from [] (vt_console_print+0x1c4/0x30c)
[   64.896911] [] (vt_console_print+0x1c4/0x30c) from [] (__call_console_drivers+0xb0/0xcc)
[   64.896942] [] (__call_console_drivers+0xb0/0xcc) from [] (console_unlock+0x150/0x240)
[   64.896942] [] (console_unlock+0x150/0x240) from [] (take_over_console+0x29c/0x2f4)
[   64.896972] [] (take_over_console+0x29c/0x2f4) from [] (fbcon_takeover+0x54/0xb4 [fbcon])
[   64.896972] [] (fbcon_takeover+0x54/0xb4 [fbcon]) from [] (fb_console_init+0x11c/0x14c [fbcon])
[   64.897003] [] (fb_console_init+0x11c/0x14c [fbcon]) from [] (do_one_initcall+0x90/0x160)
[   64.897003] [] (do_one_initcall+0x90/0x160) from [] (sys_init_module+0x18a0/0x1a68)
[   64.897033] [] (sys_init_module+0x18a0/0x1a68) from [] (ret_fast_syscall+0x0/0x30)
These are all bug checks (indicated by the "BUG:" prefix). The bit they seem to all have in common is:
(unwind_backtrace+0x0/0x11c) from [] (__schedule+0x78/0x628)
(__schedule+0x78/0x628) from [] (schedule_timeout+0x294/0x34c)
(schedule_timeout+0x294/0x34c) from [] (wait_for_common+0xd4/0x184)
(wait_for_common+0xd4/0x184) from [] (msm_fb_pan_idle+0x48/0xa8)
(msm_fb_pan_idle+0x48/0xa8) from [] (msm_fb_pan_display_ex+0x1a4/0x1d0)
(msm_fb_pan_display_ex+0x1a4/0x1d0) from [] (msm_fb_pan_display+0x40/0x48)
(msm_fb_pan_display+0x40/0x48) from [] (msm_fb_imageblit.part.18+0x5c/0x68)
(msm_fb_imageblit.part.18+0x5c/0x68) from [] (bit_putcs+0x300/0x360 [bitblit])
(bit_putcs+0x300/0x360 [bitblit]) from [] (fbcon_putcs+0xd4/0x118 [fbcon])
(fbcon_putcs+0xd4/0x118 [fbcon]) from [] (fbcon_redraw.isra.9+0xdc/0x174 [fbcon])
(fbcon_redraw.isra.9+0xdc/0x174 [fbcon]) from [] (fbcon_scroll+0x67c/0xc84 [fbcon])
(fbcon_scroll+0x67c/0xc84 [fbcon]) from [] (scrup+0xb8/0xd4)
(scrup+0xb8/0xd4) from [] (lf+0x28/0x68)
(lf+0x28/0x68) from [] (vt_console_print+0x1c4/0x30c)
(vt_console_print+0x1c4/0x30c) from [] (__call_console_drivers+0xb0/0xcc)
(__call_console_drivers+0xb0/0xcc) from [] (console_unlock+0x150/0x240)
(console_unlock+0x150/0x240) from [] (take_over_console+0x29c/0x2f4)
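To make the common call chain easier to pick out, the backtrace lines can be reduced to simple caller and callee pairs. The following is a sketch of a filter for that, assuming the "(callee) from (caller)" line format shown above; the function name call_pairs is my own.

```shell
# Reduce ARM backtrace lines of the form
#   (callee+off/len) from [..] (caller+off/len)
# to "caller -> callee" pairs. Lines which don't match are dropped.
call_pairs() {
    sed -n 's/.*(\([A-Za-z0-9_.]*\)+[^)]*) from .*(\([A-Za-z0-9_.]*\)+.*/\2 -> \1/p'
}

# On the tablet, the most frequent pairs could be listed with:
#   dmesg | call_pairs | sort | uniq -c | sort -rn
```

Pairs that appear once per stack trace (such as msm_fb_pan_idle -> wait_for_common) belong to the part that the traces have in common.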
After a bit of research, this is what I think is going on. It appears that vt_console_print() (drivers/tty/vt/vt.c) is holding a spinlock whilst it carries out console operations. These operations call into fbcon and the msm_fb framebuffer driver, and this driver then requests a sleep. This is what the "scheduling while atomic" complaint is referring to. As detailed in section 5.5.2 of this page, a kernel thread may not sleep whilst it holds a spinlock. Unfortunately, the msm_fb framebuffer driver appears to heavily rely on the ability to sleep for periods of time. The function msm_fb_pan_idle() (drivers/video/msm/msm_fb.c) appears in all of these stack traces, and is called rather a lot in the msm_fb driver. The call that this function makes to wait_for_common() (through wait_for_completion_interruptible_timeout) results in the sleep which triggers the bug check.

It turns out that this problem has been seen before in other framebuffer drivers. Unfortunately, it looks like it would be a rather complex task to work around this in the msm_fb driver. Also, it's hopefully unlikely that I'll run into the same problem when I try to get an X server to drive the framebuffer. So, whilst I can't get a framebuffer console right now, all is not lost!

Next Time...


... I'll make my first attempt at booting into an actual Debian environment.

Thursday 16 October 2014

Debian on Nexus 7: Booting a Ramdisk

Now that I have a custom kernel image, I'll look at creating a ramdisk image to accompany this. The ramdisk image shall contain a minimal Linux (not Android!) user space with a suite of basic utilities. The kernel and ramdisk images shall be combined into a boot image, in the format that the Nexus 7's bootloader expects.

The Android Factory Image Ramdisk


Booting a minimal Linux user space is an important step in this project, as it is the first attempt at running a non-Android Linux on the tablet. But, the user space won't be capable of interacting with much of the tablet's hardware, such as the screen or WiFi adapter (this will be the focus of subsequent posts!). I need to make sure that I can make the user space do something to show that it booted successfully. And, ideally, I want some kind of shell access too!

It's time to return to the boot image file from the Android KTU84P factory image, which I unpacked in my first post. From this blog post (which, incidentally, is where I found unmkbootimg), I see that the ramdisk CPIO archive can be unpacked with:
$ mkdir ramdisk
$ cd ramdisk
$ gunzip -c ../ramdisk.cpio.gz | cpio -iu
$ ls
charger          init.flo.rc        sbin
data             init.flo.usb.rc    seapp_contexts
default.prop     init.rc            sepolicy
dev              init.trace.rc      sys
file_contexts    init.usb.rc        system
fstab.flo        proc               ueventd.flo.rc
init             property_contexts  ueventd.rc
init.environ.rc  res
and repacked with:
$ find . | cpio -o -H newc | gzip > ../ramdisk.cpio.gz
What is of most interest here are the init.*.rc init scripts. Android provides its own unique init, which acts on instructions from these scripts when certain events occur. The most interesting scripts are init.flo.rc and init.flo.usb.rc (which, of course, are both available in the AOSP sources). This is an interesting segment from init.flo.rc:
# White LED
chown system system /sys/class/leds/white/device/lock
chown system system /sys/class/leds/white/brightness
chown system system /sys/class/leds/white/device/grpfreq
chown system system /sys/class/leds/white/device/grppwm
chown system system /sys/class/leds/white/device/blink
This suggests that the tablet has a white LED, which can be controlled from user space. The kernel driver which controls this hardware exposes some kind of interface through sysfs (one of the special filesystems that the kernel provides, and which is usually mounted on /sys). A quick Google yields this forum post, which suggests that the following might be enough to switch the LED on and into a blinking state:
$ echo 255 > /sys/class/leds/white/brightness
$ echo 1 > /sys/class/leds/white/device/blink
This is enough to provide an indication that the minimal user space is booting, or has booted. The shell commands above can be used in the ramdisk's own init boot script. If the LED comes on, then that boot script is getting executed!

The next interesting segments come from init.flo.usb.rc:
on init
    write /sys/class/android_usb/android0/f_rndis/manufacturer LGE
    write /sys/class/android_usb/android0/f_rndis/vendorID 18D1
    write /sys/class/android_usb/android0/f_rndis/wceis 1

# rndis
on property:sys.usb.config=rndis
    stop adbd
    write /sys/class/android_usb/android0/enable 0
    write /sys/class/android_usb/android0/idVendor 18D1
    write /sys/class/android_usb/android0/idProduct 4EE3
    write /sys/class/android_usb/android0/bDeviceClass 239
    write /sys/class/android_usb/android0/bDeviceSubClass 2
    write /sys/class/android_usb/android0/bDeviceProtocol 1
    write /sys/class/android_usb/android0/functions rndis
    write /sys/class/android_usb/android0/enable 1
    setprop sys.usb.state ${sys.usb.config}
This is again using sysfs to interact with the Android USB gadget driver, which I deliberately kept in when building my custom kernel in the previous post. This gadget driver provides the various USB personalities that an Android device may assume, including amongst others CDC-ACM, RNDIS and the USB interface used by adb. RNDIS (a proprietary Microsoft protocol) is used when an Android device enters its USB tethering mode. It provides an IP network connection between a PC and the Android device over the USB cable. These segments appear to be the steps necessary to instruct the Android USB gadget driver to activate its RNDIS mode. It would be great to utilise this to get a remote shell from the minimal user space over, say, telnet!
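Android init's write directive is more or less an echo into the given file, so these segments translate readily into shell commands. The following is a sketch of a filter which performs that translation. It only handles simple single-word values, it ignores every other directive (such as stop and setprop), and the function name init_writes_to_sh is my own.

```shell
# Convert Android init "write <path> <value>" lines (read from standard
# input) into equivalent shell "echo <value> > <path>" commands.
# Assumption: each value is a single word with no quoting.
init_writes_to_sh() {
    awk '$1 == "write" && NF == 3 { printf "echo %s > %s\n", $3, $2 }'
}

# Example usage, from within the unpacked factory ramdisk:
#   init_writes_to_sh < init.flo.usb.rc
```

The output still needs reviewing by hand, as the filter has no notion of which "on" trigger each write belongs to.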

As an aside, I did experiment with trying to get the CDC-ACM mode of the Android USB gadget driver working. CDC-ACM allows serial over USB, and might have been useful in setting up a virtual serial console (allowing boot messages and the like to be viewed). Unfortunately, despite trying all sorts of port settings on both sides, I couldn't achieve a stable connection. I could get some text through it (which turned out to be invaluable in grabbing small excerpts of the kernel log), but after a short while the serial connection would hang (and no longer be usable until the next boot). I therefore tried the RNDIS mode instead, which did work reliably.

Creating the Minimal User Space


The first thing to do is to set up the basic filesystem structure of the ramdisk. I'll do this as root, so that the user and group IDs used in the filesystem are also set to root.
$ mkdir initrd
$ sudo mkdir initrd/rootfs
$ cd initrd/rootfs
$ sudo mkdir bin dev etc lib mnt proc root sbin sys tmp var
I also need some important static device nodes:
$ cd dev
$ sudo mknod -m 622 console c 5 1
$ sudo mknod -m 666 null c 1 3
Since I want a minimal user space, the obvious choice for the user space tools is BusyBox. This comes as a single binary, and a large collection of symbolic links. The binary provides the implementation of all the user space tools, and the specific tool that gets started depends on the symbolic link that we invoke it through. BusyBox is widely used in ramdisk images and embedded systems which require a minimal user space. So, I'll fetch the source for BusyBox 1.22.1:
$ cd ~/Android/Nexus7-Debian/initrd
$ mkdir busybox
$ cd busybox
$ wget http://busybox.net/downloads/busybox-1.22.1.tar.bz2
$ tar xjf busybox-1.22.1.tar.bz2
$ cd busybox-1.22.1
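The symbolic link mechanism is easy to demonstrate with a toy example. The sketch below is not BusyBox's actual code. It is just a small shell script which, like BusyBox, decides what to do from the name it was invoked through. The applet names hello and goodbye are made up for the demonstration, and it assumes that the temporary directory allows scripts to be executed.

```shell
# Create a toy multi-call script in a temporary directory
demo_dir=$(mktemp -d)
cat > "$demo_dir/multicall" <<'EOF'
#!/bin/sh
# Pick the behaviour from the name this script was invoked as
case "$(basename "$0")" in
    hello)   echo "hello applet" ;;
    goodbye) echo "goodbye applet" ;;
    *)       echo "unknown applet: $(basename "$0")" >&2; exit 1 ;;
esac
EOF
chmod 755 "$demo_dir/multicall"

# One symbolic link per applet, all pointing at the same script
ln -s multicall "$demo_dir/hello"
ln -s multicall "$demo_dir/goodbye"

# The same file behaves differently depending on the link used
"$demo_dir/hello"
"$demo_dir/goodbye"
```

The two invocations print "hello applet" and "goodbye applet" respectively, even though both links resolve to the same file.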
As with the kernel build, this will use the ARM cross compiler toolchain that I installed in my previous post. So, I'll again need to use the ARCH, SUBARCH and CROSS_COMPILE environment variables. I'll set up the default configuration, and enter the menuconfig utility (which works in the same way as the Linux kernel's own menuconfig tool):
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make defconfig
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make menuconfig
I'll first make the following configuration changes:
CONFIG_DESKTOP=n
CONFIG_INSTALL_NO_USR=y
CONFIG_STATIC=y
CONFIG_PREFIX=/home/simon/Android/Nexus7-Debian/initrd/rootfs
CONFIG_FEATURE_SYSTEMD=n
This isn't a desktop installation, and I don't want to use /usr in my ramdisk. I also want to force BusyBox to statically link against the C library, so that I don't have to worry about having to also deploy that to the ramdisk (and as we're only using a single binary, there isn't really a need for a shared C library). /home/simon/Android/Nexus7-Debian/initrd/rootfs is the full path to the initrd/rootfs directory on my system. I don't want systemd, as this really will be a very minimal user space.

I also need to ensure that I have proper support for user accounts, and a DHCP and telnet server:
CONFIG_USE_BB_PWD_GRP=y
CONFIG_USE_BB_SHADOW=y
CONFIG_UDHCPD=y
CONFIG_TELNETD=y
CONFIG_FEATURE_TELNETD_STANDALONE=y
And to reduce the size a bit, I can remove some tools that I don't think I'll need (this is in no way an optimal list):
CONFIG_FEATURE_HWIB=n
CONFIG_RPM=n
CONFIG_RPM2CPIO=n
CONFIG_RUN_PARTS=n
CONFIG_START_STOP_DAEMON=n
CONFIG_LPD=n
CONFIG_LPR=n
CONFIG_LPQ=n
CONFIG_MAKEMIME=n
CONFIG_POPMAILDIR=n
CONFIG_REFORMIME=n
CONFIG_SENDMAIL=n
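The yes/no options listed above could also be applied without menuconfig, by editing the generated .config file directly. The following is a rough sketch of a helper for that, assuming GNU sed and the usual Kconfig file conventions (an enabled option is written as NAME=y, and a disabled one as "# NAME is not set"). The function name set_kconfig is my own, and it doesn't handle string options such as CONFIG_PREFIX.

```shell
# Set a boolean option in a Kconfig-style .config file.
# Usage: set_kconfig <file> <CONFIG_NAME> <y|n>
# Assumption: GNU sed, and the standard "# NAME is not set" convention.
set_kconfig() {
    file=$1
    name=$2
    value=$3
    if [ "$value" = "y" ]; then
        line="$name=y"
    else
        line="# $name is not set"
    fi
    if grep -q -e "^$name=" -e "^# $name is not set\$" "$file"; then
        # The option is already present, so rewrite it in place
        sed -i -e "s|^$name=.*|$line|" -e "s|^# $name is not set\$|$line|" "$file"
    else
        # The option is absent, so append it
        echo "$line" >> "$file"
    fi
}

# Example usage, from the BusyBox source directory:
#   set_kconfig .config CONFIG_STATIC y
#   set_kconfig .config CONFIG_RPM n
```

After editing the file in this way, it would be sensible to run make oldconfig (with the same ARCH, SUBARCH and CROSS_COMPILE variables) so that the build system can settle any dependent options.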
I'll now build and install BusyBox to the ramdisk filesystem:
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make -j8
$ sudo bash -c "PATH=$PATH ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make install"
There are a couple more tweaks required. Firstly, BusyBox recommends that its binary has the setuid bit set. I've done this for completeness (although it isn't critical, as I'll be running as root anyway). Also, the kernel will expect to find init as /init (rather than /sbin/init). So I need to fulfil that requirement with a symbolic link:
$ sudo chmod 4755 ~/Android/Nexus7-Debian/initrd/rootfs/bin/busybox
$ sudo ln -s sbin/init ~/Android/Nexus7-Debian/initrd/rootfs/init
The minimal user space is now populated. It now needs to be configured.

Configuring the Minimal User Space


I'll lay down a basic configuration in the ramdisk's /etc directory. The paths below are relative to the root of the ramdisk filesystem (which on my system is ~/Android/Nexus7-Debian/initrd/rootfs). All files need to be owned by root, so I'll run my editor with sudo.

This sets up the root user and group, and permits remote logins as root (obviously, security is not a concern in this image).

/etc/passwd:

root::0:0:root:/root:/bin/sh

/etc/group:

root:x:0:

/etc/securetty:

pts/0

And this configures the DHCP server to operate over the RNDIS network connection:

/etc/udhcpd.conf:

# The IP lease block
start 192.168.1.20
end 192.168.1.254

# The network interface to use
interface usb0
option subnet 255.255.255.0
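As a quick sanity check on these numbers, the sketch below works out how many leases the block provides and confirms that 192.168.1.1 (the static address that the boot script assigns to usb0) sits on the same network as the lease block. The helper names ip_to_int and same_subnet are my own inventions for illustration, and it assumes bash for the local keyword.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address into a single integer
ip_to_int() {
  local IFS=.
  set -- $1                      # split on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if two addresses share a network under the given netmask
same_subnet() {
  local a b m
  a=$(ip_to_int "$1"); b=$(ip_to_int "$2"); m=$(ip_to_int "$3")
  [ $(( a & m )) -eq $(( b & m )) ]
}

# Number of leases available between "start" and "end" (inclusive)
pool_size=$(( $(ip_to_int 192.168.1.254) - $(ip_to_int 192.168.1.20) + 1 ))
echo "lease pool: $pool_size addresses"

if same_subnet 192.168.1.1 192.168.1.20 255.255.255.0; then
  echo "server and pool share a subnet"
fi
```

That is far more leases than a single USB cable will ever need, but there's no harm in a generous block.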
Finally, I need an appropriate boot script to start up the minimal user space, and the services it requires. I'll also try to turn on the tablet's white LED light, so I can see that the boot is actually happening. We're using BusyBox's init, so the boot script shall be a simple init script:

/etc/init.d/rcS:

#!/bin/sh

# Mount core filesystems
mount -t devtmpfs none /dev
mkdir /dev/pts
mount -t devpts none /dev/pts
mount -t proc none /proc
mount -t sysfs none /sys
mount -t tmpfs none /tmp
mount -t tmpfs none /var

# Blink the white LED to show that we're booting
echo 255 > /sys/class/leds/white/brightness
echo 1 > /sys/class/leds/white/device/blink

# Setup logging
mkdir /var/log
syslogd

# Set a hostname
/bin/hostname nexus-7-initrd

# Disable the Android composite USB gadget driver
echo 0 > /sys/class/android_usb/android0/enable

# Setup USB device properties the driver should advertise
echo 18d1 > /sys/class/android_usb/android0/idVendor
echo 2d04 > /sys/class/android_usb/android0/idProduct
echo 239 > /sys/class/android_usb/android0/bDeviceClass
echo 2 > /sys/class/android_usb/android0/bDeviceSubClass
echo 1 > /sys/class/android_usb/android0/bDeviceProtocol

# Setup RNDIS-specific properties that the driver should advertise
echo LGE > /sys/class/android_usb/android0/f_rndis/manufacturer
echo 18D1 > /sys/class/android_usb/android0/f_rndis/vendorID
echo 1 > /sys/class/android_usb/android0/f_rndis/wceis

# Select the RNDIS function of the gadget driver
echo rndis > /sys/class/android_usb/android0/functions

# Re-enable the gadget driver, using the updated configuration
echo 1 > /sys/class/android_usb/android0/enable

# Wait a few seconds
sleep 5

# Bring up usb0, set a static IP and start udhcpd
/sbin/ifconfig usb0 up
/sbin/ifconfig usb0 192.168.1.1 netmask 255.255.255.0
/sbin/udhcpd /etc/udhcpd.conf

# Show a solid white LED to indicate that we're ready
echo 0 > /sys/class/leds/white/device/blink

# Run telnetd
while true
do
  /sbin/telnetd -l /bin/login -F -K
done

By mounting devtmpfs on /dev, I'm allowing the kernel to provide the appropriate device nodes. I won't get notifications of devices coming and going, and can't enforce any customisations (such as renames or particular permissions), but it's good enough. I also ensure that I mount the other special kernel filesystems, and use two further tmpfs ramdisks for /tmp and /var. The configuration of the white LED and the Android USB gadget driver uses the information from earlier in this blog post. As I'm keeping this very simple, I also start the services I require, namely the system logger (syslogd) and the DHCP server (udhcpd), and continuously run the telnet server (telnetd) to permit remote logins. The five second sleep is there to "play it safe" (ensuring that the Android USB gadget driver is definitely ready before trying to configure networking).
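The fixed five second sleep could arguably be replaced with a short polling loop that waits for the usb0 interface to show up in sysfs. I haven't tried this on the tablet, so treat it as a sketch: wait_for_path is a hypothetical helper of my own, and it assumes that /sys/class/net/usb0 only appears once the gadget driver is ready.

```shell
#!/bin/sh
# Wait until a path exists, giving up after a number of one-second tries.
# Returns 0 if the path appeared, 1 on timeout.
wait_for_path() {
  path=$1
  tries=$2
  while [ "$tries" -gt 0 ]; do
    [ -e "$path" ] && return 0
    sleep 1
    tries=$((tries - 1))
  done
  # One last look after the final sleep
  [ -e "$path" ]
}

# In rcS this would stand in for "sleep 5", for example:
#   wait_for_path /sys/class/net/usb0 10 || echo "usb0 never appeared"
```

The trade-off is a slightly longer script in exchange for a boot that continues as soon as the interface exists, rather than always waiting the full five seconds.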

The final touch is to ensure that the init script is executable:
$ sudo chmod 755 ~/Android/Nexus7-Debian/initrd/rootfs/etc/init.d/rcS
The ramdisk filesystem, with the minimal user space, is now ready. I can package this into the required gzipped CPIO archive (using the command mentioned earlier in this post):
$ cd ~/Android/Nexus7-Debian/initrd/rootfs
$ sudo bash -c "find . | cpio -o -H newc | gzip > ../initrd-rootfs.cpio.gz"

Creating the Boot Image


I now need to package up both the kernel from the previous post and the ramdisk CPIO archive into a boot image for the tablet. On my system, the kernel image is located at ~/Android/Nexus7-Debian/kernel/msm/arch/arm/boot/zImage and the ramdisk CPIO archive is at ~/Android/Nexus7-Debian/initrd/initrd-rootfs.cpio.gz.

I'll need mkbootimg, which I built in my first post. Recalling the parameters that unmkbootimg suggested when it unpacked the boot.img from the KTU84P factory image, I'll build my boot image as follows:
$ cd ~/Android/Nexus7-Debian
$ mkdir out
$ cd out
$ mkbootimg --base 0 --pagesize 2048 --kernel_offset 0x80208000 --ramdisk_offset 0x82200000 \
      --second_offset 0x81100000 --tags_offset 0x80200100 \
      --cmdline 'console=ttyHSL0,115200,n8 msm_rtb.filter=0x3F ehci-hcd.park=3' \
      --kernel ../kernel/msm/arch/arm/boot/zImage \
      --ramdisk ../initrd/initrd-rootfs.cpio.gz -o boot.img
$ ls
boot.img
It worked! Now to see if it'll actually boot.
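Beyond checking that the file exists, one cheap sanity check is to look at the first eight bytes of the image, which in the Android boot image format hold the magic string ANDROID!. The function name below is my own, and this only confirms that the header looks plausible, not that the kernel or ramdisk inside are any good.

```shell
#!/bin/sh
# Succeed if the file starts with the Android boot image magic "ANDROID!"
has_boot_magic() {
  [ "$(head -c 8 "$1" 2>/dev/null)" = 'ANDROID!' ]
}

# Example usage, run from the out directory:
#   has_boot_magic boot.img && echo "header looks like a boot image"
```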

Booting the Tablet


With the tablet off, I'll power it on with the volume down key held down to enter the bootloader menu (which depicts an Android robot lying on its back). I'll also connect the tablet to my PC with its USB cable. I also need the fastboot utility, which I've taken from an Android SDK. I'll first check that fastboot sees the tablet:
$ fastboot devices
01234567 fastboot
My tablet is already unlocked, as the bootloader menu states:
LOCK STATE - unlocked
If it weren't unlocked, I'd need to issue the fastboot command given below and follow the instructions on the tablet screen to confirm. Note that, for security reasons, unlocking the tablet will wipe all user data from it (effectively leaving you with a factory reset device).
$ fastboot oem unlock
I can now perform a one-time boot of my boot image, without flashing anything to the tablet. If anything went wrong, the tablet would be fine (powering it off and on again would return it to Android). I'll kick off the boot with:
$ fastboot boot boot.img
downloading 'boot.img'...
OKAY [  0.245s]
booting...
OKAY [  0.025s]
finished. total time: 0.270s
And now the moment of truth! After a brief pause, I see the white LED light begin to flash. Hurrah, this means that the kernel has booted and my init boot script is running! The tablet's screen still shows the bootloader menu, because my minimal user space environment has not tried to display anything (and the bootloader doesn't bother clearing the screen first). After a little while longer, the white LED stops blinking and remains constantly on. I also notice that my PC has a new network interface (usb0) with the IP address 192.168.1.20. This indicates that the RNDIS network connection is active, and that the tablet assigned my PC an IP address. I should now be able to establish a telnet connection to get a shell:
$ telnet 192.168.1.1
Trying 192.168.1.1...
Connected to 192.168.1.1.
Escape character is '^]'.

nexus-7-initrd login: root
~ #
Success!! I've booted the tablet with a minimal user space, and got access to a shell! For good measure:
~ # cat /proc/cpuinfo 
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 13.53

processor       : 1
BogoMIPS        : 13.53

processor       : 2
BogoMIPS        : 13.53

processor       : 3
BogoMIPS        : 13.53

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0x06f
CPU revision    : 0

Hardware        : QCT APQ8064 FLO
Revision        : 0000
Serial          : 0000000000000000
I'm now off to explore this a little! Obviously, once done I can power off the tablet in the usual way (reboot also works):
~ # poweroff
~ # Connection closed by foreign host.

Next Time...


... I'm going to see if I can get the screen to work!

Saturday 11 October 2014

Debian on Nexus 7: Building a Kernel

In this post, I'll look at building a custom Linux kernel image for the Nexus 7 which can be used later on to boot a ramdisk containing a minimal user space environment. I've decided to keep this as its own post, to keep it manageable. I'll cover the creation and booting of the ramdisk in my next post.

Cross Compiler Toolchain


As this is going to involve compiling C source code to run natively on the tablet, I'm going to need to get a C cross compiler toolchain. I could use Android's own ARM toolchain for this but, as it turns out, Ubuntu 14.04 also provides a suitable toolchain in the gcc-arm-linux-gnueabi package. This targets the armel Embedded Application Binary Interface (EABI), which Debian has supported from Lenny onwards. It would also be worth looking into targeting the armhf ABI in future, as this allows use of the hardware floating point unit.

To fetch the toolchain, along with a few other useful packages:
$ sudo apt-get install build-essential libncurses5-dev gcc-arm-linux-gnueabi
As the name suggests, build-essential provides some essential packages for software building. libncurses5-dev is required by the kernel's menuconfig configuration tool, to provide an ANSI-based user interface in a terminal.

Building the Kernel Image


And so now onto the main challenge of building the custom Linux kernel image. Unfortunately, this is one area where I can't avoid the Android sources. The branch of the kernel source provided by the AOSP for this tablet contains important drivers, which are not currently available in the mainline kernel. Whilst it would be entirely possible to port these drivers from the Android kernel source to a copy of the mainline source, this is not something I currently have the time or inclination to do. So, my kernel source tree shall be taken from the AOSP.

The starting point then is Google's building kernels guide. This guide lists the various kernel source trees provided by the AOSP, and how they match up to devices (listed by codename). As mentioned in the previous post, the Nexus 7 2013 WiFi-Only device is known by the codenames flo and razor. So, the guide indicates that the kernel/msm tree should be used, with the build configuration flo_defconfig. The guide also describes how the kernel version can be extracted from a kernel image file. Helpfully, my Nexus 7 displays the kernel version in its "About device" settings page. I'm using the KTU84P factory image, and the kernel version in this image is 3.4.0-g03485a6. The g03485a6 is the important bit, as this describes the git commit that the kernel source tree was checked out from when the kernel image was built. The leading g is dropped, giving the commit ID 03485a6. Looking through the web interface for the kernel/msm tree, I can see that this commit corresponds to the head of the android-msm-flo-3.4-kitkat-mr2 branch. Therefore, currently, simply checking out the head of this branch should give me what I need:
$ mkdir kernel
$ cd kernel
$ git clone https://android.googlesource.com/kernel/msm.git
$ cd msm
$ git checkout android-msm-flo-3.4-kitkat-mr2
$ ls
android           Documentation  Kbuild             mm              sound
AndroidKernel.mk  drivers        Kconfig            net             tools
arch              firmware       kernel             README          usr
block             fs             lib                REPORTING-BUGS  virt
COPYING           include        MAINTAINERS        samples
CREDITS           init           make_defconfig.sh  scripts
crypto            ipc            Makefile           security
There appears the nice and familiar Linux kernel source tree. Next, I need to select the appropriate kernel configuration for this tablet. This is achieved with:
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make flo_defconfig
The ARCH, SUBARCH and CROSS_COMPILE environment variables indicate to the kernel build system the target architecture and compiler toolchain that it should use. Instead of specifying them on each call to make, I could also simply export them in the current shell but, for clarity, I'll continue to use them with each command. flo_defconfig is the name of the build configuration (as mentioned above).
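For reference, the export alternative would look something like the sketch below. The make invocations are left as comments, since they only make sense inside the kernel source tree.

```shell
#!/bin/sh
# Set the cross-compilation variables once for this shell session
export ARCH=arm
export SUBARCH=arm
export CROSS_COMPILE=arm-linux-gnueabi-

# Child processes (such as make) now inherit the variables
sh -c 'echo "ARCH=$ARCH CROSS_COMPILE=$CROSS_COMPILE"'

# Subsequent commands could then be shortened to:
#   make flo_defconfig
#   make menuconfig
```

The catch is that the variables linger for the rest of the session, so building anything for the host machine in the same shell would need them unset first.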

Now I have a configuration which would produce a fully functional Android kernel for my tablet. But I want to customise it some more, to give me a kernel more suited for non-Android use. So, I'll enter the kernel's menuconfig configuration tool and make some tweaks:
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make menuconfig
In the current configuration, certain important features have been disabled. I enabled the configuration options with the parameter names listed below:
CONFIG_DEVTMPFS=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
Typing / at any time in menuconfig will allow you to search for a configuration option by its parameter name. Once an option has been located, the spacebar toggles between unselected (< >), built-in (<*>) or module (<M>). In the notation used above (which is that of the configuration file itself), unselected is n, built-in is y and module is m. Sometimes, built-in or module will be unavailable for an option, depending on what the rest of the configuration looks like.
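One wrinkle worth knowing about when checking the result by hand: in the saved .config file, an unselected option is normally recorded as a comment of the form "# CONFIG_FOO is not set" rather than as CONFIG_FOO=n. The sketch below looks up an option's state with that in mind. config_state is a helper name of my own, and the sample file (including CONFIG_EXAMPLE_DRIVER) is made up purely for illustration.

```shell
#!/bin/sh
# Print the state of an option (y, m, n or "absent") from a Kconfig .config file
config_state() {
  file=$1
  opt=$2
  if value=$(grep "^${opt}=" "$file"); then
    echo "${value#*=}"
  elif grep -q "^# ${opt} is not set\$" "$file"; then
    echo n
  else
    echo absent
  fi
}

# Made-up sample file standing in for the kernel's .config
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONFIG_DEVTMPFS=y
CONFIG_MODULES=y
# CONFIG_EXAMPLE_DRIVER is not set
EOF

config_state "$sample" CONFIG_DEVTMPFS        # prints y
config_state "$sample" CONFIG_EXAMPLE_DRIVER  # prints n
```

Run against the real .config in the kernel source tree, this gives a quick way to confirm that the changes made in menuconfig were actually saved.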

There are also a few options I don't need for now, such as SELinux:
CONFIG_SECURITY_SELINUX=n
CONFIG_EXT4_FS_SECURITY=n
It is also possible to remove quite a few Android-specific drivers that make a Linux kernel capable of supporting the Android user space. This includes things such as Android's Binder IPC driver, and the Anonymous Shared Memory (ashmem) subsystem. Unfortunately, disabling the Android-specific drivers did cause a couple of build errors:
  • The function kgsl_get_vma_from_start_addr() in drivers/gpu/msm/kgsl.c was no longer used because CONFIG_ASHMEM was unselected, leading to a fatal warning. Enclosing the function in a #ifdef CONFIG_ASHMEM...#endif block appeared to fix this issue.
  • The file drivers/staging/prima/CORE/WDI/TRP/CTS/src/wlan_qct_wdi_cts.c erroneously switched from including mach/msm_snd.h to including msm_snd.h because CONFIG_ANDROID was unselected. Simply removing this conditional behaviour appeared to fix this issue.
After fixing the source files above, these Android-specific drivers could be disabled by setting:
CONFIG_ANDROID=n
It probably wouldn't have done any harm to leave in these Android-specific drivers; I'm just curious to see if I can boot the kernel without them. However, one Android driver that I do not want to disable is the Android USB gadget driver (CONFIG_USB_G_ANDROID). As I'll describe later on, this will be fundamental in establishing basic communication (as in access to a shell) with the minimal user space environment.

Now that I've finished making my configuration changes, I'll exit menuconfig (saving the changes) and kick off the kernel build with:
$ ARCH=arm SUBARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make -j8
The -j argument governs the number of tasks that the build will perform in parallel. I tend to set this to twice the number of (real) CPU cores on my system (I want the build to finish ASAP!). My machine has an Intel Core i7-4770 processor which is quad-core, so I use -j8. The build still takes a good few minutes, however. The build then finishes with the usual notification:
  Kernel: arch/arm/boot/zImage is ready
And sure enough:
$ file arch/arm/boot/zImage
arch/arm/boot/zImage: Linux kernel ARM boot executable zImage (little-endian)
I have a kernel image! I'm now ready to create a ramdisk image to accompany it.

Next Time...


... I will create the ramdisk image and boot the tablet into a minimal Linux userspace environment.