The Cybernetic Penguin

Wednesday, October 10, 2018

A milestone for Enki

So, I can now write an Enki source file like this:

import sys
sys:write("Hello world!\n")
return 0

and compile like so:

./enki inanna_amd64_target.ini test.e
jo@isis:~/Enki$ md5sum a.enk
339294406a9953c910984b87a8bd2334  a.enk

and then run this one single binary, which is 100% native statically compiled machine code, on all three x86 OSes Enki currently supports:

jo@isis:~/Enki$ ./a.enk
Hello world!

D:\git\Enki>a.enk
Hello world!

Joels~Mac:Enki jo$ ./a.enk 
Hello world!

    What arcane wizardry is this?

    As far as I'm aware, no other language out there can do that. Either you have a binary compiled for a specific OS, or something like a Java jar with bytecode which a compiler on the target system will turn into code for a specific OS, or an interpreter on the target system. That's because each OS has its own standard library with its own ABI, its own calling convention (though MacOS and Linux use the same calling convention on x86-64; Windows, Microsoft being Microsoft, decided to do their own thing) and its own executable file format.

    So, how is this accomplished?

    I defined my own executable file format, which I called Inanna. There's a dynamic linker, written in Enki, which has a specific build for each OS. This knows how to load and relocate Inanna binaries and also contains the Enki standard library (such as it is! it's very much a temporary placeholder), which is cross-platform - sys:write is binary compatible across different OSes, but within the loader boils down to making a write syscall on Linux and MacOS or calling WriteFile on Windows. This is possible because I also define my own calling convention for Enki (which I would need to do anyway since I want language support for continuations rather than using a conventional stack). Since the loader is just the equivalent of ld-linux plus libc - no compiler, JIT or otherwise; the code is already ready to go - it's a lot more lightweight and faster to start than something like a JVM.
The executable format is defined as starting with '#!/usr/bin/env enkiloader' which means on Unix it transparently invokes the Enki dynamic linker; on Windows I just associate .enk files with it. In either case enkiloader gets invoked with the executable as argv[1] and does its thing.
    An executable file is actually less arcane than you might think - it really just consists of a series of records saying 'this bit of this file is code so you should mmap it as readable and executable, it's assuming it'll be loaded at this address but here's the bits you need to change if it gets put elsewhere, this bit of the file is constant data so map it read only' etc etc. By tagging the records with an architecture too I think it should be easy enough to make a cross-platform executable which contains both ARM and x86-64 programs and is able to share segments between the two where appropriate (strings for example come to mind). No one else is doing this as far as I'm aware; the closest are Mach-O fat binaries which are really two full executables smooshed together in an archive, or I guess maybe the MacOS app resource system, though that does mean your app is a directory, not just a single file.
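
To make the 'series of records' idea concrete, here is a minimal Python sketch. It is not the Inanna format or the Enki loader: the Segment fields, the "any" architecture tag and the single load offset applied to everything are all assumptions invented for illustration. It picks the segments that match the running architecture (plus shared ones) and patches each recorded 64-bit address by however far the program was moved.

```python
import struct
from dataclasses import dataclass, field

MASK64 = 0xFFFFFFFFFFFFFFFF

@dataclass
class Segment:
    arch: str             # hypothetical tag: "amd64", "arm64", or "any" for shared data
    kind: str             # "code" or "rodata"
    preferred_base: int   # address the segment assumes it will be loaded at
    data: bytes
    relocs: list = field(default_factory=list)  # offsets of 64-bit absolute addresses

def select_and_relocate(segments, arch, delta):
    """Keep segments for this architecture and shift embedded addresses by delta."""
    loaded = []
    for seg in segments:
        if seg.arch not in (arch, "any"):
            continue  # record belongs to another architecture
        image = bytearray(seg.data)
        for off in seg.relocs:
            (addr,) = struct.unpack_from("<Q", image, off)
            struct.pack_into("<Q", image, off, (addr + delta) & MASK64)
        loaded.append((seg.kind, seg.preferred_base + delta, bytes(image)))
    return loaded

# Assumed example: an x86-64 'movabs rax, imm64' pointing at a string that
# both architectures could share.
demo = [
    Segment("amd64", "code", 0x400000, b"\x48\xb8" + struct.pack("<Q", 0x500000), [2]),
    Segment("arm64", "code", 0x400000, b"\x00" * 8),
    Segment("any", "rodata", 0x500000, b"Hello world!\n"),
]
loaded = select_and_relocate(demo, "amd64", 0x1000)
```

A real loader would then mmap each image with the right permissions (read/execute for code, read-only for constant data); that step is left out here.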

    I guess the next step is to make this work on ARM - there are likely to be 32 vs 64 bit issues and the relocations will be different (on x86 we have the luxury of inline 64-bit constants, on ARM not so much). Plus, while Enki supports generic functions a la CLOS, the linker doesn't yet (and supporting that is going to be complicated given a program should be able to add specialisations to generic functions declared in the library). But I thought low level nerds might find this sort of thing interesting. :)

Wednesday, March 29, 2017

Enki - my pet compiler project

So, for the last few years I've worked a bit at a time on a pet project - a compiler for a language of my own design, Enki, to Linux/MacOS/Windows/Android native code. Enki syntactically looks like Python with a dash of Pascal mixed in and is a statically-typed, statically-compiled language. I'm intending to explore things like cross-platform native-code binaries, extensible OO and just generally cool and fun programming language features.

Here's interview classic FizzBuzz:

Uint64 max = 100
Uint64 count = 0
Byte[20] number
while count < max
    if (count % 15) == 0
        write("Fizzbuzz\n")
    elif (count % 3) == 0
        write("Fizz\n")
    elif (count % 5) == 0
        write("Buzz\n")
    else
        num_to_str(count, @number)
        write(@number)
        write("\n")
    count = count + 1

Also supported are nested functions, Python-style generators and CLOS-style multimethods/OO.

GitHub link is here: https://github.com/jotheberlock/Enki

Links to HTML documentation (note the internal links won't work, unfortunately):

Overview
FAQ
What the compiler does when

Wednesday, February 1, 2012

Wayland, Android and desktop Linux - a marriage made in heaven?

I noticed today that Wayland is rapidly reaching 1.0. It's no secret that many people (Canonical for one) see this as being the desktop graphics solution for the future...but I got to thinking; what if it could help us get proper Linux like Ubuntu onto Android devices too?

See, Android has its own userland, wildly different from a normal Linux system. I'd personally much rather be able to have the normal Posix/GNU/Linux API and utilities available to me, along with all that lovely desktop open-source software (which is why I was rather upset when Nokia canned MeeGo, http://en.wikipedia.org/wiki/MeeGo). The kernel is GPLed, so we generally have the ability enforced by law to build Linux kernels for these devices, and for most devices it's possible to take the ICS source and combine it with the binary blobs from the OEM and install your own version of Android, so you'd think bringing up something like Ubuntu on the phone or tablet of your choice would be simple enough. Sadly this turns out not to be the case, and the main problem is the graphics drivers.

Essentially all Android devices have closed-source graphics hardware. The support for it is provided in the form of a binary userspace library that provides OpenGL ES and is linked with Bionic, Android's stripped-down libc, and not glibc. That means for Linux programs to use it they have to use Android's userspace; so getting X11 running on an Android device would mean a) porting it to Bionic (and it would not surprise me if Bionic is missing some Posix stuff X would like) and b) using OpenGL ES as its driver backend, which to my knowledge no one has done. So while you do see Ubuntu running on Android tablets, the usual method is to run Ubuntu in a chroot with an X11 server running as an unaccelerated VNC server, then running an Android VNC viewer to actually see the desktop. No hardware acceleration whatsoever and an extra trip over the network to boot.

Ugh. This is bad enough in performance with an 800x480 phone. With a tablet it's untenable and will get worse when 'retina display' tablets become a thing. You pretty much need at least accelerated compositing and bitblt (for scrolling), which is why OEMs are required to provide hardware acceleration for 2d in ICS if they want Google certification.

However! Wayland is built to use OpenGL ES already, and it's small by design. So, what if we were to port Wayland to Bionic/Android and SurfaceFlinger (while keeping the Wayland client libraries on glibc)? The common cases of 'composite windows' and 'move windows around' become hardware accelerated, as they should be. You can build support for this into something like CyanogenMod and every CyanogenMod device can suddenly run Ubuntu, Debian and co. as a first class citizen. Or, you can not use the Android stuff and turn your Galaxy Tab or Nook Color into a proper Ubuntu tablet with just enough Android userland to run Wayland and deal with wifi, talking to the cellphone or anything else that needs a binary driver. Seems like a win-win to me.

Sunday, January 29, 2012

Taking another look at LLDB

I decided to take a small break from working on my ereader and see how lldb (the LLVM project's debugger) is coming along for Linux. Unfortunately, it doesn't seem to have got very far since the last time I looked at it, when I provided a patch to fix problems with ptrace() -

http://lists.cs.uiuc.edu/pipermail/lldb-dev/2011-October/000686.html
http://lists.cs.uiuc.edu/pipermail/lldb-dev/2011-October/000690.html

That fix is in, but there are various small problems in the source (mostly missed header includes) that prevent compilation, so I've resubmitted a patch for that.
There's a FreeBSD/Linux fork that some people are working on, but it seems the same problems apply there too, so I decided to supply compile patches for both -

http://lists.cs.uiuc.edu/pipermail/lldb-dev/2012-January/000783.html

The FreeBSD fork does at least provide a valid stack trace of sorts when debugging a random little test application that just loops and printf's 'Hello world', but it doesn't seem to be in main(), and attaching to an already running process straight-up segfaults. I guess I'll poke around a bit in there and see what I can come up with, because the project as a whole is pretty exciting. It's a shame it's not being worked on more outside the Mac OS X world.

Wednesday, January 18, 2012

Calliope hits alpha

I've gotten to the stage where my ereader has the basic functionality I wanted - you can read books with it, you can correct misspelled words on the fly, and the corrections are persistent. So I put it up on the Android market in case anyone wants to play with it -

https://market.android.com/details?id=org.kde.necessitas.example.calliope

It's glitchy, partly because of some bugs in my code, partly because Qt for Android is in alpha and has its own quirks (for example, it's a known bug that the onscreen keyboard defaults to upper case for some reason, and there seems to be an issue with settings not being saved; this'll no doubt be corrected by the next Qt release). I've also started working on making the UI work differently for Android versus the desktop; the button bar shows up by default above the page for the desktop, and pops up with the menu button on Android.

Still, bugs and all, you can read with it, and I've added some nice-to-have as opposed to essential features as well - the reader uses a filesystem watcher on the directories in which it searches for books, so drop a new one in and it'll show right up on the menu, and it interacts with Windows/X11 session management so you can log off and on again and not lose your place.

The coolest thing I've added, though, is the filter manager. Basically, the reader works with a stream of elements parsed out of the HTML - some images, some pagebreaks, but mostly paragraphs of text. Filters operate at various points in the paragraph's transition from 'list of words with attributes (e.g. bold, italic)' to 'group of words at given x/y coordinates in a bounding box', and also when someone clicks/touches the screen inside a paragraph.

The spelling-correction filter is run before the text layout process; it has a map of corrections of the form 'the third word of the paragraph, which is coler, should actually be colour' and makes the appropriate alterations in the list.
There's also a dictionary-lookup filter which is invoked (if set as the active touch filter) when a word is pressed, after the text has been laid out. At some point I'll likely also add a filter that operates after text layout but before rendering to justify the text (such that it lines up on both left and right margins as opposed to the default ragged right alignment).
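
As a rough illustration of the spelling-correction filter (a Python sketch, not Calliope's C++/Qt code; the function name and the shape of the corrections map are assumptions), corrections can be keyed on paragraph and word position, and only applied when the word found there is still the misspelling that was recorded:

```python
def apply_corrections(paragraphs, corrections):
    """Return a corrected copy of paragraphs.

    paragraphs:  list of paragraphs, each a list of words
    corrections: {(paragraph_index, word_index): (wrong, right)}  (assumed layout)
    """
    fixed = [list(words) for words in paragraphs]
    for (p, w), (wrong, right) in corrections.items():
        # Skip stale entries rather than corrupting text that no longer matches.
        if p < len(fixed) and w < len(fixed[p]) and fixed[p][w] == wrong:
            fixed[p][w] = right
    return fixed

book = [["The", "sky", "coler", "was", "grey"], ["It", "rained"]]
overlay = {(0, 2): ("coler", "colour"), (1, 5): ("x", "y")}
corrected = apply_corrections(book, overlay)
```

Because the input is copied rather than edited in place, the book text itself stays untouched and the overlay can be reapplied on every load.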

I've not done much with the dictionary yet, and the API needs work (it should be asynchronous for a start). That done, it would be easy enough to, for example, look a word up on Wikipedia from within the application given Qt's HTTP support. Right now, though, the only dictionary is for Latin (which I'm in the process of learning), which in itself took a bit of work. Latin is a highly inflected language - that is, where we in English add words to change the meaning of a word, it tends to use different endings instead. 'I have' is habeo, 'we have' is habemus, 'we were having' is habebamus, and so forth, so it's not a straightforward thing for a computer to go from a random Latin word to the canonical form in which it appears in a conventional dictionary.

There is a program that does know how to do this, though, using some very clever algorithms and knowledge of Latin grammar; it's known as Whitaker's Words, and it is open source. Unfortunately, its author made the somewhat...unusual choice of Ada as its implementation language; unsurprisingly, an Ada compiler is not part of the Android NDK. The 'nice' way to bring that capability to Android would be to reimplement the program in C++, but that would involve quite a bit of work, and this is more for my own use than anything else, so I took a quick and dirty route.

Available for Whitaker's Words is a list of every word understood by the program in all its forms (so it would include habeo, habemus, habebamus etc). I wrote a little program which reads that list, invokes Whitaker's Words on each one, and writes the output into a file, writing an index into another file of the form source word, current position in the output file, length of the string from Words. This takes quite a while to run (about as long as ICS takes to build on my machine) and generates about a 250 meg output file and 20 meg index. My dictionary loads the index the first time a word is queried and uses that to seek into the output file and pull out the word's definition (I originally tried simply using Qt's built-in IO facilities to write out the QHash into a binary index file, but that actually ended up producing a bigger file for some reason).
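
The index scheme can be sketched in a few lines of Python. This is an illustration only: the real utility is the one linked below, and here the tab-separated index layout, the file names and the sample definitions are assumptions (the definitions are passed in directly rather than produced by running Words).

```python
import os
import tempfile

def build_dictionary(entries, out_path, idx_path):
    """Write each definition to out_path and 'word, offset, length' to idx_path."""
    with open(out_path, "wb") as out, open(idx_path, "w", encoding="utf-8") as idx:
        for word, definition in entries:
            data = definition.encode("utf-8")
            idx.write(f"{word}\t{out.tell()}\t{len(data)}\n")
            out.write(data)

def load_index(idx_path):
    """Read the index into a dict of word -> (offset, length)."""
    index = {}
    with open(idx_path, encoding="utf-8") as idx:
        for line in idx:
            word, offset, length = line.rstrip("\n").split("\t")
            index[word] = (int(offset), int(length))
    return index

def lookup(word, index, out_path):
    """Seek straight to the stored definition; None if the word is unknown."""
    if word not in index:
        return None
    offset, length = index[word]
    with open(out_path, "rb") as out:
        out.seek(offset)
        return out.read(length).decode("utf-8")

# Tiny demonstration with made-up entries in a temporary directory.
workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "words.out")
idx_path = os.path.join(workdir, "words.idx")
build_dictionary([("habeo", "I have"), ("habemus", "we have")], out_path, idx_path)
index = load_index(idx_path)
```

Only the small index has to live in memory; the large output file is read a few bytes at a time, which matters on a phone or tablet.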

On the off chance this would be useful for someone else in the same situation, the utility is at

https://github.com/jotheberlock/whitakerwords

and the source for the dictionary is whitaker.cpp in Calliope's source.

Incidentally, some of the books I've been working with make me really sympathetic towards the developers of browsers (an ebook reader is, after all, functionally a simplified HTML renderer with some special needs). For the most part Calliope basically displays anything in <p> tags as paragraphs, but one book in particular was a long stream of text, not in any form of block tag, broken only with <br>'s at the end of each paragraph. From what I can find on the web this was all the rage back in about 1992. I put something in to deal with that case, but there's at least one other book out there where most of the text doesn't show up; I'm investigating why.

Saturday, December 24, 2011

How to get Android Ice Cream Sandwich working on a Pandaboard ES

I have, after some messing about, compiled Android from the AOSP source release and got it running on my Pandaboard ES. Here are some random hints for anyone else trying to do the same -

Most importantly, make sure you're pulling the latest master branch from AOSP. A fix went in on, oh, Tuesday or so that actually made the bootloader and fastboot work on the ES board. Without this, you are doomed. Thread here -

http://groups.google.com/group/android-building/browse_thread/thread/9d784fc702451c9f?pli=1

With this, the instructions in device/ti/panda/README more or less work out of the box (exception: fastboot flash userdata and fastboot flashall both need -p panda specified). Without it you get stuck at the Waiting for Omap43xx... step because fastboot doesn't recognise the ID of the ES board, only the original Pandaboard.

If using HDMI, make sure to use the connector on the corner of the board - I saw SGX crashes trying to use the other.

The engineering build is sloooooow and would be a pain to develop with, I think; the user-with-root build is still a bit sluggish at first before code gets JITed but is usable.

Don't be tempted to try and format the SD card yourself, as I was before finding out how to get fastboot working; there's a script floating around out there called omap3-mkcard.sh, but for me it didn't produce anything I could boot. It has a bug in it, too - there's a section that goes like this:

if [ -x `which kpartx` ]; then
kpartx -a ${DRIVE}
fi

If you actually have kpartx installed then this will 'hold onto' the partitions on your SD card, causing the filesystem creation code later to fail, every time, because the device is busy. Not sure how that got through testing, but it can be replaced with

sfdisk -R $DRIVE

Also, if you're adding udev rules so that you can use fastboot and adb as non-root, these are the entries needed for the ES (different from the plain Pandaboard) -

# fastboot protocol on panda (PandaBoard ES)
SUBSYSTEM=="usb", ATTR{idVendor}=="0451", ATTR{idProduct}=="d010", MODE="0600", OWNER="<user>"
# adb protocol on panda (PandaBoard ES)
SUBSYSTEM=="usb", ATTR{idVendor}=="0451", ATTR{idProduct}=="d101", MODE="0600", OWNER="<user>"

where <user> should of course be replaced with your login. On my (Kubuntu 11.10) system these appear to live in /lib/udev/rules.d and not /etc/udev.

Hope this helps someone! I'm pretty happy with the board now it's running; I have accelerated 3d (make sure to get the latest 4.0.3 binary driver drop from the Nexus drivers page) and working wifi. No audio, but I can live with that for now. I was also pleasantly surprised by the build time - Google recommends an absolute monster of a build machine, but my fairly middle of the road triple-core machine compiled ICS from scratch with -j 6 in about two hours.

Thursday, September 29, 2011

Digging up some old code

A year or so ago I started writing my own little debugger, just basically so I could know a bit more about what's going on under the hood in such programs. It was a natural follow-on from writing my own compiler/linker, which I sort of stopped working on after I satisfied myself I knew how to write such a thing (and came across LLVM which does that sort of thing about a thousand times better than I was going to be able to do on my own). Life events got in the way of my doing too much with it at the time, but here's the code anyway for those who might be curious -

http://hu.gs/~emily/debugtoy.tar.gz

It uses the Linux ptrace() system call and my own code to parse ELF binaries and their DWARF debugging information; there are libraries out there that can help you do the latter, but as I say I was doing this to learn my way around the formats myself. It can attach to processes, halt/single-step them, display/edit registers and memory, figure out the name of the function you're in, and identify the line in the source code that corresponds with the current instruction pointer.

Thursday, August 18, 2011

Calliope, a Mobipocket (Kindle) compatible ereader

I've lately been working on writing an ereader in Qt, with Android in mind. I let the project sit for a few weeks because of a combination of personal life events and bugs in the (very much still in development) Android port of Qt, but a couple of days ago I achieved the milestone of being able to read a commercially-released book on my Galaxy Tab.

Why work on an ereader, one might well ask? Amazon already have one for Android, and it's quite nice. Well, one thing that always annoys me reading ebooks is the somewhat...variable quality of the spellchecking; it seems quite common for them to be poorly spelled compared to printed books. This is a hundredfold more true of public-domain books from places like Project Gutenberg that have been OCR-scanned and tossed up on the site. How nice it would be, then, to be able to edit the book in situ and correct such errors. I plan to allow this, not by physically editing the ebook file (I'm wary of the legal implications of modifying a copyrighted work) but by storing an overlay file which in essence says "word 20 of paragraph 32 should be 'mistake' and not 'mistaek'". As a side benefit, I can also provide a filter ability which can automatically convert all spellings of a word from one form to another - "any time you see 'color', substitute 'colour'" - it always feels a little weird to me, as a Briton resident in the US, to buy a book by a British author whose spelling has been Americanised. I'm not yet at the point of doing any of this; right now I'm happy simply to be able to read the text as written, but I figured I might as well document what I've done to get to that point.

The reader, like the Kindle, reads Mobipocket files. These were originally designed to be read on Palm OS devices, and are stored in a Palm-style database file. These consist of some headers and a series of blocks of data, each of which can be individually compressed with a form of Lempel-Ziv encoding (similar to what gzip uses, for instance). Mobipocket books consist of header information including optional tags for things like ISBN numbers, followed by the blocks that comprise the actual book (compressed, each block limited to 4k compressed size), followed by any images, each in its own block (required to be GIF format). The book text itself is HTML 3.2 with some customisations for purposes such as referring to images. The first image in the book is by convention the book cover. Books bought from Amazon are encrypted; while that encryption has been cracked and I could probably add support for reading them, I haven't because I'm not sure of the legal implications, even though it would only be used for books I've legitimately purchased.
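
For the curious, the Lempel-Ziv variant in question is usually referred to as PalmDoc compression. The Python sketch below follows the byte layout as it is commonly documented by the ebook community; it is not lifted from Calliope's source, and the function name is made up, so treat it as an illustration rather than a reference decoder.

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decompress one PalmDoc-compressed block (layout as commonly documented)."""
    out = bytearray()
    i = 0
    while i < len(data):
        c = data[i]
        i += 1
        if c == 0x00 or 0x09 <= c <= 0x7F:
            out.append(c)                      # plain literal byte
        elif c <= 0x08:
            out += data[i:i + c]               # next c bytes are literals
            i += c
        elif c <= 0xBF:
            if i >= len(data):
                raise ValueError("truncated back-reference")
            pair = (c << 8) | data[i]
            i += 1
            distance = (pair >> 3) & 0x7FF     # 11 bits: how far back to look
            length = (pair & 0x07) + 3         # 3 bits: copy 3 to 10 bytes
            if distance == 0 or distance > len(out):
                raise ValueError("back-reference outside decoded data")
            for _ in range(length):            # byte by byte: ranges may overlap
                out.append(out[-distance])
        else:
            out.append(0x20)                   # 0xC0-0xFF: a space plus one character
            out.append(c ^ 0x80)
    return bytes(out)
```

Copying the back-reference one byte at a time is deliberate: a distance shorter than the length is legal and simply repeats the most recent bytes.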

I've written a 'bookshelf' that looks in a standard directory (currently fixed by platform; ~/Documents on desktop Linux, /sdcard/kindle on Android), sniffs all files in it to see if they're ebooks it can read, and displays those books in a list. Choosing one fires up a Page widget which actually displays (currently) the img and p tags from the book.

Doing this is more involved than one might think. Firstly there's the challenge of turning those compressed buckets into an uncompressed text stream. I accomplish this by writing a custom QIODevice (the underlying abstraction for Qt's files, network sockets etc) which wraps the document text and uncompresses it on the fly. Since it's a QIODevice I can also easily write the uncompressed text to disc. The Mobipocket header includes information on the character format used in the book (generally either Latin-1 or UTF-8) so I can construct a QTextStream that will do the appropriate conversions.

The book being HTML 3.2, Qt's inbuilt XML parser naturally chokes on it (I really can't blame it), so I had to write my own simple HTML parser. I've written it in more of a SAX than a DOM style (i.e. it parses the HTML stream incrementally rather than all at once) because by the nature of the thing an ereader is only concerned with displaying a small part of the document at a time; parsing the whole book on opening it would both take time at startup and needlessly consume RAM.
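
Here is what the incremental (SAX-style) approach looks like in miniature, using Python's standard html.parser rather than the hand-written C++ parser described above. The class and function names are invented for this sketch, and it ignores attributes such as bold and italic. The point is that elements come out as chunks of the book go in, so nothing forces the whole document into memory.

```python
from html.parser import HTMLParser

class ParagraphStream(HTMLParser):
    """Collects ('p', words) and ('img', src) elements as HTML is fed in."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.ready = []      # finished elements waiting to be consumed
        self._text = None    # text pieces of the paragraph being built, if any

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()    # old HTML often omits </p>
            self._text = []
        elif tag == "img":
            self._flush()
            self.ready.append(("img", dict(attrs).get("src")))

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)  # a word may be split across chunks

    def _flush(self):
        if self._text is not None:
            words = "".join(self._text).split()
            if words:
                self.ready.append(("p", words))
        self._text = None

def elements(chunks):
    """Yield elements lazily while feeding the parser one chunk at a time."""
    parser = ParagraphStream()
    for chunk in chunks:
        parser.feed(chunk)
        while parser.ready:
            yield parser.ready.pop(0)
    parser.close()
    parser._flush()
    yield from parser.ready

sample = ["<p>Hel", "lo <b>wor", "ld</b></p><img src='c.gif'><p>Bye"]
parsed = list(elements(sample))
```

Text is joined before being split into words because a chunk boundary can fall in the middle of a word.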

The stream is then split into Elements, each representing a block to be displayed on the screen; currently image and paragraph elements, the latter consisting of string fragments, which are collections of words with the same attributes (bold, italic, etc). Each element can report a size to the page layout algorithm and render itself; paragraph elements are careful not to render text where it would be partially cut off by the bottom of the page. Currently, the page renders as many elements as it can before hitting the bottom of the page; hitting the right of the window to go to the next page moves them 'up' by the height of the page then resumes rendering beginning with the element that was at the end of the last page.

Unfortunately, since I don't want to keep all elements parsed forever, I don't really have a good way to go back at present; an ereader that only goes forwards is a bit limited! I'm trying to figure out both how to handle this and how best to store position within the book, bearing in mind these problems -

- The page can be resized both during reading and between runs of the application (including on Android; think of rotating the tablet, for instance). These will cause paragraphs to reflow, taking up more or fewer lines on the page.

- There are ways to put pagebreaks into the book, which I do not yet support but will need to in order to handle, for example, the end of chapters.

I think the way I'm going to approach it is by viewing the book as a very very tall virtual screen of fixed width; so elements will have a virtual y coordinate starting at 0 and going down all the way to the end of the book, divided into pages every height-of-window pixels. The page itself acts as a window onto this virtual screen; going backwards will involve putting the page back by its height, then reparsing the book from the beginning until reaching elements that are visible on the page. I'll have to evaluate this for speed; it may be worth caching the previous page or two's elements since generally readers don't go much further back than that.

As for resizing, as well as storing the virtual y coordinate of the page, I'll keep track of the topmost element on the page (and in the case of a paragraph the word within it). Resizing involves repaginating and parsing from the beginning until that element and word appear on a page, then displaying that page.
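
A back-of-the-envelope Python sketch of the virtual-screen idea, under simplifying assumptions: element heights are already known, elements stack with no gaps, and pagebreaks are ignored. The function names are invented for the example.

```python
def layout(heights):
    """Virtual y coordinate of the top of each element, stacked from 0."""
    tops, y = [], 0
    for h in heights:
        tops.append(y)
        y += h
    return tops

def page_of(element_index, tops, page_height):
    """Page on which the given element starts."""
    return tops[element_index] // page_height

def visible_on_page(page, tops, heights, page_height):
    """Indices of elements that overlap the window for this page."""
    top, bottom = page * page_height, (page + 1) * page_height
    return [i for i, (t, h) in enumerate(zip(tops, heights))
            if t < bottom and t + h > top]

# Portrait: four elements on 500-pixel pages.
heights = [100, 300, 250, 400]
tops = layout(heights)
anchor = 3                       # topmost element the reader was looking at

# After rotating, the text reflows: new heights, shorter pages.
new_heights = [150, 450, 375, 600]
new_tops = layout(new_heights)
new_page = page_of(anchor, new_tops, 400)
```

Remembering the anchor element rather than a page number is what lets the reader land in the same place after a reflow.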

Source for the app is at https://github.com/jotheberlock/reader

Saturday, April 16, 2011

Baby steps in electronics

My first self-designed circuit - though it's a bit of a stretch to call it that. I've got 12V coming into the near side of the breadboard from a NiMH battery (and I am a bad engineer for using the same colour wire for live and ground there, oops). That's being shunted through an adjustable switching regulator to provide 7.2V on the power rails on the far side.

The reason for this is I've gotten the superstructure off my new RC tank, removed the built-in RC controller and am getting close to testing it out with the motor controllers I used in the old tank. I need 12V because basically all the electronics I intend to put on the rover (Fit-PC2, Kinect, Arbotix and AX-12 servos) want that; however, unlike my old tank which ran on 9.6V and could tolerate 12V, this one is based on 7.2V.

I could shunt 12V into the motor controllers (they'll take up to about that much) and rely on PWM to bring the effective voltage down to something the motor can handle without burning up - but I'm a programmer, and therefore I trust hardware more than my software, during development at least. This way there's no way I can mess up and explode my motors, and as a side effect I imagine the regulator will prevent too much EMI getting back and messing with the PC.
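
For reference, the software-only alternative boils down to a little arithmetic. The Python sketch below assumes the simple model that a motor sees roughly supply voltage times duty cycle on average, and an 8-bit PWM register (0 to 255); both are assumptions for illustration, not properties of the actual motor controllers.

```python
def pwm_limit(supply_mv, motor_mv, resolution=255):
    """Largest PWM value whose average output stays within the motor's rating."""
    if motor_mv >= supply_mv:
        return resolution
    return resolution * motor_mv // supply_mv   # integer maths, rounds down

def clamp_pwm(requested, supply_mv=12000, motor_mv=7200, resolution=255):
    """Clamp a requested PWM value to the safe range for this motor."""
    return max(0, min(requested, pwm_limit(supply_mv, motor_mv, resolution)))

limit = pwm_limit(12000, 7200)   # 7.2V motor on a 12V supply: 60% duty cycle
```

The catch is exactly the one described above: a single code path that forgets to clamp would put the full 12V across the motors, whereas the regulator cannot be bypassed by a bug.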

Saturday, April 9, 2011

Fitting a touchscreen to the EeePC 701

I have an old netbook I acquired second-hand as a robot controller a couple of years ago - it's an Eee PC 701. I thought it might be a cool hack to fit a touchscreen to it, so I ordered a cheap resistive one from DealExtreme. A week and a half long boat trip from China later, it arrived last week.

Picture of what I got

It consists of a small USB hub with display controller (bottom of the picture) and what amounts to a glass plate with a ribbon connector.

Installing it turned out not to be all that much hassle - you have to remove six screws from the front of the LCD casing and pop the bezel off. This takes a little bit of effort with some sort of flat blunt tool, in my case a dinner knife -

Case reveals its secrets

then I slipped the touch panel in there and used sticky tape to secure it to the screen. It's hard to get good pictures, unfortunately, since I didn't take the bezel completely off.

Seated touchpanel

I unplugged the camera up at the top (it's horrible and I never use it), put that connector into the hub/controller's in port, put the ribbon connector into the display's connector, and that was it (let a veil be discreetly drawn over my first attempt, where I failed to distinguish 'USB IN' and 'USB 3' on the circuit board). The controller is tucked under the left speaker grille; it causes a slight bulge when the bezel is screwed back on, but I gather this is normal.

After some unsuccessful attempts with the Kubuntu 10.04 I had installed to get it to properly recognise the touchscreen, I decided to embrace the bleeding edge of technology and install Kubuntu 11.04 Beta - the world of touchscreen input advances swiftly in Linux these days. Unfortunately my display connector was mis-soldered (not an uncommon problem apparently) resulting in inversion of the X axis, but this proved easy to solve temporarily by messing around with the xinput command-line program. A bit more challenging was calibrating the thing - this is an eGalax USB-HID touchscreen, which the standard tslib/ts_calibrate stuff doesn't seem to want to talk to. I managed to hunt down this program, however, which did the trick nicely - it calibrated my screen, altered the running server's parameters on the fly, and gave me a configuration snippet to put in /etc/X11/xorg.conf.d which worked just fine. Why it's not in Kubuntu's apt repository yet I'm not sure, it's pretty useful.

I also gave android-x86 Gingerbread a go from a live USB stick, just to see what would happen, but it didn't recognise the touchscreen at all. That might be something I have a go at fixing down the line, but I figured I'd done enough for now.

Still to do with this netbook, at some point, is taking the bottom half apart so I can try reseating the display connector; my screen sometimes goes all white and flickery, especially if there's a lot of black onscreen, which googling seems to indicate is a hardware problem. Also still to do is figuring out if I want to make and attach some kind of stylus holder.

Thursday, March 24, 2011

My mobile object system

This is what I've been tinkering with for the last couple of weeks. Basically, it's a distributed-object system with persistence based on top of Qt. It allows me to write objects that can be moved from computer to computer and automatically be persisted to disc (or wherever else) - so for example I can log into a site on my tablet in a web browser then move that browser to my desktop PC, keeping my login credentials and page position - something similar to what Engadget calls the continuous client.

Code tarball is here - I've been building it using android-qt's branch of qtcreator on Kubuntu 10.10. It's very definitely a prototype and does not (obviously) represent production-quality code.

Detailed description is in the tarball and also here.

Purpose of this blog

I'm a British software engineer living in Ann Arbor, Michigan. My day job involves embedded Linux and Qt in an automotive context, but I get up to a variety of hobby projects in my spare time, mostly related to Linux, robotics and Android. I decided having a dedicated blog to document such things without boring my non-technical friends to tears might be a good idea.

First up, I'll be posting shortly about my project of the last few weeks, a system for cross-platform mobile objects based on Qt.