Posted on 21 March 2009 by Shawn
I am headed down to Phoenix for the Air Show. See you all on 03/28/09.
Posted on 17 March 2009 by Shawn
I posted this message to the x264 list this evening (see below).
I just had the thought that there were a couple of folks in
ua-developers interested in cluster programming techniques. For all its
faults, MPI (with C, C++ or Fortran) is still the 'gold standard' in
academia.
While my code is pretty bad, it just may provide a simple introduction
to message passing with MPI for ridiculously parallel jobs (which I made
video encoding into by arbitrarily chunking the input).
It's not much, but maybe someone wants to pick this project up on their
shoulders? Internet fame and glory await (distributed encoding is always
a popular topic on the internet, piracy being what it is these days).
The only thing this really needs is a frame server (define MPI message
that properly represents a frame, then some simple logic to send them)
and then working to make the frame server intelligent (i.e. work closely
with the deeper logic bits of x264) [Well, maybe some optimization to
make sure that the frame server overhead doesn't clobber the
parallelization gains].
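As a rough illustration of what "an MPI message that properly represents a frame" could look like, here is a minimal sketch of a fixed header followed by raw pixel data. It is plain Python with no MPI calls; the header layout (frame index, width, height, payload length) and the names `HEADER`, `pack_frame` and `unpack_frame` are all assumptions made up for this example, not anything taken from x264 or the x264-mpi patch.

```python
import struct

# Hypothetical wire layout (an assumption, not x264's format):
# frame index (uint32), width (uint16), height (uint16), payload length (uint32),
# all in network byte order with no padding.
HEADER = struct.Struct("!IHHI")


def pack_frame(index, width, height, pixels):
    """Serialize one frame as header + raw pixel bytes."""
    return HEADER.pack(index, width, height, len(pixels)) + pixels


def unpack_frame(message):
    """Parse a message built by pack_frame; reject truncated payloads."""
    index, width, height, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame message")
    return index, width, height, payload
```

The resulting bytes object is the kind of buffer a frame server could hand to an MPI send call; carrying the frame index in the header lets the receiving process know where its output belongs even if frames arrive out of order.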
I am too busy at work to make this a priority, but I'd be willing to
help out where I can. Hopefully, soon enough you guys will have a proper
cluster at HACKS on which to try this stuff out.
Peace,
Shawn
Subject: Re: [x264-devel] implementing Cluster farming
From: Shawn Nock <nock@nocko.net>
Date: Tue, 17 Mar 2009 23:06:42 -0700
To: Mailing list for x264 developers <x264-devel@videolan.org>
I actually worked on this for a bit as a weekend hack / POC. I
implemented a *very basic* [read: not looking for criticism, I know it
is terrible and borderline worthless] x264-mpi workflow last year.
If anyone cares I put it on my gitweb:
http://git.nocko.net/?p=x264-mpi
The primary commitdiff of interest is here:
http://git.nocko.net/?p=x264-mpi;a=commitdiff;h=2200ac1a260a20085b2df588936911338f486f95
although there are useful patches (multipass support) later on (look for
commits by 'Shawn Nock').
There is no frame server (so it relies on shared storage). I am sure
that it breaks a lot of optimizations (specifically scene detection,
which is right out).
In a nutshell, it counts the frames and splits them into as many groups
as there are requested MPI processes. Then each process
encodes a frame group separately, outputting to a file with a sequential
extension. Concatenating these files (I don't think I implemented
concatenation in the primary process... memory fades) produces coherent
output, but nothing resembling the baseline output of a normal x264 run.
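The chunking step described above can be sketched in a few lines. This is not the code from the patch (which lives inside x264's C sources); it is a minimal Python illustration, and the way leftover frames are spread across the first few groups, as well as the `prefix.NNN` output naming, are assumptions made for the example.

```python
def frame_groups(total_frames, num_procs):
    """Split frames [0, total_frames) into num_procs contiguous (start, end) ranges.

    Assumption: leftover frames go one each to the lowest-numbered groups.
    """
    base, extra = divmod(total_frames, num_procs)
    groups = []
    start = 0
    for rank in range(num_procs):
        count = base + (1 if rank < extra else 0)
        groups.append((start, start + count))  # end index is exclusive
        start += count
    return groups


def output_name(prefix, rank):
    """Hypothetical naming scheme: one output file per process, sequential extension."""
    return f"{prefix}.{rank:03d}"
```

For example, `frame_groups(10, 3)` gives `[(0, 4), (4, 7), (7, 10)]`: each MPI rank would encode its own range and write to `output_name(prefix, rank)`, and concatenating the files in rank order restores the frame order.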
Caveats aside, it compiles on x86_64 (mpich2) and ia64 (SGI's proprietary
MPI for its NUMAlink technology). If you don't care that it butchers
encoding efficiency (and ultimately the output file), the raw fps
numbers are encouraging.
I'd love to see proper MPI support and I could provide several testing
platforms if someone was seriously interested in doing this.
Peace,
Shawn
Posted on 12 March 2009 by Shawn
One of our esteemed members (Jason Katterhenry) is getting married this weekend and the majority of club members are attending. I imagine that we will all be too hung-over for a proper work day on Sunday. With that in mind, I am cancelling it straight away. Have a nice spring break everyone.
Posted on 08 March 2009 by Shawn
This is a basic summary; a more thorough post is available here.
The head node now has a POC nfsroot that boots into init (which it pulls via NFS). We are having some reliability problems after that (NFS time-out errors at different places in the init script). However, we learned a lot about how disk-less Linux works and we have a plan to fix it in the near future! The boot server can now boot the nodes, and give us BIOS, kernel, and (if we can sort out the NFS time-out) login on the serial server.
We also did quite a bit of work on the EVA5000 and came up empty-handed. I tried to access the serial interface, only to learn that it was diagnostics only (no documented management interface), and Jason stalled on the Windows 2000 management utilities. We’ll try again soon. I suppose that in the worst case we could use the one giant LUN as-is and not break it up… but let’s hope we can use the space a little more intelligently.
I also fixed the boot-time networking config on the server we made for the ua-developers club. I fat-fingered a conf file entry and the network didn’t come back after Jason power cycled the rack (QA testing, I am sure).
Stay tuned, the fun stuff is on the horizon.
Posted on 07 March 2009 by Shawn
ECE 232a @ 12:30p (until we get tired).
Things that I’d like to happen (in no particular order):
- Set up NAT on head node (so the nodes can talk to LDAP)
- Get the EVA1000 carved up and storage presented to the head node
- Develop a (rough) working compute node nfsroot
- Boot several of the nodes
- Test Myrinet connectivity, troubleshoot switch
I am not sure how much of this we’ll get done… but we’ll give it a go.
Posted on 05 March 2009 by Shawn
A few weeks ago I set up an Openfire Jabber server for HACKS. Really, I set this up for Jason, who was looking for a server-side way to log in to all his IM accounts and keep a history. In any case, it is now up and anyone with a HACKS LDAP account can log in. Just fire up a Jabber/XMPP client and use the JID <HACKS username>@hacks.arizona.edu with your LDAP password and you are in. It has most common transports enabled (AIM, Yahoo, Jabber, and ICQ) as well as the IRC transport. S2S is confirmed working with jabber.org, gtalk, and my domain (nocko.net).
For those who are interested, it is Ignite Realtime’s Openfire Server 3.6.3, running on tugarin.hacks.arizona.edu. If you are having trouble or you don’t have a HACKS LDAP account, send me an e-mail (n...@hacks.arizona.edu). If you decide to use the service, give me a holla. My JID is nock@nocko.net.
Posted on 01 March 2009 by Shawn
The head node is up and serving DHCP, TFTP and NFS. It took a while to work around a very broken old PXE stack on the cluster nodes (Tyan MPX). The nodes can boot, but not into anything really useful… just memtest. I need to set up some NFS root fs areas for the nodes so that we can boot CentOS on them.
In other news, I got GM2 (Myrinet) compiled for 2.6.18. The kernel module loads and Ethernet emulation seems to work. As promised, I didn’t do any of the cluster integration with Myrinet, just got the kernel module compiled. It looks like OpenMPI supports GM, and there is also MPICH-GM. Lustre has a custom driver for MX systems, but as we only have lowly GM boards… we’ll have to operate Lustre over IP (over GM) if we want to implement it.