Wiznet for Magic-1

TL;DR - I found and fixed a silly mistake (made 3 years ago) that kept me from using a much faster internet connection for Magic-1.  The test program is live now, and serving web pages at http://www.magic-1.org.  Check it out (including a large image that would have taken minutes to transmit using the old mechanism).

And now, the way-too-long version…

My homebuilt CPU, Magic-1 (http://www.homebrewcpu.com), has been running almost continually for nearly a decade.  From the very beginning, a priority for me has been internet connectivity.  Magic-1, however, doesn’t have a native ethernet interface but instead relies on a pair of serial ports for communication.  The first try at internet connectivity was the use of something called a “device server”.  This gizmo converts an internet telnet session to serial communication, and lets an internet-connected computer talk to Magic-1 just as if it were a local serial terminal.  This was how I talked to Magic-1 over the internet from 2004 through mid-2005.

In June of 2005, I brought up the next round of connectivity.  Adam Dunkels wrote a very minimalist TCP/IP stack (uIP) that I ported to Magic-1.  In place of a native ethernet interface, I used old-school SLIP (Serial Line Internet Protocol) to pass IP packets between Magic-1 and one of my Linux computers over a serial port.  The Linux box then forwarded the packets out to the world.  Thus began Magic-1’s career as a web server.

In early 2007, I got Minix up and running on Magic-1.  At first, I continued to use uIP, but by the summer of 2007 I managed (with a lot of work) to get Minix’s full TCP/IP stack going.  Also using SLIP, this enabled me to support ftp, finger, telnet, httpd and various other standard services - but being more full-featured, it was considerably slower than uIP.

And that’s where we’ve been until now.  Full-featured internet connectivity, but with very high overhead costs.  Managing TCP/IP connections is very computationally intensive for Magic-1.  Each packet must be copied and wrapped multiple times, checksummed, converted to compressed SLIP format and then pushed through a slow 9600-baud serial port.  It’s especially bad for telnet sessions, which tend to wrap every couple of characters transmitted in a separate packet.  If you’ve ever used telnet to log into Magic-1, you’ll have found it painfully slow.  It’s much more responsive on a local terminal, and it’s bothered me that other folks can’t experience Magic-1 shell sessions the way I can locally.
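To give a feel for the per-packet work, here is a minimal C sketch of plain SLIP framing using the byte values from RFC 1055, plus a rough wire-time estimate.  It is not the Minix or uIP code, and it leaves out the header compression mentioned above.  The function names and the 8N1 serial framing (10 bit times per byte) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* SLIP special bytes as defined in RFC 1055. */
enum { SLIP_END = 0xC0, SLIP_ESC = 0xDB, SLIP_ESC_END = 0xDC, SLIP_ESC_ESC = 0xDD };

/* Frame one IP packet for the serial line (hypothetical helper).
 * Returns the number of bytes written to out, or 0 if out is too small.
 * Worst case output size is 2*len + 2. */
size_t slip_encode(const uint8_t *pkt, size_t len, uint8_t *out, size_t cap)
{
    size_t n = 0;
    if (cap < 2) return 0;
    out[n++] = SLIP_END;                 /* leading END flushes any line noise */
    for (size_t i = 0; i < len; i++) {
        uint8_t b = pkt[i];
        if (b == SLIP_END || b == SLIP_ESC) {
            if (n + 2 >= cap) return 0;  /* keep room for the trailing END */
            out[n++] = SLIP_ESC;
            out[n++] = (b == SLIP_END) ? SLIP_ESC_END : SLIP_ESC_ESC;
        } else {
            if (n + 1 >= cap) return 0;
            out[n++] = b;
        }
    }
    out[n++] = SLIP_END;                 /* trailing END marks end of packet */
    return n;
}

/* Whole milliseconds needed to clock nbytes out of a serial port,
 * assuming 8N1 framing (start bit + 8 data bits + stop bit). */
unsigned long wire_time_ms(size_t nbytes, unsigned long baud)
{
    return (unsigned long)((nbytes * 10UL * 1000UL) / baud);
}
```

For instance, a one-character telnet packet with 40 bytes of uncompressed TCP/IP headers is 41 bytes, at least 43 once framed, and wire_time_ms(43, 9600) comes out to 44 ms on the wire alone, before counting any copying or checksumming on Magic-1.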

In the Spring of 2011, Dave Conroy pointed me to a nifty little device - the Wiznet WIZ830MJ.  This is the development board for Wiznet’s w5300 - which is essentially a TCP/IP stack on a chip.  It has 8 sockets, which can be used as IP, TCP or UDP - and would offload most of the communications overhead from Magic-1.  Sounded really good, and I cobbled together an interface board that would map the WIZ830MJ into a region of Magic-1’s physical device address space (0x00F400 through 0x00F7FF).  Once there, I could use memory-mapped IO to communicate with it.  It was a fairly simple hack - just some address decoding logic and a voltage regulator to supply the 3.3V needed by the device from Magic-1’s normal 5V supply.

You can see some pictures of that 1st board here: NativeEthernetForMagic1 | BillBuzbee

The first problem was space.  There is simply no room in the card cage, but I was able to attach it as a daughter card on the top card of the cage.  However, doing so meant I needed to increase the length of the data and address busses by about 6 inches.  Still, I tried this and at first it seemed to work.  However, it soon became clear that it was unstable (and made Magic-1 unstable).  Things would work for a few minutes, and then Magic-1 would lock up.

I suspected the instability issue was due to the increased bus lengths.  In retrospect, this may indeed have been the problem, but it might also have been power distribution.  The daughter card was pulling 5V from the control card, and may have drawn too much.  In any event, I decided to take another approach and built a version of the card that attached directly to the wire-wrap (back) side of the backplane.  This would certainly take care of bus length issues.

Unfortunately, though it seemed to work a little better, it also suffered from stability issues.  I tried a bit of debugging with a scope and logic analyzer, but found no smoking guns.  Then, in the summer of 2011, I got exceedingly busy at work.  I removed the Wiznet device and put it in a box for later investigation.  And, that’s where it stayed until a few weeks ago.

I ran across my Wiznet boards while working on the new power supply for Magic-1.  Just for kicks, I re-installed the backplane board and tried out the old test program.  As expected, lots of instability.  Because of the recent power supply failure, I had power distribution on my mind, and thought I’d try giving the board its own feed.  I soldered on an external 5V & ground connector and tried again.  This time it seemed to be a little more stable, but within several hours of a stress test it would lock up the entire system.  The weird thing, though, was that although Magic-1 was unresponsive, it appeared to be running normally - it just couldn’t do I/O.  And the strangest thing was that a reset failed to bring it back - but a power cycle would.

That last clue was the key one - reset didn’t help, but a power cycle did.  My first thought was that it might be somehow related to the 3.3V voltage regulator on the board - but I couldn’t come up with a failure scenario that made sense.  Then I considered the fact that I was interfacing 3.3V CMOS with 5V TTL.  The Wiznet device claimed it could handle TTL inputs, and my first board design assumed that.  However, for my backplane version I thought I’d be extra careful and went with 74HCTxx parts for the glue logic instead of 74Fxx or 74LSxx.  The 74HCT series are designed to bridge between TTL and CMOS.  Just to be sure I was using those parts correctly, I looked up some documentation of TTL family differences.  About halfway through the docs, I found my smoking gun:

“Because 74HCT is a CMOS process, no inputs must be allowed to float.  Tie unused inputs to power or ground rails.  Otherwise, outputs can become unstable and oscillate.”

And that’s what was happening.  I was only using 2 of the OR gates in a 74HCT32.  After working fine for a while, the unused floating inputs would cause the 2 gates I was using to go unstable and oscillate.  It just happened that the part of the address decoding circuit that used the 74HCT32 dealt with the “4” in the 0x00F400 physical address.  My other IO devices are on 0x00F800 through 0x00FFFF.  This meant that when I tried to access my other devices, the Wiznet device could come alive and try to take over the data bus.  That’s why things other than IO appeared to be working.  And, this oscillation would be unaffected by a system reset - only a power cycle would (temporarily) stop it.
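The failure mode can be modeled in a few lines of C.  This is only a sketch: the real gate-level wiring isn’t described here, so the split into a reliable “device page” check and a glitch-prone check of the “4” (address bits 11..10 equal to 01) is an assumption, as are all of the function names.  The address ranges are the ones given above.

```c
#include <stdbool.h>
#include <stdint.h>

#define IO_BASE 0x00F800u   /* other IO devices: 0x00F800..0x00FFFF */
#define IO_END  0x00FFFFu

/* Assumed-reliable part of the decode: the 0x00Fxxx device page. */
static bool in_device_page(uint32_t addr)
{
    return (addr >> 12) == 0x00Fu;
}

/* The part routed through the 74HCT32 in this model: the "4" in
 * 0x00F400, i.e. address bits 11..10 == 01 (0x00F400..0x00F7FF). */
static bool nibble_is_wiznet(uint32_t addr)
{
    return ((addr >> 10) & 0x3u) == 0x1u;
}

/* gate_glitch models the oscillating OR gates spuriously reporting a match. */
bool wiznet_selected(uint32_t addr, bool gate_glitch)
{
    return in_device_page(addr) && (nibble_is_wiznet(addr) || gate_glitch);
}

bool other_io_selected(uint32_t addr)
{
    return addr >= IO_BASE && addr <= IO_END;
}

/* Two devices driving the data bus at once. */
bool bus_conflict(uint32_t addr, bool gate_glitch)
{
    return wiznet_selected(addr, gate_glitch) && other_io_selected(addr);
}
```

With gate_glitch false, the Wiznet select is asserted only for 0x00F400 through 0x00F7FF.  With it true, an access to 0x00F800 selects both devices, while an address outside the device page such as 0x001234 is unaffected, which matches the symptom of everything but IO appearing to work.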

Simple fix - and now my test program has been running solidly for a couple of days.  It may be a while before I find the time to fully integrate the Wiznet device into Minix.  To get the test program running, I hacked the kernel to allow a user-space program to request that the Wiznet device page be remapped into its process space.  And, there’s another little hack that will cause the kernel to signal whatever user-space process owns the Wiznet device page whenever the WIZ830MJ’s service interrupt fires.
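On the user-space side, the signal half of that arrangement might look roughly like the C sketch below.  It is not the actual test program: the signal number isn’t stated here, so SIGUSR1 is a stand-in, and every name in it is hypothetical.  The handler only sets a flag, and the main loop would poll that flag and then go and service the device.

```c
#include <signal.h>

/* Stand-in: the signal the hacked kernel actually sends is not stated. */
#define WIZ_SIGNAL SIGUSR1

static volatile sig_atomic_t wiz_pending = 0;

/* Do the minimum inside the handler: just note that the device wants service. */
static void on_wiz_interrupt(int signo)
{
    (void)signo;
    wiz_pending = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int wiz_install_handler(void)
{
    return signal(WIZ_SIGNAL, on_wiz_interrupt) == SIG_ERR ? -1 : 0;
}

/* Returns 1 and clears the flag if an interrupt has been signalled
 * since the last call, otherwise 0. */
int wiz_take_pending(void)
{
    if (!wiz_pending)
        return 0;
    wiz_pending = 0;
    return 1;
}
```

A server loop would call wiz_install_handler() once at startup, then check wiz_take_pending() on each pass and service the device’s sockets whenever it returns 1.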

I suspect that once I write a proper device driver it should be fairly easy to make it work seamlessly with Minix’s INET TCP/IP stack (should just be able to replace the work from TCP & UDP on down).  But, I’ll need to do quite a bit of reading and digging to find where those few magic lines of code need to go.  Not sure when I’ll find the time, but meanwhile I can at least use the test program to replace the httpd server.