It's essentially very simple, although the details are complicated. All the firmware does is measure the time between flux transition pulses and send the measurements to the PC.
There's an 8-bit counter attached to a 12MHz clock. This is used to measure the interval between pulses. A Cypress datapath state machine thingy using Verilog turns the timer, pulse and index information into a bytecode stream which encodes intervals, pulses, and whether the index hole has been seen.
This is then streamed back to the PC, where offline software decodes it: it does simple statistical analysis to guess the clock rate, then turns the pulses into a nice, lined up bit array and from there decodes the file system format.
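As a rough illustration of the "guess the clock rate" step, here is a minimal sketch that histograms the measured intervals and takes the weighted centre of the shortest well-populated cluster. This is not FluxEngine's actual decoder; the function name `guess_cell_ticks` and the `noise_floor` parameter are made up for this example.

```python
from collections import Counter

def guess_cell_ticks(intervals, noise_floor=0.05):
    """Estimate the shortest common pulse interval, in sample-clock ticks.

    Hypothetical helper: bins with fewer than noise_floor * (tallest bin)
    samples are treated as noise and ignored.
    """
    histogram = Counter(intervals)
    tallest = max(histogram.values())
    significant = sorted(
        t for t, n in histogram.items() if n >= tallest * noise_floor
    )
    # Walk upwards from the shortest significant bin, collecting adjacent
    # bins; the first gap marks the end of the first peak.
    cluster = [significant[0]]
    for t in significant[1:]:
        if t - cluster[-1] > 1:
            break
        cluster.append(t)
    # Weighted mean of the first peak.
    total = sum(histogram[t] for t in cluster)
    return sum(t * histogram[t] for t in cluster) / total
```

Given a capture with peaks near 24, 36 and 48 ticks, this should return something close to 24, from which the other interval lengths can be derived.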
Writing back to disk works the same way: bytes are streamed to the FluxEngine, where a different datapath state machine thingy (the PSoC5LP has 24, all independently programmable) interprets the bytecodes and generates a stream of pulses to the disk.
The bytecode format represents an interval between pulses as a byte, a pulse as a byte, and the index hole as a byte. Timer overflows are handled by sending multiple intervals in a row.
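To make that concrete, here is a small sketch of an encoder and decoder for a stream of this shape. The specific byte values (`PULSE`, `INDEX`, `MAX_INTERVAL`) are assumptions picked for illustration and are not taken from the real firmware; only the overall scheme (one byte per interval, one per pulse, one per index, long gaps split into several interval bytes) comes from the description above.

```python
# Assumed opcode values, for illustration only.
PULSE = 0x80         # hypothetical: "a pulse happened here"
INDEX = 0x81         # hypothetical: "the index hole was seen here"
MAX_INTERVAL = 0x7F  # hypothetical: largest tick count one interval byte holds

def encode(events):
    """events: list of (ticks_since_previous_event, 'pulse' | 'index')."""
    out = bytearray()
    for ticks, kind in events:
        # A gap too long for one byte becomes several interval bytes in a row.
        while ticks > MAX_INTERVAL:
            out.append(MAX_INTERVAL)
            ticks -= MAX_INTERVAL
        if ticks:
            out.append(ticks)
        out.append(PULSE if kind == "pulse" else INDEX)
    return bytes(out)

def decode(stream):
    """Inverse of encode(): sum interval bytes until an event byte arrives."""
    events = []
    ticks = 0
    for b in stream:
        if b < 0x80:
            ticks += b
        else:
            events.append((ticks, "pulse" if b == PULSE else "index"))
            ticks = 0
    return events
```

With these assumed values, a 300-tick gap followed by a pulse encodes as three interval bytes (127, 127, 46) and one pulse byte.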
An HD floppy has a nominal pulse frequency of 500kHz, and we use a sample clock of 12MHz (every 83ns). This means that our 500kHz pulses will have an average interval of 24 ticks. This gives us more than enough resolution. At this speed, in the 200ms that a 3.5" disk takes to rotate, we will see about 100,000 pulses. Each one is encoded as two bytes, one for the interval and one to generate the pulse; so that revolution generates 200kB of data. (Extremely approximately. The actual figure is less.)
(The clock needs to be absolutely rock solid or we get jitter which makes the data difficult to analyse, so 12MHz was chosen to be derivable from the ultra-accurate USB clock.)
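The arithmetic behind those figures, spelled out as a quick check. The constant names are just labels for the numbers quoted above, and the result is the same worst-case estimate, not a measurement.

```python
SAMPLE_CLOCK_HZ = 12_000_000  # sample clock
PULSE_RATE_HZ = 500_000       # nominal HD pulse frequency
ROTATION_S = 0.2              # one revolution of a 3.5" disk
BYTES_PER_PULSE = 2           # one interval byte plus one pulse byte

tick_ns = 1e9 / SAMPLE_CLOCK_HZ                      # about 83.3ns per tick
ticks_per_pulse = SAMPLE_CLOCK_HZ // PULSE_RATE_HZ   # 24 ticks
pulses_per_rev = round(PULSE_RATE_HZ * ROTATION_S)   # 100,000 pulses
bytes_per_rev = pulses_per_rev * BYTES_PER_PULSE     # 200,000 bytes

print(f"{tick_ns:.1f}ns/tick, {ticks_per_pulse} ticks/pulse, "
      f"{pulses_per_rev} pulses/rev, {bytes_per_rev} bytes/rev")
```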
Some useful and/or interesting numbers:
- nominal rotation speed is 300 rpm, or 5Hz. The period is 200ms.
- a pulse, as generated by the floppy drive electronics, is 150ns to 800ns long.
- a 12MHz tick is 83ns.
- MFM HD encoding uses a clock of 500kHz. This makes each recording cell 2us, or 24 ticks. For DD it's 4us and 48 ticks.
- a short transition is one cell (2us == 24 ticks). A medium is a cell and a half (3us == 36 ticks). A long is two cells (4us == 48 ticks). Double that for DD.
- pulses are detected with +/- 350ns error for HD and 700ns for DD. That's 4 ticks and 8 ticks. That seems to be about what we're seeing.
- in real life, pulses start going astray on a 3.5" drive after about 128 ticks == 10us. I haven't tried with a 5.25" drive yet as I don't have a 5.25" scratch disk.
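Putting the numbers above together, a measured interval can be sorted into short, medium or long by comparing it against the nominal tick counts with the quoted error margin. This is only a sketch of the idea; the `classify` function and its table are hypothetical and not part of FluxEngine.

```python
# Nominal HD interval lengths in 12MHz ticks, from the list above.
NOMINAL_HD_TICKS = {"short": 24, "medium": 36, "long": 48}
HD_TOLERANCE_TICKS = 4  # +/- 350ns is roughly 4 ticks; doubled for DD

def classify(interval_ticks, density="HD"):
    """Return 'short', 'medium', 'long', or None if nothing matches."""
    scale = 1 if density == "HD" else 2  # DD intervals are twice as long
    tolerance = HD_TOLERANCE_TICKS * scale
    for name, nominal in NOMINAL_HD_TICKS.items():
        if abs(interval_ticks - nominal * scale) <= tolerance:
            return name
    return None  # falls between the windows, probably a stray pulse
```

Note that the same raw count means different things at different densities: 50 ticks is a long interval on an HD disk but a short one on a DD disk.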
I've had a lot of questions about this, and there are multiple GitHub issues where people debate it. It's complicated, but it's essentially a tradeoff between speed and complexity.
FluxEngine's read process involves generating a lot of data using a fairly brute force sampling approach --- about 150kB per disk revolution, and sometimes it needs to record multiple revolutions. Most microcontrollers don't have enough RAM to buffer this, so instead I have to stream it over USB back to the host PC in real time. The disk won't wait, so I need to stream data faster than the disk is producing it: the total is about 800kB/s.
Handling USB is pretty CPU-hungry, so my candidate microcontroller has to be able to cope with the ruinously strict real-time requirements of the sampler's 12MHz clock as well as keeping up with 13,000 USB interrupts a second (one for each 64-byte frame) in order to transfer the data.
The Atmels and STM32s I found were perfectly capable of doing the real-time sampling, using hand-tooled assembly, but I very much doubt whether they could do the USB streaming as well (although I'd like to move away from the Cypress onto something less proprietary and easier to source, so I'd like to be proven wrong here).
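A back-of-the-envelope check of those rates, assuming 1kB means 1000 bytes and using the 5 revolutions per second of a 300 rpm drive. The variable names are only labels for the figures in the text.

```python
BYTES_PER_REV = 150_000       # "about 150kB per disk revolution"
REVS_PER_S = 5                # 300 rpm
STREAM_BYTES_PER_S = 800_000  # "about 800kB/s"
FRAME_BYTES = 64              # one USB frame

disk_bytes_per_s = BYTES_PER_REV * REVS_PER_S       # 750,000 bytes/s produced
frames_per_s = STREAM_BYTES_PER_S // FRAME_BYTES    # 12,500 frames/s
us_per_frame = 1_000_000 / frames_per_s             # 80us to handle each frame

print(disk_bytes_per_s, frames_per_s, us_per_frame)
```

So the disk produces roughly 750kB/s, the stream has to run a little faster than that, and the CPU gets on the order of 80 microseconds per frame. Taking 1kB as 1024 bytes instead gives 12,800 frames a second, which is where a figure of about 13,000 comes from.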
The Raspberry Pi easily has enough processing power and memory, but it's also got terrible GPIO pin read performance --- about 1kHz. That's a long way from the 12MHz I need.
The PSoC5LP part I'm using has enough CPU to handle the USB side of things, and it also has a whole set of FPGA-like soft programmable features, including 24 mini-ALU systems that are ideally suited to exactly this kind of sampling. I can read the disk and generate the byte stream describing the flux pattern entirely in 'hardware', without involving the main CPU at all. This is then DMAed directly into a set of ring buffers ready for the USB system to pick up and relay back to the PC. It's incredibly simple and works well. (The same applies to writing flux back onto the disk.)
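The hand-off between the sampler and USB can be pictured with a toy model like the one below. It is purely a software analogy with made-up names (`BufferRing`, `produce`, `consume`); the real thing is done with DMA hardware and the buffer count and sizes here are assumptions.

```python
FRAME_SIZE = 64  # assumed: one buffer holds one 64-byte USB frame

class BufferRing:
    """Toy ring of fixed-size buffers shared by a producer and a consumer."""

    def __init__(self, count=4):
        self.buffers = [bytearray() for _ in range(count)]
        self.ready = [False] * count
        self.write_index = 0
        self.read_index = 0

    def produce(self, byte):
        """'DMA' side: store one byte. Returns False on overrun."""
        if self.ready[self.write_index]:
            return False  # the consumer has not drained this buffer yet
        buf = self.buffers[self.write_index]
        buf.append(byte)
        if len(buf) == FRAME_SIZE:
            # Buffer is full: hand it over and move on to the next one.
            self.ready[self.write_index] = True
            self.write_index = (self.write_index + 1) % len(self.buffers)
        return True

    def consume(self):
        """'USB' side: return the oldest full frame, or None if none is ready."""
        if not self.ready[self.read_index]:
            return None
        frame = bytes(self.buffers[self.read_index])
        self.buffers[self.read_index].clear()
        self.ready[self.read_index] = False
        self.read_index = (self.read_index + 1) % len(self.buffers)
        return frame
```

The overrun case is the important one: if the USB side falls behind, the producer runs out of free buffers and data is lost, which is why the stream has to drain faster than the disk fills it.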
The development board I'm using, the CY8CKIT-059, also has another big advantage: it's the right shape. It's got 17 holes in a row connected to GPIO pins, and it's a native 5V part, which means I can just connect a floppy drive connector directly to the board without needing to build any hardware. No adapter board, no level shifting, no special cable, nothing. This makes the FluxEngine hardware incredibly easy to assemble, which therefore means cheap.
Speaking of which, the CY8CKIT-059 is $10. (Before shipping, which is admittedly expensive.)
- The floppy disk user's guide: an incredibly useful compendium of somewhat old floppy disk information --- which is fine, because floppy disks are somewhat old.
- The TEAC FD-05HF-8830 data sheet: the technical data sheet for a representative drive. Lots of useful timing numbers here.
- KryoFlux stream file documentation: the format of KryoFlux stream files (partially supported by FluxEngine).