SPECS: SCoA Specifications
This page is about the codec used by CAPT printers made up to the mid-2000s. For information on the distinct Hi-SCoA compression codec, please see parts 1.4 and 3 of the SPECS file in the tree.
SCoA (expanded to Smart Compression Architecture in some marketing materials) is a compression codec for 1-bit (bi-level) monochrome images which makes use of Run-Length Encoding (RLE) and Delta Encoding.
Images begin from one or more "key" or "seed" lines that encode a whole line. Key lines are RLE-compressed; contiguous repeated bytes are replaced by a single byte with a repeat count. Lines are dealt with a byte at a time, with each byte representing eight 1-bit pixels (this format is identical to Netpbm P4).
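
As a rough illustration of the byte layout described above, the following Python sketch packs a row of 0/1 pixel values into bytes using the Netpbm P4 convention (leftmost pixel in the most significant bit). The `pack_row` name is hypothetical. Zero-padding a short final byte on the right follows P4; whether SCoA devices expect the same for widths that are not a multiple of eight is an assumption.

```python
def pack_row(pixels):
    """Pack a row of 0/1 pixels into bytes, leftmost pixel in the MSB (P4 order)."""
    out = bytearray()
    for i in range(0, len(pixels), 8):
        chunk = pixels[i:i + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | (bit & 1)
        # Assumption: pad a short final chunk with 0 bits on the right, as P4 does.
        byte <<= 8 - len(chunk)
        out.append(byte)
    return bytes(out)


row = [1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 12 pixels, so two bytes
packed = pack_row(row)
print(packed.hex())  # 81f0
```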
Delta encoding is applied on subsequent lines. Segments of unchanged bytes from the previous line are reduced to a single reference. Changed bytes are encoded with RLE. Lines can be ended early with an End-of-Line (EOL) opcode that fills the rest of the line with bytes from the previous line. Lines can be repeated with a lone EOL opcode.
The compressed stream is terminated with an End-of-Page opcode. Unlike lines, pages cannot end early. All lines on a page must be encoded. If the content on a page doesn't reach the bottom, or if the page is a blank page, the first blank line must be encoded as a key line, and each subsequent line must be encoded by EOL opcodes.
No SCoA colour devices are known to exist. Canon has claimed that full colour support was only introduced with the newer and distinct Hi-SCoA codec in a product brochure for the LBP2410.
Note: The SCoA format is not yet completely charted. Information in this section may be subject to change.
Opcode fields are defined at the bit level, but every opcode is aligned to start and end on a byte boundary. This allows the bit stream to be processed as a byte stream. Multi-byte opcodes are read in big-endian order.
These opcodes encode data in a compressed form, and are shown in base-2 (`0b`). There are three parts to data opcodes: the operation, counts and data. Operations and counts are interleaved in the first bytes of the opcode. The data always comes next. All counts are unsigned integers.
`0b00XXXYYY` `S0..Sn`

The operation in this case is `0b00`. The first count is the 3-bit value `0bXXX` and the second count is the 3-bit value `0bYYY`. The multi-byte string `S0` to `Sn` is the data.

- `0b100WWWWW` `0b00YYYXXX` `S0..Sn`: operation `0b10000`, 8-bit count `0bWWWWWXXX` and 3-bit count `0bYYY`, data bytes `S0` to `Sn`
- `0b101WWWWW` `0b00YYYXXX` `C`: operation `0b10100`, 8-bit count `0bWWWWWXXX` and 3-bit count `0bYYY`, datum byte `C`
- `0b100UUUUU` `0b101WWWWW` `0b11XXXYYY` `C`: operation `0b10010111`, 8-bit count `0bUUUUUYYY`, 8-bit count `0bWWWWWXXX`, datum byte `C`
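
The interleaving of operation and count bits can be unpicked with shifts and masks. The sketch below does so for the two-byte layout `0b100WWWWW 0b00YYYXXX` described above. `split_long_opcode` is a hypothetical helper name, not something taken from `captfilter`.

```python
def split_long_opcode(b0, b1):
    """Split 0b100WWWWW 0b00YYYXXX into (8-bit count, 3-bit count)."""
    if b0 >> 5 != 0b100 or b1 >> 6 != 0b00:
        raise ValueError("not a 0b100WWWWW 0b00YYYXXX opcode")
    wwwww = b0 & 0b11111     # low five bits of the first byte
    yyy = (b1 >> 3) & 0b111  # middle three bits of the second byte
    xxx = b1 & 0b111         # low three bits of the second byte
    # The 8-bit count is WWWWW followed by XXX, most significant bits first.
    return (wwwww << 3) | xxx, yyy


long_count, short_count = split_long_opcode(0b10000001, 0b00010010)
print(long_count, short_count)  # 10 2
```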
Control opcodes affect the decompressor's behaviour and are shown in base-16 (`0x`). These commands have no counts and cannot be compressed (although having a compressible EOL would have further increased efficiency).
Please note that the opcodes have not yet been thoroughly verified.
There are three data decoding operations:
- `P(n)`: Copy `n` bytes from the previous line, at the same offset/position as the current line
- `R(n, C)`: Repeat, `n` times, the single byte `C`
- `N(n, S0...Sn)`: Write `n` new uncompressed bytes `S0` to `Sn`.
The `+` operator herein concatenates the results of the operations.
Opcode | Operation | Canonical Name (TBC) | Operation Description |
---|---|---|---|
`0b00YYYXXX S0..Sn` | `P(0bXXX) + N(0bYYY, S0..Sn)` | CopyThenRaw | `0bXXX` (0-7) bytes from previous line, then `0bYYY` (1-7) uncompressed bytes `S0` to `Sn`. |
`0b01YYYXXX C` | `P(0bXXX) + R(0bYYY, C)` | CopyThenRepeat | `0bXXX` (0-7) bytes from previous line, then `0bYYY` (1?-7) repeats of `C` (minimum `R()` count may be 2, not 1; please see this comment in #33 in the original repo). |
`0b11XXXYYY C S0..Sn` | `R(0bXXX, C) + N(0bYYY, S0..Sn)` | RepeatThenRaw | `0bXXX` (1-7) repeats of `C`, then `0bYYY` (1-7) uncompressed bytes `S0` to `Sn`. |
`0b100WWWWW 0b00YYYXXX S0..Sn` | `P(0bWWWWWXXX) + N(0bYYY, S0..Sn)` | CopyThenRawLong | `0bWWWWWXXX` (8-255) bytes from previous line, then `0bYYY` (1-7) uncompressed bytes `S0` to `Sn`. |
`0b100WWWWW 0b01YYYXXX C` | `P(0bWWWWWXXX) + R(0bYYY, C)` | CopyThenRepeatLong | `0bWWWWWXXX` (8-255) bytes from previous line, then `0bYYY` (1-7) repeats of `C`. |
`0b101WWWWW 0b00XXXYYY C S0..Sn` | `R(0bWWWWWXXX, C) + N(0bYYY, S0..Sn)` | RepeatThenRawLong | `0bWWWWWXXX` (8-255) repeats of `C`, then `0bYYY` (1-7) uncompressed bytes `S0` to `Sn`. |
`0b101XXXXX 0b01WWWYYY C S0..Sn` | `R(0bWWW, C) + N(0bXXXXXYYY, S0..Sn)` | RepeatThenRaw | `0bWWW` (1-7) repeats of `C`, then `0bXXXXXYYY` (8-255) uncompressed bytes `S0` to `Sn`. |
`0b101XXXXX 0b10YYYWWW C` | `P(0bWWW) + R(0bXXXXXYYY, C)` | CopyThenRepeatLong | `0bWWW` (0-7) bytes from the previous line, then `0bXXXXXYYY` (8-255) repeats of `C`. |
`0b101XXXXX 0b11YYYWWW S0..Sn` | `P(0bWWW) + N(0bXXXXXYYY, S0..Sn)` | CopyThenRawLong | `0bWWW` (0-7) bytes from the previous line, then `0bXXXXXYYY` (8-255) uncompressed bytes `S0` to `Sn`. |
`0b100UUUUU 0b101XXXXX 0b10YYYWWW C` | `P(0bUUUUUWWW) + R(0bXXXXXYYY, C)` | CopyThenRepeatLong | `0bUUUUUWWW` (8-255) bytes from previous line, then `0bXXXXXYYY` (8-255) repeats of `C`. |
`0b100UUUUU 0b101XXXXX 0b11YYYWWW S0..Sn` | `P(0bUUUUUWWW) + N(0bXXXXXYYY, S0..Sn)` | CopyThenRawLong | `0bUUUUUWWW` (8-255) bytes from previous line, then `0bXXXXXYYY` (8-255) uncompressed bytes `S0` to `Sn`. |
`0x40` | NOP | NOP | Dummy non-op. |
`0x41` | EOL | EOL | End of line. Fill the rest of the current line with the bytes at the same offsets on the previous line. |
`0x42` | EOP | EOP | End of page/picture. Don't decompress anything past this point. |
`0x9f` / `0b10011111` | n + 248 | Extend | Add 248 to the byte count for `P()+N()` and `P()+R()` commands. Can be used N times in a row for 248 * N bytes. Identical to the first byte of `P(n)+N(m, S0..Sm)` and `P(n)+R(m, C)` where n is from 248 to 255. |
Canonical names were taken from a disassembly of the `captfilter` command from the original Canon driver. `Copy` is currently understood as "copy from previous line", and `Raw` is currently understood as "uncompressed".
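
To make the table more concrete, here is a minimal decoder sketch covering only the one-byte data opcodes (`0b00`, `0b01`, `0b11`) and the three control opcodes. It is not the Studycapt or `captfilter` implementation. `decode_scoa` is a hypothetical name, and the sketch rests on several assumptions: every line is closed by an explicit EOL, the line before the first line is all zeroes, and the undocumented values `0x43` to `0x47` are rejected. The long forms and Extend (first byte starting with `0b10`) are left out.

```python
def decode_scoa(stream, line_size):
    """Decode a SCoA byte stream into a list of lines (short opcodes only)."""
    lines = []
    prev = bytes(line_size)  # assumption: imaginary all-zero line before line 1
    cur = bytearray()
    i = 0
    while i < len(stream):
        op = stream[i]
        i += 1
        # Control opcodes share the 0b01 prefix, so test them first.
        if op == 0x40:  # NOP
            continue
        if op == 0x42:  # EOP: stop decoding
            return lines
        if op == 0x41:  # EOL: fill the rest of the line from the previous line
            cur += prev[len(cur):]
            prev = bytes(cur)
            lines.append(prev)
            cur = bytearray()
            continue
        top = op >> 6
        hi = (op >> 3) & 0b111
        lo = op & 0b111
        if top == 0b00:  # CopyThenRaw: P(lo) + N(hi, S0..Sn)
            cur += prev[len(cur):len(cur) + lo]
            cur += stream[i:i + hi]
            i += hi
        elif top == 0b01:  # CopyThenRepeat: P(lo) + R(hi, C)
            if hi == 0:
                raise ValueError(f"undocumented opcode 0x{op:02x}")
            cur += prev[len(cur):len(cur) + lo]
            cur += bytes([stream[i]]) * hi
            i += 1
        elif top == 0b11:  # RepeatThenRaw: R(hi, C) + N(lo, S0..Sn)
            cur += bytes([stream[i]]) * hi
            i += 1
            cur += stream[i:i + lo]
            i += lo
        else:
            raise NotImplementedError("long and Extend forms are not handled")
    raise ValueError("stream ended without EOP")


example = bytes([
    0x60, 0xFF, 0x41,        # key line: P(0) + R(4, 0xFF), EOL
    0x11, 0xAA, 0xBB, 0x41,  # P(1) + N(2, [0xAA, 0xBB]), EOL fills the last byte
    0x41,                    # lone EOL repeats the previous line
    0xD1, 0x00, 0x0F, 0x41,  # R(2, 0x00) + N(1, [0x0F]), EOL fills the last byte
    0x42,                    # EOP
])
decoded = decode_scoa(example, line_size=4)
```

With a line size of 4 bytes, the example stream decodes to four lines: `ff ff ff ff`, `ff aa bb ff`, `ff aa bb ff` and `00 00 0f ff`.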
The decoder should keep track of the position on the previous line. Every operation advances the position by its count, whether or not it uses the contents of the previous line. For example:

`P(7) + N(2, [0xBA, 0xBE]) + P(7) + R(17, 0xCC) + P(7)`

- Copies 7 bytes from the previous line,
- Skips the next 2 bytes and inserts `[0xBA, 0xBE]` instead,
- Copies another 7 from the previous line,
- Skips the next 17 and inserts the same amount of `0xCC` bytes instead, and finally,
- Copies yet another 7 from the previous line.
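
The example above can be traced with a small evaluator. This is only a sketch: `run_ops` and the tuple representation of the operations are hypothetical, and the count of `N()` is implied by the length of its data. Because every operation emits exactly as many bytes as it advances, the position on the previous line is simply the number of bytes written so far.

```python
def run_ops(ops, prev):
    """Apply a sequence of P/R/N operations against the previous line."""
    out = bytearray()
    for op in ops:
        pos = len(out)  # shared position on the current and previous line
        kind = op[0]
        if kind == "P":    # ("P", n): copy n bytes from the previous line
            out += prev[pos:pos + op[1]]
        elif kind == "R":  # ("R", n, c): repeat byte c n times
            out += bytes([op[2]]) * op[1]
        elif kind == "N":  # ("N", data): write new uncompressed bytes
            out += bytes(op[1])
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return bytes(out)


prev_line = bytes(range(40))  # 0x00..0x27, so each byte equals its offset
line = run_ops(
    [("P", 7), ("N", [0xBA, 0xBE]), ("P", 7), ("R", 17, 0xCC), ("P", 7)],
    prev_line,
)
print(len(line))  # 40
```

The third and fifth operations copy from offsets 9 and 33 of the previous line, not from offsets 7 and 14, because the `N()` and `R()` operations in between also advance the position.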
The behaviour of using `P()` on the first line is unknown. As such, it is advised to assume an imaginary "previous line" entirely of zero (`0x00`) bytes before the first line of the compressed image.
The `P()` count in `P()+N()` and `P()+R()` may be extended beyond 255 bytes by using one or more `0x9f` commands at the start of the opcode. For example, `0x9f` `0b10000001` `0b01010010` `C` copies 258 bytes from the previous line (248 + 10), followed by two repeats of `C`. Likewise, `0x9f` `0x9f` `0b10000001` `0b01010010` `C` does the same with 506 bytes. Only two `0x9f` commands are necessary to reach the end of the line on an 8.5 inch wide page at 600 dpi.
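
The arithmetic behind these figures can be checked with a short sketch. `extended_copy_count` is a hypothetical helper that handles only the `0b100WWWWW 0b0?YYYXXX` forms. It assumes that a `0x9f` byte acts as Extend only when another `0b100`-prefixed byte follows it, which is one reading of the table rather than verified decoder behaviour.

```python
import math

EXTEND = 0x9F
EXTEND_STEP = 248  # bytes added by each Extend command
MAX_LONG_COPY = 255  # largest P() count without Extend


def extended_copy_count(opcode):
    """Return the P() count of [0x9f ...] 0b100WWWWW 0b0?YYYXXX."""
    total = 0
    i = 0
    # Assumption: 0x9f is Extend only if another 0b100WWWWW byte follows.
    while opcode[i] == EXTEND and opcode[i + 1] >> 5 == 0b100:
        total += EXTEND_STEP
        i += 1
    if opcode[i] >> 5 != 0b100 or opcode[i + 1] >> 7 != 0:
        raise ValueError("not a 0b100WWWWW 0b0?YYYXXX opcode")
    wwwww = opcode[i] & 0b11111
    xxx = opcode[i + 1] & 0b111
    return total + ((wwwww << 3) | xxx)


C = 0x00  # placeholder repeat byte
print(extended_copy_count([0x9F, 0b10000001, 0b01010010, C]))        # 258
print(extended_copy_count([0x9F, 0x9F, 0b10000001, 0b01010010, C]))  # 506

# Line size for an 8.5 inch wide page at 600 dpi, eight pixels per byte.
line_bytes = math.ceil(8.5 * 600 / 8)
extends_needed = max(0, math.ceil((line_bytes - MAX_LONG_COPY) / EXTEND_STEP))
print(line_bytes, extends_needed)  # 638 2
```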
It is not yet known how `captfilter` or printers handle the following:
- Data that run past the end of line. `captfilter`'s encoder is careful to keep lines shorter than the line size. Should excess bytes be discarded or carried over to the next line?
- Input images with a width that is not a multiple of eight. Where should padding be added, or should the image be rejected outright?
- Output of `P()` opcodes on the first line of the compressed image. What would printers output?
Captdriver does not yet support SCoA compression, but there is ongoing work to implement support.
An experimental working SCoA decoder is available from the Studycapt repository. Instructions on its usage may be found in `README.md` of the source tree.
Nicolas Boichat. LBP 810 and 1120 Driver SPECS file. Repository maintained by Alexander Sakharuk.
Canon (2003-12-01). Laser Shot LBP-2410 Colour Laser Printer. ICAN0275. SHA256: 250b5113a5986daf90ad2a44df683fa7afafb468a9635d4a8f1e86733b5d608b
PackBits Compression, described in detail in Section 9 of the TIFF 6.0 Specification (1992-06-03). SCoA appears to have been influenced by PackBits, which similarly divides the data into uncompressed and compressed regions.
pbm - Netpbm bi-level image format. See The Layout.
Mode 9 Compression as specified in Chapter 2 Section 6.3.8 of the Brother Printer Technical Reference Guide implements similar delta coding techniques on Brother printers.
- Alternate download link: Brother HL-2132 Manuals (click on the Download link to the Command Reference Guide for Software Developers). The manual is also linked from information pages for most other laser printer products on the Brother website.
This document is based on findings by Nicolas Boichat and documented in the source files of the LBP810 and 1120 driver.
Content in this wiki is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Any errors, omissions or suggestions? File an issue and apply the `wiki` label.