Zorbage: algebraic data types and algorithms for use in numeric processing
Developer Info:
How to include zorbage in your Maven project
Add the following dependency to your project's pom.xml:
<dependency>
  <groupId>io.github.bdezonia</groupId>
  <artifactId>zorbage</artifactId>
  <version>2.0.4.7</version>
</dependency>
How to include zorbage in a different build system
See https://search.maven.org/artifact/io.github.bdezonia/zorbage/2.0.4.7/jar
for instructions on how to reference zorbage in build systems such as
Gradle or others.
Project Goals:
- provide a framework for reusable numeric algorithms
- support numeric computing in Java in an efficient manner
- provide code easier to develop and almost as fast as C++
- provide code that is faster and less error prone than Python/R/Matlab
- support very large data sets in an efficient manner
- break limitations of many programming languages in terms of the
computable types provided and the extensibility of such types
- do all this with a powerful set of simple abstractions
Contains support for:
- integers, rationals, reals, complex numbers, quaternions, octonions
- numbers, vectors, matrices, and tensors
- various precisions: 1-bit to 128-bit to unbounded (signed and unsigned and float)
- very large datasets (arrays, virtual files, sparse structures, JDBC storage)
- generic programming
- algebraic/group-theoretic algorithms
- procedural Java with object oriented comforts
- the definition of your own types while reusing existing algorithms
Can you show me what Zorbage can do?
For some overviews see:
https://github.com/bdezonia/zorbage/tree/master/src/main/java/example
Once you've covered that, if you have more questions, look at Zorbage's
extensive test code at:
https://github.com/bdezonia/zorbage/tree/master/src/test/java/nom/bdezonia/zorbage
To see a few small standalone programs, have a look here:
https://github.com/bdezonia/zorbage/tree/master/example
Thanks, I'll look that up later. Can you describe what I can do with Zorbage?
Define multidimensional data sets with flexible out of bounds data
handling procedures. Zorbage's abstraction for multidimensional data
is easy to understand and use while remaining quite powerful.
Use data views of your multidimensional data source to set and get
values very rapidly. Data views let you write any sample visiting
algorithm you need. The views work especially well with nested for
loops that iterate through data. Their speed is comparable to direct
1-d array access. Pull out arbitrary planes of your multidimensional
data set with ease.
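The idea of out of bounds handling over a flat backing array can be sketched in plain Java. This is only an illustration under stated assumptions: the class `PaddedGrid2D` and its methods are hypothetical names and are not part of Zorbage's API, and the only policy shown is "return a constant for reads outside the grid".

```java
// Hypothetical sketch, not Zorbage's API: a 2-d grid stored in one flat
// row-major array, with a fixed value returned for out of bounds reads.
final class PaddedGrid2D {
    private final double[] data;
    private final int cols;
    private final int rows;
    private final double outOfBoundsValue;

    PaddedGrid2D(int cols, int rows, double outOfBoundsValue) {
        this.cols = cols;
        this.rows = rows;
        this.outOfBoundsValue = outOfBoundsValue;
        this.data = new double[cols * rows];
    }

    private boolean inBounds(long x, long y) {
        return x >= 0 && x < cols && y >= 0 && y < rows;
    }

    // Writes must land inside the grid.
    void set(long x, long y, double value) {
        if (!inBounds(x, y)) throw new IndexOutOfBoundsException(x + "," + y);
        data[(int) (y * cols + x)] = value;
    }

    // Reads outside the grid fall back to the padding value instead of failing.
    double get(long x, long y) {
        return inBounds(x, y) ? data[(int) (y * cols + x)] : outOfBoundsValue;
    }

    // Copy out one row, a 1-d "plane" of the 2-d data.
    double[] row(int y) {
        double[] out = new double[cols];
        for (int x = 0; x < cols; x++) out[x] = get(x, y);
        return out;
    }
}
```

With such a policy, a neighborhood algorithm can read one sample past every edge of the grid without special casing the borders.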
Use many different data types (133 and counting)
- signed integers (bits: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 32, 64, 128, unbounded)
- unsigned integers (bits: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 32, 64, 128)
- rational numbers (unbounded)
- floats (bits: 16, 32, 64, 128, unbounded)
- gaussian integers (bits: 8, 16, 32, 64, unbounded)
- complex numbers (bits: 16, 32, 64, 128)
- quaternions (bits: 16, 32, 64, 128)
- octonions (bits: 16, 32, 64, 128)
- Unicode16 characters and variable length strings and fixed length strings
- booleans
- n-dimensional real Points
- ARGB, RGB, and CIE LAB tuples
- vectors and matrices and tensors made of the various types
Define your own data types
Do you have a custom data type (like RNA base pairs)? Can you define operators
between elements (like when two base pairs are equal)? Then you can define
their algebra and reuse any Zorbage algorithms that accept similarly defined
types.
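A rough sketch of that idea in plain Java follows. Every name here (`EqualityAlgebra`, `RnaBase`, `RnaBaseAlgebra`, `CountMatches`) is hypothetical and chosen for illustration; Zorbage's real algebra interfaces are richer and named differently. The point is only that the algorithm talks to the algebra, never to the concrete type.

```java
import java.util.List;

// Hypothetical sketch, not Zorbage's API: an "algebra" that knows how to
// compare two elements of some type T.
interface EqualityAlgebra<T> {
    boolean isEqual(T a, T b);
}

// A user defined type: the four RNA bases.
enum RnaBase { A, C, G, U }

// The algebra for the user defined type.
final class RnaBaseAlgebra implements EqualityAlgebra<RnaBase> {
    @Override
    public boolean isEqual(RnaBase a, RnaBase b) {
        return a == b;
    }
}

// A generic algorithm: it works for any type that supplies an EqualityAlgebra.
final class CountMatches {
    static <T> long compute(EqualityAlgebra<T> algebra, List<T> values, T target) {
        long count = 0;
        for (T v : values) {
            if (algebra.isEqual(v, target)) count++;
        }
        return count;
    }
}
```

Usage would look like `CountMatches.compute(new RnaBaseAlgebra(), bases, RnaBase.A)`, and the same method would accept any other type that comes with its own equality algebra.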
A conversion API exists for moving between types accurately and efficiently.
There are types that are compound: complex numbers, quaternions, and
octonions based on any of the floating types.
Types can be stored in arrays, files, sparse structures, and JDBC
database tables.
You can allocate and use huge (length up to 2^63 elements) data structures.
Data access revolves around 1-d lists that act like arrays. Arrays can be
concatenated, trimmed, subsampled, masked, padded, or made read-only, among
other abstractions. At its heart, each multidim data source has one of
these arrays behind it. These arrays are indexed by longs rather than
ints, which breaks many of the limitations of Java arrays.
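To make long indexing concrete, here is a minimal sketch of a list addressed by `long` and backed by several ordinary Java arrays. The class `ChunkedDoubleList` and the chunk size are assumptions made for this example; this is not how Zorbage's own storage classes are necessarily written.

```java
// Hypothetical sketch, not Zorbage's API: a list of doubles addressed by
// long indices. It is built from fixed-size chunks so the total length is
// not capped by the int index of a single Java array.
final class ChunkedDoubleList {
    private static final int CHUNK = 1 << 16; // elements per chunk (arbitrary choice)
    private final double[][] chunks;
    private final long size;

    ChunkedDoubleList(long size) {
        if (size < 0) throw new IllegalArgumentException("negative size");
        this.size = size;
        int n = (int) ((size + CHUNK - 1) / CHUNK);
        this.chunks = new double[n][];
        for (int i = 0; i < n; i++) {
            long remaining = size - (long) i * CHUNK;
            chunks[i] = new double[(int) Math.min((long) CHUNK, remaining)];
        }
    }

    long size() {
        return size;
    }

    double get(long index) {
        check(index);
        return chunks[(int) (index / CHUNK)][(int) (index % CHUNK)];
    }

    void set(long index, double value) {
        check(index);
        chunks[(int) (index / CHUNK)][(int) (index % CHUNK)] = value;
    }

    private void check(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

An in-memory version like this is still limited by available RAM; the same `long`-indexed interface could sit in front of a file or a database table.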
Array storage can use native types (for speed of access) or, for many
integer types, bit encoding (to save space).
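As an illustration of bit encoding, the sketch below packs unsigned 4-bit integers sixteen to a `long`. The class name `PackedUInt4List` and its layout are assumptions for this example and are not taken from Zorbage's source.

```java
// Hypothetical sketch, not Zorbage's API: unsigned 4-bit integers packed
// sixteen to a 64-bit word, trading a little arithmetic for 16x less space
// than one long per value.
final class PackedUInt4List {
    private final long[] words;
    private final int size;

    PackedUInt4List(int size) {
        this.size = size;
        this.words = new long[(size + 15) / 16];
    }

    int size() {
        return size;
    }

    int get(int index) {
        int shift = (index % 16) * 4;
        return (int) ((words[index / 16] >>> shift) & 0xFL);
    }

    void set(int index, int value) {
        if (value < 0 || value > 15) {
            throw new IllegalArgumentException("value does not fit in 4 bits: " + value);
        }
        int shift = (index % 16) * 4;
        long mask = 0xFL << shift;
        // Clear the old 4 bits, then write the new ones.
        words[index / 16] = (words[index / 16] & ~mask) | ((long) value << shift);
    }
}
```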
You can use existing or write your own generic algorithms that work with
all the types transparently. For instance Zorbage has one Sort algorithm
that can sort a list made of any of the above defined types while doing
no data conversions.
You can use existing or write your own algorithms that work with numbers,
vectors, matrices, and tensors. Zorbage includes hundreds of predefined
algorithms too. It can find roots, find derivatives, and solve differential
equations numerically involving arbitrary scalar and vector functions and
procedures. It also includes algorithms from linear algebra, signal
processing, statistics, set theory, and analysis.
Algorithms include:
- Basic vector and matrix and tensor operations
- Transcendental functions of various precision floats and matrices
- Basic Runge-Kutta ODE solver
- FFT and Inverse FFT
- Convolutions and Correlations
- Resampling algorithms
- LU Decomposition and LU Solving (simultaneous equation solving)
- Basic statistical functions
- Most C++ STL algorithms replicated
- Numerous parallel algorithms are provided for quickly processing data.
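For readers unfamiliar with the Runge-Kutta entry in the list above, here is a minimal sketch of the classical fourth-order method for a scalar equation dy/dt = f(t, y). It is written against plain `double` values with a hypothetical class name `Rk4`; Zorbage's own solver is generic over its types and is not reproduced here.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch, not Zorbage's API: classical fourth-order
// Runge-Kutta with a fixed step size for a scalar ODE dy/dt = f(t, y).
final class Rk4 {
    // Integrates from t0 over `steps` steps of size h and returns y at the end.
    static double solve(DoubleBinaryOperator f, double t0, double y0, double h, int steps) {
        double t = t0;
        double y = y0;
        for (int i = 0; i < steps; i++) {
            double k1 = f.applyAsDouble(t, y);
            double k2 = f.applyAsDouble(t + h / 2, y + h / 2 * k1);
            double k3 = f.applyAsDouble(t + h / 2, y + h / 2 * k2);
            double k4 = f.applyAsDouble(t + h, y + h * k3);
            y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4);
            t += h;
        }
        return y;
    }
}
```

For example, `Rk4.solve((t, y) -> y, 0.0, 1.0, 0.01, 100)` integrates dy/dt = y from t = 0 to t = 1 and should land very close to `Math.E`.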
Define complex data sampling algorithms from prebuilt sampling components.
Use type safe first class Function and Procedure objects. Pass them as
arguments to code that can transform your data quickly and generically.
Write methods that return Functions and Procedures if needed. Compute
values from user defined Functions and Procedures.
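The style described above, where procedures are objects and results are written into a caller-supplied output, can be sketched as follows. The names `DoubleBox`, `Procedure2`, and `Procedures` are hypothetical and only stand in for the kind of interfaces a library like Zorbage provides.

```java
// Hypothetical sketch, not Zorbage's API: a mutable holder for one value.
final class DoubleBox {
    double v;
}

// A procedure with one input and one output. The result is written into
// `out` rather than returned, so no new object is needed per call.
interface Procedure2<A, B> {
    void call(A in, B out);
}

final class Procedures {
    // A method that returns a Procedure.
    static Procedure2<DoubleBox, DoubleBox> scaleBy(double factor) {
        return (in, out) -> out.v = in.v * factor;
    }

    // Generic driver code that accepts any Procedure as an argument.
    static void transform(Procedure2<DoubleBox, DoubleBox> proc, double[] src, double[] dst) {
        DoubleBox in = new DoubleBox();
        DoubleBox out = new DoubleBox();
        for (int i = 0; i < src.length; i++) {
            in.v = src[i];
            proc.call(in, out);
            dst[i] = out.v;
        }
    }
}
```

Reusing the same two boxes across the whole loop is the design choice being illustrated: the procedure is type safe and first class, yet the inner loop allocates nothing.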
Use parsers to create Procedures from strings. These Procedures represent
equations that when fed values will compute a return value (the result of
applying the inputs to the equation). Equations can return numbers, vectors,
matrices, or tensors. The subcomponents of these can be reals, complexes,
quaternions, or octonions. Equations can be built out of numbers, input
variable references, constants (like E, PI, etc.) and typical functions
like sin(), atan(), exp(), log(), etc. For more info see:
https://github.com/bdezonia/zorbage/blob/master/EQUATION_LANGUAGE
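To show the general shape of "parse a string, get back something callable", here is a tiny recursive descent parser. It is a sketch under loud assumptions: the grammar (numbers, one variable named `x`, the four arithmetic operators, parentheses) is invented for this example and is not Zorbage's equation language, and `TinyEquationParser` is a hypothetical name. See the EQUATION_LANGUAGE document linked above for the real syntax.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch, not Zorbage's equation language: turns a string
// such as "2*x+1" into a function of one real variable x.
final class TinyEquationParser {
    private final String src;
    private int pos = 0;

    private TinyEquationParser(String text) {
        this.src = text.replace(" ", "");
    }

    static DoubleUnaryOperator parse(String text) {
        TinyEquationParser p = new TinyEquationParser(text);
        DoubleUnaryOperator f = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return f;
    }

    private boolean at(char c) {
        return pos < src.length() && src.charAt(pos) == c;
    }

    // expr := term (('+' | '-') term)*
    private DoubleUnaryOperator expr() {
        DoubleUnaryOperator left = term();
        while (at('+') || at('-')) {
            char op = src.charAt(pos++);
            DoubleUnaryOperator l = left, r = term();
            left = (op == '+')
                ? x -> l.applyAsDouble(x) + r.applyAsDouble(x)
                : x -> l.applyAsDouble(x) - r.applyAsDouble(x);
        }
        return left;
    }

    // term := factor (('*' | '/') factor)*
    private DoubleUnaryOperator term() {
        DoubleUnaryOperator left = factor();
        while (at('*') || at('/')) {
            char op = src.charAt(pos++);
            DoubleUnaryOperator l = left, r = factor();
            left = (op == '*')
                ? x -> l.applyAsDouble(x) * r.applyAsDouble(x)
                : x -> l.applyAsDouble(x) / r.applyAsDouble(x);
        }
        return left;
    }

    // factor := number | 'x' | '(' expr ')'
    private DoubleUnaryOperator factor() {
        if (at('x')) {
            pos++;
            return x -> x;
        }
        if (at('(')) {
            pos++;
            DoubleUnaryOperator inner = expr();
            if (!at(')')) throw new IllegalArgumentException("missing ')' at position " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("expected a number at position " + pos);
        double value = Double.parseDouble(src.substring(start, pos));
        return x -> value;
    }
}
```

The parse happens once; the returned function can then be fed as many input values as needed.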
Why Java?
- numerical computing in Java is underrepresented
- with Java there is no need to worry about memory or pointer bugs
- the JVM's JIT compiler can often avoid heap allocation for short-lived
objects (via escape analysis)
- Java is portable: Zorbage runs everywhere the JVM does
- Java is safer at runtime than C++ while still supporting good performance
- Java is faster at runtime than Python/R/Matlab while also avoiding simple
"oops, that typo in my code just killed my long run" situations
Why is Zorbage the way it is?
- procedural programming is efficient
- object oriented comforts can be used where needed
- generics allow for reusable algorithms. Often one can write one
algorithm and use it for reals, complex numbers, quaternions,
matrices, etc.
- there is no telling what one bored developer can get up to when
looking for fun
Zorbage breaks Java barriers
- in Zorbage "arrays" are indexed by 64-bit long integers rather
than 32-bit integers. Relatedly, data list sizes can reach
2^63.
- in Zorbage, due to JVM optimizations, you can pass primitives by reference
- supports multidimensional arrays
- breaks the limitation to integer types being in byte aligned sizes
- supports unsigned numbers
- supports 128-bit signed and unsigned integers
- supports 128-bit floating point numbers
- supports 16-bit floating point numbers
- supports high precision floating point numbers
- supports unbounded ints
- supports (unbounded) rational numbers
- builds reals, complexes, quaternions, and octonions from any
of the floating types. You can write one algorithm and substitute
one runtime parameter to calculate in 16 bit, 32 bit, 64 bit, 128
bit or seemingly unbounded accuracy.
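The last point, one algorithm with the number type chosen by a parameter, can be sketched in plain Java. The interface `RingAlgebra` and the classes below are hypothetical illustrations built on `Double` and `java.math.BigDecimal`; Zorbage's own types and algebras are different and considerably more complete.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical sketch, not Zorbage's API: the minimum an algorithm needs
// to know about a number type in order to add and multiply its values.
interface RingAlgebra<T> {
    T zero();
    T add(T a, T b);
    T multiply(T a, T b);
}

// 64-bit floating point arithmetic.
final class DoubleAlgebra implements RingAlgebra<Double> {
    @Override public Double zero() { return 0.0; }
    @Override public Double add(Double a, Double b) { return a + b; }
    @Override public Double multiply(Double a, Double b) { return a * b; }
}

// Arbitrary precision decimal arithmetic.
final class BigDecimalAlgebra implements RingAlgebra<BigDecimal> {
    @Override public BigDecimal zero() { return BigDecimal.ZERO; }
    @Override public BigDecimal add(BigDecimal a, BigDecimal b) { return a.add(b); }
    @Override public BigDecimal multiply(BigDecimal a, BigDecimal b) { return a.multiply(b); }
}

// One algorithm, written once, usable at either precision.
final class SumOfSquares {
    static <T> T compute(RingAlgebra<T> algebra, List<T> values) {
        T sum = algebra.zero();
        for (T v : values) {
            sum = algebra.add(sum, algebra.multiply(v, v));
        }
        return sum;
    }
}
```

Switching precision then means passing a different algebra object (and matching values) to `SumOfSquares.compute`, with no change to the algorithm itself.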
Zorbage is fast
- Zorbage can run numeric algorithms at speeds comparable to C++
- Some multithreaded algorithms are provided
- matrix multiplies
- resampling
- convolutions/correlations
- fills
- transforms
- data conversions
- apodizations
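As a rough picture of how an operation like a fill can be multithreaded, the sketch below splits an array into contiguous pieces and fills each piece on a parallel stream. It uses only the Java standard library; the class `ParallelFill` and the piece-splitting scheme are assumptions for illustration and do not describe Zorbage's actual threading code.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical sketch, not Zorbage's API: fill an array by handing
// non-overlapping index ranges to worker threads.
final class ParallelFill {
    static void compute(double[] data, double value, int pieces) {
        if (pieces < 1) throw new IllegalArgumentException("pieces must be >= 1");
        int n = data.length;
        IntStream.range(0, pieces).parallel().forEach(p -> {
            // Piece p covers [start, end); ranges tile the array exactly once.
            int start = (int) ((long) n * p / pieces);
            int end = (int) ((long) n * (p + 1) / pieces);
            Arrays.fill(data, start, end, value);
        });
    }
}
```

Because the ranges never overlap, the workers need no locking.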
Zorbage is flexible
- supports n-dimensional data sets with flexible out of bounds data
handling procedures
- the n-dimensional data sets can use the equational language to
carefully and powerfully calibrate their axes.
Is there a way to get data into a Zorbage backed application?
Code has been written to load ECAT scan data using the zorbage
ecat library into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-ecat
Code has been written to load Java audio data using the zorbage
jaudio library into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-jaudio
Code has been written to load raster GIS data using the GDAL library
into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-gdal
Code has been written to load raster GIS data using the NetCDF library
into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-netcdf
Code has been written to load Nifti scan data using the zorbage
nifti library into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-nifti
Code has been written to load NMR data using the zorbage NMR
library into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-nmr
Code has been written to load common raster data formats using
the SCIFIO library into Zorbage structures. You can find it here:
https://github.com/bdezonia/zorbage-scifio
Is there a way to view Zorbage data?
A data viewer is under development. An alpha version is available now.
You can find it here: https://github.com/bdezonia/zorbage-viewer
Programming notes
Java 11 info
Zorbage has been compiled and tested using Maven with OpenJDK Java 11 on Linux.
Java 8 info
Zorbage was previously compiled and tested using Maven with OpenJDK Java 8 on Linux.
The Zorbage code has been scrubbed of API calls later than Java 8 and the
Zorbage library API should be compilable with Java 8.
Java 7 info
At one time Zorbage was compiled and tested using Maven with Oracle Java 7 on the
Macintosh. Recently the Zorbage code was updated to take advantage of some Java 8
features, so it is possible it will not be callable from Java 7. YMMV.
Other platforms
Zorbage has been compiled and tested with many versions of Eclipse and some versions
of IntelliJ.
Acknowledgements
Thank you Dr. Pepper for your timely and generous contributions to the
project. Zorbage would not be the same without you.
Partial bibliography
Books:
A Book of Abstract Algebra; Pinter
A Student's Guide to Vectors and Tensors; Fleisch
Advanced Engineering Mathematics; Wylie, Barrett
Algorithms for Computer Algebra; Geddes
Basic Partial Differential Equations; Bleecker
Calculus and Analytic Geometry; Thomas, Finney
Compilers; Aho, Sethi, and Ullman
Complex Variables and Applications; Brown, Churchill
Complex Variables with Applications; Wunsch
Computer Algebra and Symbolic Computation (2 vols); Cohen
Design Patterns; Gamma, Helm, Johnson, Vlissides
Differential Equations; Ross
Differential Equations and Linear Algebra; Edwards and Penney
Digital Image Processing; Burger
Digital Image Processing; Gonzalez, Woods
Digital Signal Processing; Proakis
Discrete Time Signal Processing; Oppenheim and Schafer
Div, Grad, Curl, and All That; Schey
Elements of Modern Algebra; Gilbert
Elements of Programming; Stepanov
From Mathematics to Generic Programming; Stepanov
Functions of Matrices; Higham
Handbook of Math and Computational Science; Harris, Stocker
Haskell School of Expression
Haskell: the Craft of Functional Programming; Thompson
Introduction to Abstract Algebra; Dubisch
Introduction to Algorithms; Cormen, Leiserson, Rivest, Stein
Introduction to Computer Simulation Methods; Gould
Introduction to Octonion and Other Non-associative Algebras in Physics; Okubo
Linear Algebra and Analytic Geometry for Physical Sciences; Landi, Zampini
Matrix Algebra; Robbin
Modern Mathematical Methods for Physicists and Engineers; Cantrell
NMR Data Processing; Hoch and Stern
Numerical Methods for Engineers; Chapra
Numerical Methods for Engineers and Scientists; Gilat
Numerical Methods for Scientists and Engineers; Hamming
Numerical Recipes; Press et al.
Partial Differential Equations for Scientists and Engineers; Farlow
Probability and Statistics for Engineering and the Sciences; Devore
Programming Clojure; Miller, Halloway, Bedra
Real World Haskell
Rings, Fields, and Groups; Arnold
Schaum's Outline of Tensor Calculus; Kay
Scientific Computing With Python; Fuhrer, Solem, Verdier
Structure and Interpretation of Computer Programs; Abelson
Tensors Made Easy; Bernacchi
Tensor Calculus Made Simple; Sochi
The Art of Computer Programming; Knuth
Online manuals:
GNU Scientific Library documentation
Boost Library documentation
C++ STL documentation
Mathematica documentation
Matlab documentation
R documentation
Julia documentation
Haskell documentation
Ruby documentation
Pascal language definitions
Online articles:
Many articles on Mathworld and Wikipedia and StackExchange
Many other web pages and pdfs and slide presentations
Paying it forward
If you like Zorbage as a library or as a source of ideas then please
visit my daughter's band's BandCamp page, listen to their music, and
buy something if you like what you hear. Thanks.
https://dearmrwatterson.bandcamp.com/