CHANGES IN VERSION 2.40.0
-------------------------
BUG FIXES
o Make sure that internal helper coerceToCompressedList() always
propagates the mcols.
CHANGES IN VERSION 2.38.0
-------------------------
NEW FEATURES
o Add terminators(), same as promoters() but for terminator regions.
CHANGES IN VERSION 2.36.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Add link to revElements() in man page for reverse().
BUG FIXES
o Fix is.unsorted() methods for Compressed[Integer|Numeric]List
objects (they were never working since their introduction years
ago).
CHANGES IN VERSION 2.34.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Improve error handling in AtomicList constructors when input is too big.
CHANGES IN VERSION 2.32.0
-------------------------
NEW FEATURES
o splitAsList() can now perform a "dumb split", that is, when
no split factor is supplied, 'splitAsList(x)' is equivalent
to 'unname(splitAsList(x, seq_along(x)))' but is slightly more
efficient.
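
The equivalence stated above can be illustrated with a small language-neutral
sketch. This is plain Python, not the S4Vectors/IRanges code; the helper names
split_as_list() and unname() are hypothetical stand-ins that only model the
grouping behaviour described in the entry.

```python
def split_as_list(x, f=None):
    """Group the elements of x by the parallel factor f.

    With no factor, perform a "dumb split": each element becomes its
    own unnamed group, with no grouping work at all.
    """
    if f is None:
        return [[elt] for elt in x]
    groups = {}
    for elt, key in zip(x, f):
        groups.setdefault(key, []).append(elt)
    # Named result: groups ordered by factor level.
    return {key: groups[key] for key in sorted(groups)}


def unname(groups):
    """Drop the group names, keeping the groups in order."""
    return list(groups.values())


x = ["a", "b", "c"]
seq_along = range(1, len(x) + 1)
dumb = split_as_list(x)
explicit = unname(split_as_list(x, seq_along))
```

Both `dumb` and `explicit` hold one single-element group per input element;
the dumb split simply skips building and sorting the factor.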
SIGNIFICANT USER-VISIBLE CHANGES
o Add ellipsis argument (...) to the gaps() generic function.
CHANGES IN VERSION 2.30.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Like the DataFrame class defined in the S4Vectors package, classes
SimpleDataFrameList, CompressedDataFrameList, SimpleSplitDataFrameList,
and CompressedSplitDataFrameList, are now virtual. This completes the
replacement of DataFrame with DFrame announced in September 2019. See:
https://www.bioconductor.org/help/course-materials/2019/BiocDevelForum/02-DataFrame.pdf
CHANGES IN VERSION 2.28.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Replace dim(), nrow(), and ncol() methods for DataFrameList objects with
dims(), nrows(), and ncols() methods.
DEPRECATED AND DEFUNCT
o Deprecate dim(), nrow(), and ncol() methods for DataFrameList objects
in favor of the new dims(), nrows(), and ncols() methods.
CHANGES IN VERSION 2.26.0
-------------------------
NEW FEATURES
o Add commonColnames() accessor to get or set the character vector of
column names present in the individual DataFrames of a SplitDataFrameList
object.
o Implement unary + and - for AtomicList derivatives.
SIGNIFICANT USER-VISIBLE CHANGES
o Much improved error handling and messages in IRanges() constructor
function.
DEPRECATED AND DEFUNCT
o Remove RangesList() constructor (was deprecated in BioC 3.7 and defunct
in BioC 3.8).
BUG FIXES
o Fix unsplit() on named List objects.
o Fix findOverlapPairs() for missing subject (fixes #35).
o quantile() on an AtomicList object always returns a matrix (fixes #33).
o Fix which.min()/which.max() for CompressedNumericList objects (fixes #30).
o Export startsWith() and endsWith() methods for CharacterList/RleList
objects (fixes #26).
CHANGES IN VERSION 2.24.0
-------------------------
NEW FEATURES
o coverage() now supports 'method="naive"'. This is in addition to the
already supported methods "sort" and "hash". This new method is a slower
version of the "hash" method that has the advantage of avoiding floating
point artefacts in the no-coverage regions of the numeric-Rle object
returned by coverage() when the weights are supplied as a numeric vector
of type 'double'. See "FLOATING POINT ARITHMETIC CAN BRING A SURPRISE"
example in '?coverage'.
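
The kind of floating point artefact mentioned above can be reproduced with a
toy model. This is a sketch in plain Python under an assumption: the fast path
is modelled as a running sum of weight increments and decrements, which may
not match the actual C implementation behind the "sort" and "hash" methods.
The function names are hypothetical.

```python
from itertools import accumulate


def coverage_running_sum(ranges, weights, length):
    """Add each weight at the range start, subtract it just past the
    range end, then take a running sum over all positions."""
    delta = [0.0] * (length + 1)
    for (start, end), w in zip(ranges, weights):
        delta[start - 1] += w   # positions are 1-based
        delta[end] -= w
    return list(accumulate(delta))[:length]


def coverage_naive(ranges, weights, length):
    """For each position, sum the weights of the ranges that cover it."""
    return [
        sum((w for (start, end), w in zip(ranges, weights)
             if start <= pos <= end), 0.0)
        for pos in range(1, length + 1)
    ]


ranges = [(1, 3), (2, 4)]      # two overlapping ranges
weights = [0.1, 0.2]           # weights of type double
fast = coverage_running_sum(ranges, weights, 6)
naive = coverage_naive(ranges, weights, 6)
```

Positions 5 and 6 are covered by no range. The naive result is exactly 0
there, while the running sum leaves a tiny non-zero residue because
0.1 + 0.2 - 0.1 - 0.2 is not exactly 0 in double precision.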
DEPRECATED AND DEFUNCT
o Removed RangedData class and anything related to RangedData objects.
BUG FIXES
o Fix bug in list element recycling.
CHANGES IN VERSION 2.22.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Resync with change to smoothEnds() in R 4.0.
In R 4.0, stats::smoothEnds() always returns an integer vector
when the input is an integer vector. smoothEnds() on an IntegerList now
reflects this: it returns an IntegerList object instead of a NumericList
object.
DEPRECATED AND DEFUNCT
o RangedData objects are now defunct.
RangedData objects are defunct in BioC 3.11. They were deprecated in BioC
3.9 and, before that, their use has been discouraged in favor of GRanges
or GRangesList objects since BioC 2.12, that is, since 2014.
BUG FIXES
o Fix restrict() method for RangesList objects for when ranges are dropped.
CHANGES IN VERSION 2.20.0
-------------------------
NEW FEATURES
o IPos objects now exist in 2 flavors: UnstitchedIPos and StitchedIPos.
IPos is now a virtual class with 2 concrete subclasses: UnstitchedIPos
and StitchedIPos. In an UnstitchedIPos instance the positions are stored
as an integer vector. In a StitchedIPos instance, like with old IPos
instances, the positions are stored as an IRanges object where each range
represents a run of consecutive positions. See ?IPos for more information.
Old serialized IPos instances need to be converted to StitchedIPos
instances with updateObject().
o IPos objects now can hold names
o The IRanges() and IPos() constructors now accept user-supplied metadata
columns
o Add grep(), startsWith() and endsWith() methods for CharacterList
objects
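
The two IPos storage layouts described above can be pictured with a short
sketch. This is plain Python, not the IRanges internals; stitch() and
unstitch() are hypothetical names used only to show how an integer vector of
positions relates to runs of consecutive positions.

```python
def stitch(positions):
    """Collapse a vector of integer positions into (start, end) runs,
    one run per stretch of consecutive positions."""
    runs = []
    for pos in positions:
        if runs and pos == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], pos)   # extend the current run
        else:
            runs.append((pos, pos))         # start a new run
    return runs


def unstitch(runs):
    """Expand (start, end) runs back into individual positions."""
    return [pos for start, end in runs for pos in range(start, end + 1)]


positions = [4, 5, 6, 10, 11, 20]   # "unstitched" layout
runs = stitch(positions)            # "stitched" layout
```

The stitched layout is compact when most positions are adjacent; the
unstitched layout stores every position explicitly.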
SIGNIFICANT USER-VISIBLE CHANGES
o as.data.frame(IRanges) now propagates the metadata columns
o Move splitAsList() to the S4Vectors package
o Move S4 class "atomic" from the S4Vectors package
o No longer export %in% (was a leftover from an older time when the
package was defining an %in% method)
DEPRECATED AND DEFUNCT
o After being deprecated in BioC 3.9, the following RangedData methods
are now defunct: findOverlaps, rownames<-, colnames<-, columnMetadata,
columnMetadata<-, c, rbind, as.env, as.data.frame, and coercion from
RangedData to DataFrame.
o Remove the following RangedData methods:
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
These methods were deprecated in BioC 3.8 and defunct in BioC 3.9.
BUG FIXES
o Fix integer overflow issue in end() setter for IRanges objects.
CHANGES IN VERSION 2.18.0
-------------------------
NEW FEATURES
o Add some methods for CharacterList derivatives (nchar, substring,
substr, chartr, toupper, tolower, sub, gsub, grepl).
DEPRECATED AND DEFUNCT
o Deprecate RangedData objects.
The use of RangedData objects has been discouraged in favor of GRanges
or GRangesList objects since BioC 2.12, that is, since 2014. Developers
are required to migrate their code to use GRanges or GRangesList instead
of RangedData objects (the GRanges and GRangesList classes are defined
in the GenomicRanges package).
o Several RangedData methods are now defunct (after being deprecated in
BioC 3.8):
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
BUG FIXES
o Fix unlist() on a SimpleRleList object of length 0
o Fix drop() for FactorList derivatives
o Fix removed rownames upon replacing in a SplitDataFrameList
CHANGES IN VERSION 2.16.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Optimize unlist() on Views objects.
o Optimize range(), any() and all() on CompressedRleList objects.
o Optimize start(), end(), width() setters on CompressedRangesList objects.
DEPRECATED AND DEFUNCT
o Deprecate several RangedData methods:
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
RangedData objects will be deprecated in BioC 3.9 (their use has been
discouraged since BioC 2.12, that is, since 2014). Package developers
that are still using RangedData objects need to migrate their code to
use GRanges or GRangesList objects instead.
o The RangesList() constructor is now defunct (after being deprecated in
BioC 3.7).
BUG FIXES
o Fix DF[IRanges(...), ] on a DataFrame with data.frame columns.
o Make [[, as.list(), lapply(), and unlist() fail more gracefully on
an IRanges object.
o NCList objects now properly support c().
CHANGES IN VERSION 2.14.0
-------------------------
NEW FEATURES
o Add the windows() generic with various methods. This is a "parallel"
version of window() for list-like objects i.e. it does
'mendoapply(window, x, start, end, width)' but uses a fast
implementation.
Also add heads() and tails() as convenience wrappers around windows().
They do 'mendoapply(head, x, n)' and 'mendoapply(tail, x, n)',
respectively, but use a fast implementation. They're replacements for
S4Vectors::phead() and S4Vectors::ptail() which are now deprecated.
o Add equisplit() to split a vector-like object into a specified number
of partitions with equal (total) width. This is useful for instance to
ensure balanced loading of workers in parallel evaluation.
o promoters() arguments 'upstream' and 'downstream' now can be integer
vectors parallel to 'x' (for consistency with the other intra range
transformations).
o The promoters() generic and methods get the 'use.names' argument.
o Add "resize", "flank", and "restrict" methods for Views objects.
o Add "as.integer" method for Pos objects (equivalent to pos()).
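
The "parallel" behaviour of windows(), heads(), and tails() described above
can be sketched as follows. This is a plain Python model on lists of lists,
not the IRanges implementation; it only handles non-negative 'n' and 1-based
inclusive 'start'/'end', and the helper _recycle() is hypothetical.

```python
def _recycle(arg, n):
    """Turn a scalar into a list parallel to x; pass lists through."""
    return list(arg) if isinstance(arg, (list, tuple)) else [arg] * n


def windows(x, start=1, end=None):
    """Apply a 1-based, inclusive window to each list element of x.
    'start' and 'end' may be scalars or lists parallel to x."""
    starts = _recycle(start, len(x))
    ends = _recycle(end, len(x))
    return [
        elt[s - 1:(len(elt) if e is None else e)]
        for elt, s, e in zip(x, starts, ends)
    ]


def heads(x, n):
    """First n items of each list element (n >= 0)."""
    return [elt[:k] for elt, k in zip(x, _recycle(n, len(x)))]


def tails(x, n):
    """Last n items of each list element (n >= 0)."""
    return [elt[max(len(elt) - k, 0):]
            for elt, k in zip(x, _recycle(n, len(x)))]


x = [[1, 2, 3, 4], [5, 6], [7, 8, 9]]
first_two = heads(x, 2)
last_two = tails(x, 2)
middle = windows(x, start=2, end=[3, 2, 3])
```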
SIGNIFICANT USER-VISIBLE CHANGES
o The Ranges virtual class is now the common parent of the IRanges,
GRanges, and GAlignments classes (GRanges and GAlignments are defined
in the GenomicRanges and GenomicAlignments packages, respectively).
More precisely, Ranges is a virtual class that now serves as the parent
class for any class that represents a vector of ranges. The ranges can
be integer ranges (i.e. ranges on the space of integers) like in an
IRanges object, or genomic ranges (i.e. ranges on a genome) like in a
GRanges object. Note that because Ranges extends List, all Ranges
derivatives are considered list-like objects. This means that GRanges
objects and their derivatives are considered list-like objects, which
is new (even though [[ doesn't work on them yet, this will be implemented
in Bioconductor 3.8).
o Similarly the RangesList virtual class is now the common parent of the
IRangesList, GRangesList, and GAlignmentsList classes.
o IRanges objects don't support [[, unlist(), as.list(), lapply(), and
as.integer() anymore. This is a temporary situation only. These
operations will be re-introduced in Bioconductor 3.8 but with a
different semantic. The overall goal of all these changes is to bring
more consistency between IRanges and GRanges objects (GRanges objects will
also support [[, unlist(), as.list(), and lapply() in Bioconductor 3.8).
Non-exported IRanges:::unlist_as_integer() helper is a temporary
replacement for what unlist() and as.integer() used to do on an IRanges
object.
o Move the pos() generic to BiocGenerics.
o Switch order of breakInChunks() arguments 'chunksize' and 'nchunk' to be
consistent with tileGenome().
o tile() and slidingWindows() now preserve names.
o Optimize [[<- on a CompressedList object. Was very inefficient. The
optimized method can be up to 100x faster or more on a long object.
o All the S4Vectors-specific material in the IRangesOverview.Rnw vignette
has moved to the new S4VectorsOverview.Rnw vignette located in the
S4Vectors package.
DEPRECATED AND DEFUNCT
o Deprecate the RangesList() constructor. IRangesList() should be used
instead.
o The "ranges" methods for Hits and HitsList objects are now defunct
(were deprecated in BioC 3.6).
o The "overlapsAny", "subsetByOverlaps", "coverage" and "range" methods
for RangedData objects are now defunct (were deprecated in BioC 3.6).
o The universe() getter and setter as well as the 'universe' argument of
the RangesList(), IRangesList(), RleViewsList(), and RangedData()
constructor functions are now defunct (were deprecated in BioC 3.6).
CHANGES IN VERSION 2.12.0
-------------------------
NEW FEATURES
o Add IPos objects for storing a set of integer positions where most of
the positions are typically (but not necessarily) adjacent.
o Add coercion of a character vector or factor representing ranges (e.g.
"22-155") to an IRanges object, as well as "as.character" and "as.factor"
methods for Ranges objects.
o Introduce overlapsRanges() as a replacement for "ranges" methods for
Hits and HitsList objects, and deprecate the latter.
o Add "is.unsorted" method for Ranges objects.
o Add "ranges" method for Ranges objects (downgrade the object to an
IRanges instance and drop its metadata columns).
o Add 'use.names' and 'use.mcols' args to ranges() generic.
SIGNIFICANT USER-VISIBLE CHANGES
o Change 'maxgap' and 'minoverlap' defaults for findOverlaps() and family
(i.e. countOverlaps(), overlapsAny(), and subsetByOverlaps()). This
change addresses 2 long-standing issues:
(1) by default zero-width ranges are not excluded anymore, and
(2) control of zero-width ranges and adjacent ranges is finally
decoupled (only partially though).
New default for 'minoverlap' is 0 instead of 1. New default for 'maxgap'
is -1 instead of 0. See ?findOverlaps for more information about 'maxgap'
and the meaning of -1. For example, if 'type' is "any", you need to set
'maxgap' to 0 if you want adjacent ranges to be considered as overlapping.
Note that poverlaps() still uses the old 'maxgap' and 'minoverlap'
defaults.
o subsetByOverlaps() first 2 arguments are now named 'x' and 'ranges'
(instead of 'query' and 'subject') for consistency with the
transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps()
functions from the GenomicFeatures package and with the snpsByOverlaps()
function from the BSgenome package.
o Replace ifelse() generic and methods with ifelse2() (eager semantics).
o Coercion from Ranges to IRanges now propagates the metadata columns.
o Move rglist() generic from GenomicRanges to IRanges package.
o The "union", "intersect", and "setdiff" methods for Ranges objects
don't act like endomorphisms anymore: now they always return an
IRanges *instance* whatever Ranges derivatives are passed to them
(e.g. NCList or NormalIRanges).
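
The effect of the new 'maxgap' and 'minoverlap' defaults listed above can be
illustrated with a simplified model of the type="any" case. This is a plain
Python sketch, not the findOverlaps() code. It assumes ranges are
(start, end) pairs with inclusive ends, that a zero-width range is written
with end = start - 1, and that the gap between non-disjoint ranges is -1 by
convention. All function names are hypothetical.

```python
def shared_positions(a, b):
    """Number of positions common to a and b (<= 0 if none)."""
    return min(a[1], b[1]) - max(a[0], b[0]) + 1


def gap(a, b):
    """Positions separating a and b: 0 if adjacent, -1 if not disjoint."""
    shared = shared_positions(a, b)
    if shared > 0:
        return -1
    # A zero-width range strictly inside the other range is not disjoint.
    for z, o in ((a, b), (b, a)):
        if z[1] == z[0] - 1 and o[0] < z[0] and z[1] < o[1]:
            return -1
    return -shared


def is_hit(a, b, maxgap=-1, minoverlap=0):
    """Model of the type="any" test with the new defaults."""
    return (gap(a, b) <= maxgap
            and max(shared_positions(a, b), 0) >= minoverlap)


wide, zero_width = (1, 10), (5, 4)
left, right = (1, 5), (6, 10)          # adjacent ranges

new_zero = is_hit(zero_width, wide)                           # new defaults
old_zero = is_hit(zero_width, wide, maxgap=0, minoverlap=1)   # old defaults
adjacent_default = is_hit(left, right)
adjacent_maxgap0 = is_hit(left, right, maxgap=0)
```

In this model the zero-width range is a hit under the new defaults but not
under the old ones, and adjacent ranges only become hits once 'maxgap' is
set to 0.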
DEPRECATED AND DEFUNCT
o Deprecate "ranges" methods for Hits and HitsList objects (replaced with
overlapsRanges()).
o Deprecate the "overlapsAny", "subsetByOverlaps", "coverage" and "range"
methods for RangedData objects.
o Deprecate the universe() getter and setter as well as the 'universe'
argument of the RangesList(), IRangesList(), RleViewsList(), and
RangedData() constructor functions.
o Default "togroup" method is now defunct (was deprecated in BioC 3.3).
o Remove grouplength() (was deprecated in BioC 3.3 and replaced with
grouplengths, then defunct in BioC 3.4).
BUG FIXES
o nearest() and distanceToNearest() now call findOverlaps() internally
with maxgap=0 and minoverlap=0. This fixes incorrect results obtained
in some situations e.g. in the situation reported here:
https://support.bioconductor.org/p/99369/ (zero-width ranges)
but also in this situation:
nearest(IRanges(5, 10), IRanges(1, 4:5), select="all")
where the 2 ranges in the subject are *both* nearest to the 5-10 range.
o Fix restrict() and reverse() on IRanges objects with metadata columns.
o Fix table() on Ranges objects.
o Various other minor fixes.
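
The nearest() example in the first bug fix above can be checked by hand with
a small sketch. This is plain Python, not the IRanges code. It assumes the
distance between two ranges is the number of positions separating them, with
overlapping and adjacent ranges both at distance 0; the function names are
hypothetical.

```python
def distance(a, b):
    """Positions separating two (start, end) ranges; 0 when the ranges
    overlap or are adjacent."""
    return max(max(a[0], b[0]) - min(a[1], b[1]) - 1, 0)


def nearest_all(query, subject):
    """1-based indices of every subject range at minimal distance."""
    dists = [distance(query, s) for s in subject]
    best = min(dists)
    return [i + 1 for i, d in enumerate(dists) if d == best]


query = (5, 10)                 # IRanges(5, 10)
subject = [(1, 4), (1, 5)]      # IRanges(1, 4:5)
hits = nearest_all(query, subject)
```

The range 1-4 is adjacent to 5-10 and the range 1-5 overlaps it, so both are
at distance 0 and both are reported.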
CHANGES IN VERSION 2.10.0
-------------------------
NEW FEATURES
o "range" methods now have a 'with.revmap' argument (like "reduce" and
"disjoin" methods).
o Add coercion from list-like objects to IRangesList objects.
o Add "table" method for SimpleAtomicList objects.
o The "gaps" method for CompressedIRangesList objects now uses a chunk
processing strategy if the input object has more than 10 million list
elements. The hope is to reduce memory usage on very big input objects.
DEPRECATED AND DEFUNCT
o Remove the RangedDataList and RDApplyParams classes, rdapply(), and the
"split" and "reduce" methods for RangedData objects. All these things
were defunct in BioC 3.4.
o Remove 'ignoreSelf' and 'ignoreRedundant' arguments (replaced by
'drop.self' and 'drop.redundant') from findOverlaps,Vector,missing method
(were defunct in BioC 3.4).
o Remove GappedRanges class (was defunct in BioC 3.4).
BUG FIXES
o Fix "setdiff" method for CompressedIRangesList for when all ranges are
empty.
o Fix long standing bug in coercion from Ranges to PartitioningByEnd when
the object to coerce has names.
CHANGES IN VERSION 2.8.0
------------------------
NEW FEATURES
o "disjoin" methods now support 'with.revmap' argument.
o Add 'invert' argument to subsetByOverlaps(), like grep()'s invert.
o Add "unstrsplit" method for RleList objects.
o findOverlapPairs() allows 'subject' to be missing for self pairing.
o Add "union", "intersect" and "setdiff" methods for Pairs.
o Add distance,Pairs,missing method.
o Add ManyToManyGrouping, with coercion targets from FactorList and
DataFrame.
o Add Hits->List and Hits->(ManyToMany)Grouping coercions.
o Add "as.matrix" method for AtomicList objects.
o Add "selfmatch", "duplicated", "order", "rank", and "median" methods
for CompressedAtomicList objects.
o Add "anyNA" method for CompressedAtomicList objects that ensures
recursive=FALSE.
o Add "mean" method for CompressedRleList objects.
o Support 'global' argument on "which.min" and "which.max" methods for
CompressedAtomicList objects.
SIGNIFICANT USER-VISIBLE CHANGES
o Make mstack,Vector method more consistent with stack,List method.
o Optimize and document coercion from AtomicList to RleViews objects.
DEPRECATED AND DEFUNCT
o Are now defunct (were deprecated in BioC 3.3):
- RangedDataList objects.
- RDApplyParams objects and rdapply().
- The "split" and "reduce" methods for RangedData objects.
- The 'ignoreSelf' and/or 'ignoreRedundant' arguments of the
findOverlaps,Vector,missing method (a.k.a. "self findOverlaps" method).
- grouplength()
- GappedRanges objects.
BUG FIXES
o Fix special meaning of findOverlaps's maxgap argument when type="within".
o isDisjoint(IRangesList()) now returns logical(0) instead of NULL.
o Fixes to regroup() and Grouping construction.
o Fix rank,CompressedAtomicList method.
o Fix fromLast=TRUE for duplicated,CompressedAtomicList method.
CHANGES IN VERSION 2.6.0
------------------------
NEW FEATURES
o Add regroup() function.
SIGNIFICANT USER-VISIBLE CHANGES
o Remove 'algorithm' argument from findOverlaps(), countOverlaps(),
overlapsAny(), subsetByOverlaps(), nearest(), distanceToNearest(),
findCompatibleOverlaps(), countCompatibleOverlaps(), findSpliceOverlaps(),
summarizeOverlaps(), Union(), IntersectionStrict(), and
IntersectionNotEmpty(). The argument was added in BioC 3.1 to facilitate
the transition from an Interval Tree to a Nested Containment Lists
implementation of findOverlaps() and family. The transition is over.
o Restore 'maxgap' special meaning (from BioC < 3.1) when calling
findOverlaps() (or other member of the family) with 'type' set to
"within".
o No more limit on the max depth of *on-the-fly* NCList objects. Note that
the limit remains and is still 100000 when the user explicitly calls the
NCList() or GNCList() constructor.
o Rename 'ignoreSelf' and 'ignoreRedundant' argument of the
findOverlaps,Vector,missing method -> 'drop.self' and 'drop.redundant'.
The old names are still working but deprecated.
o Rename grouplength() -> grouplengths() (old name still available but
deprecated).
o Modify "replaceROWS" method for IRanges objects so that the replaced
elements in 'x' get their metadata columns from 'value'. See this thread
on bioc-devel:
https://stat.ethz.ch/pipermail/bioc-devel/2015-November/008319.html
o Optimized which.min() and which.max() for atomic lists.
o Remove the ellipsis (...) from all the setops methods, except the methods
for Pairs objects.
o Add "togroup" method for ManyToOneGrouping objects and deprecate default
method.
o Modernize "show" method for Ranges objects: now they're displayed more
like GRanges objects.
o Coercion from IRanges to NormalIRanges now propagates the metadata
columns when the object to coerce is already normal.
o Don't export CompressedHitsList anymore from the IRanges package. This
doesn't seem to be used at all and it's not clear that we need it.
DEPRECATED AND DEFUNCT
o Deprecate RDApplyParams objects and rdapply().
o Deprecate RangedDataList objects.
o Deprecate the "reduce" method for RangedData objects.
o Deprecate GappedRanges objects.
o Deprecate the 'ignoreSelf' and 'ignoreRedundant' arguments of the
findOverlaps,Vector,missing method in favor of the new 'drop.self' and
'drop.redundant' arguments.
o Deprecate grouplength() in favor of grouplengths().
o Default "togroup" method is deprecated.
o Remove IntervalTree and IntervalForest classes and methods (were defunct
in BioC 3.2).
o Remove mapCoords() and pmapCoords() generics (were defunct in BioC 3.2).
o Remove all "updateObject" methods (they were all obsolete).
BUG FIXES
o Fix segfault when calling window() on an Rle object of length 0.
o Fix "which.min" and "which.max" methods for IntegerList, NumericList,
and RleList objects when 'x' is empty or contains empty list elements.
o Fix mishandling of zero-width ranges when calling findOverlaps() (or
other member of the family) with 'type' set to "within".
o Various fixes to "countOverlaps" method for Vector,missing. See svn
commit message for commit 116112 for the details.
o Fix validity method for NormalIRanges objects (was not checking anything).
CHANGES IN VERSION 2.4.0
------------------------
NEW FEATURES
o Add "cbind" methods for binding Rle or RleList objects together.
o Add coercion from Ranges to RangesList.
o Add "paste" method for CompressedAtomicList objects.
o Add "expand" method for Vector objects for expanding a Vector object
'x' based on a column in mcols(x).
o Add overlapsAny,integer,Ranges method.
o "coverage" methods now accept 'shift' and 'weight' supplied as an Rle.
SIGNIFICANT USER-VISIBLE CHANGES
o The following was moved to S4Vectors:
- The FilterRules stuff.
- The "aggregate" methods.
- The "split" methods.
o The "sum", "min", "max", "mean", "any", and "all" methods on
CompressedAtomicList objects are 100X faster on lists with 500k elements,
80X faster for 50k elements.
o Tweak "c" method for CompressedList objects to make sure it always
returns an object of the same class as its 1st argument.
o NCList() constructor now propagates the metadata columns.
DEPRECATED AND DEFUNCT
o RangedData/RangedDataList are not formally deprecated yet but the
documentation now officially declares them as superseded by
GRanges/GRangesList and discourages their use.
o After being deprecated in BioC 3.1, IntervalTree and IntervalForest
objects and the "intervaltree" algorithm in findOverlaps() are now
defunct.
o After being deprecated in BioC 3.1, mapCoords() and pmapCoords() are
now defunct.
o Remove seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby()
(were defunct in BioC 3.1).
BUG FIXES
o Fix FactorList() constructor when 'compress=TRUE' (note that the levels
are combined during compression).
o Fix c() on CompressedFactorList objects (was returning a
CompressedIntegerList object).
CHANGES IN VERSION 2.2.0
------------------------
NEW FEATURES
o Add NCList() and NCLists() for preprocessing a Ranges or RangesList
object into an NCList or NCLists object that can be used for fast overlap
search with findOverlaps(). NCList() and NCLists() are replacements for
IntervalTree() and IntervalForest() that use Nested Containment Lists
instead of interval trees. For a one time use, it's not advised to
explicitly preprocess the input. This is because findOverlaps() or
countOverlaps() will take care of it and do a better job at it (that is,
they preprocess only what's needed when it's needed and release memory
as they go).
o Add coercion methods from Hits to CompressedIntegerList, to
PartitioningByEnd, and to Partitioning.
SIGNIFICANT USER-VISIBLE CHANGES
o The code behind overlap-based operations like findOverlaps(),
countOverlaps(), subsetByOverlaps(), summarizeOverlaps(), nearest(),
etc... was refactored and improved. Some highlights on what has
changed:
- The underlying code used for finding/counting overlaps is now based
on the Nested Containment List algorithm by Alexander V.
Alekseyenko and Christopher J. Lee.
- The old algorithm based on interval trees is still available (but
deprecated). The 'algorithm' argument was added to most overlap-based
operations to let the user choose between the new (algorithm="nclist",
the default) and the old (algorithm="intervaltree") algorithm.
- With the new algorithm, the hits returned by findOverlaps() are not
fully ordered (i.e. ordered by queryHits and subjectHits) anymore,
but only partially ordered (i.e. ordered by queryHits only). Other
than that, and except for the 3 particular situations mentioned below,
choosing one or the other doesn't affect the output, only performance.
- Either the query or subject can be preprocessed with NCList() for
a Ranges object (replacement for IntervalTree()), NCLists() for a
RangesList object (replacement for IntervalForest()), and GNCList()
for a GenomicRanges object (replacement for GIntervalTree()).
However, for a one time use, it's not advised to explicitly preprocess
the input. This is because findOverlaps() or countOverlaps() will take
care of it and do a better job at it (that is, they preprocess only
what's needed when it's needed and release memory as they go).
- With the new algorithm, countOverlaps() on Ranges or GenomicRanges
objects doesn't call findOverlaps() to collect all the hits in a
growing Hits object and count them only at the end. Instead the
counting happens at the C level and the hits are not kept. This
reduces memory usage considerably when there are a lot of hits.
- When 'minoverlap=0', zero-width ranges are interpreted as insertion
points and are considered to overlap with ranges that contain them.
This is the 1st situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- When using 'select="arbitrary"', the new algorithm will generally
not select the same hits as the old algorithm. This is the 2nd
situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- When using the old interval tree algorithm, 'maxgap' has a special
meaning if 'type' is "start", "end", or "within". This is not yet
the case with the new algorithm. That feature seems somewhat useful
though so maybe the new algorithm should also support it? Anyway,
this is the 3rd situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- Objects preprocessed with NCList(), NCLists(), and GNCList() are
serializable.
o The RleViewsList() constructor function now reorders its 'rleList'
argument so that its names match the names on the 'rangesList' argument.
o Minor changes to breakInChunks():
- Add 'nchunk' arg.
- Now returns a PartitioningByEnd instead of a PartitioningByWidth object.
- Now accepts 'chunksize' of 0 if 'totalsize' is 0.
o 300x speedup or more when doing unique() on a CompressedRleList object.
o 20x speedup or more when doing unlist() on a SimpleRleList object.
o Moved the RleTricks.Rnw vignette to the S4Vectors package.
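
The breakInChunks() changes listed above (a result expressed as chunk ends,
and a 'chunksize' of 0 accepted when 'totalsize' is 0) can be pictured with
a short sketch. This is plain Python, not the IRanges code: it only models
the 'chunksize' case, returns the cumulative chunk ends as a list, and the
function names are hypothetical.

```python
def break_in_chunks(totalsize, chunksize):
    """Return the cumulative end position of each chunk, in the style of
    a partitioning stored "by end"."""
    if totalsize == 0:
        return []                  # nothing to split, chunksize may be 0
    if chunksize <= 0:
        raise ValueError("chunksize must be positive when totalsize > 0")
    ends = list(range(chunksize, totalsize, chunksize))
    ends.append(totalsize)         # last chunk may be shorter
    return ends


def widths(ends):
    """Recover the chunk widths from the cumulative ends."""
    return [end - prev for prev, end in zip([0] + ends[:-1], ends)]


ends = break_in_chunks(10, 4)
empty = break_in_chunks(0, 0)
```

Storing ends rather than widths lets a chunk's boundaries be read off
directly, without a cumulative sum.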
DEPRECATED AND DEFUNCT
o Deprecated mapCoords() and pmapCoords(). They're replaced by
mapToTranscripts() and pmapToTranscripts() from the GenomicFeatures
package and mapToAlignments() and pmapToAlignments() from the
GenomicAlignments package.
o Deprecated IntervalTree and IntervalForest objects.
o seqapply(), seqby(), seqsplit(), etc are now defunct (were deprecated in
IRanges 2.0.0).
o Removed map(), pmap(), and splitAsListReturnedClass() (were defunct in
IRanges 2.0.0).
o Removed 'with.mapping' argument from reduce() methods (was defunct in
IRanges 2.0.0).
BUG FIXES
o findOverlaps,Vector,missing method now accepts extra arguments via ...
so for example one can specify 'ignore.strand=TRUE' when calling it on a
GRanges object (before that, 'findOverlaps(gr, ignore.strand=TRUE)'
would fail).
o PartitioningByEnd() and PartitioningByWidth() constructors now check
that, when 'x' is an integer vector, it cannot contain NAs or negative
values.
CHANGES IN VERSION 2.0.0
------------------------
NEW FEATURES
o Add mapCoords() and pmapCoords() as replacements for map() and pmap().
o Add coercion from list to RangesList.
o Add slice,ANY method as a convenience for slice(as(x, "Rle"), ...).
o Add mergeByOverlaps(); acts like base::merge as far as it makes sense.
o Add overlapsAny,Vector,missing method.
SIGNIFICANT USER-VISIBLE CHANGES
o Move Annotated, DataTable, Vector, Hits, Rle, List, SimpleList, and
DataFrame classes to new S4Vectors package.
o Move isConstant(), classNameForDisplay(), and low-level argument
checking helpers isSingleNumber(), isSingleString(), etc. to new
S4Vectors package.
o Rename Grouping class -> ManyToOneGrouping. Redefine Grouping class as
the parent of all groupings (it formalizes the most general kind of
grouping).
o Change splitAsList() to a generic.
o In rbind,DataFrame method, no longer coerce the combined column to the
class of the column in the first argument.
o Do not carry over row.names attribute from data.frame to DataFrame.
o No longer make names valid in [[<-,DataFrame method.
o Make the set operations dispatch on Ranges instead of IRanges; they
usually return an IRanges, but the input could be any implementation.
o Add '...' to splitAsList() generic.
o Speed up trim() on a Views object when trimming is actually not needed
(no-op).
o Speed up validation of IRanges objects by 2x.
o Speed up "flank" method for Ranges objects by 4x.
DEPRECATED AND DEFUNCT
o Defunct map() and pmap().
o reduce() argument 'with.mapping' is now defunct.
o splitAsListReturnedClass() is now defunct.
o Deprecate seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby().
BUG FIXES
o Fix rbind,DataFrame method when first column is a matrix.
o Fix a memory leak in the interval tree code.
o Fix handling of minoverlap > 1 in findOverlaps(), so that it behaves
more consistently and respects 'maxgap', as documented.
o Fix findOverlaps,IRanges method for select="last".
o Fix subset,Vector-method to handle objects with NULL mcols(x) (e.g.
Rle object).
o Fix internal helper rbind.mcols() for DataFrame (and potentially other
tables).
o ranges,SimpleRleList method now returns a SimpleRangesList (instead of
CompressedRangesList).
o Make flank() work on Ranges object of length 0.
CHANGES IN VERSION 1.20.0
-------------------------
NEW FEATURES
o Add IntervalForest class from Hector Corrada Bravo.
o Add a FilterMatrix class, for holding the results of multiple filters.
o Add selfmatch() as a faster equivalent of 'match(x, x)'.
o Add "c" method for Views objects (only combine objects with same
subject).
o Add coercion from SimpleRangesList to SimpleIRangesList.
o Add an `%outside%` that is the opposite of `%over%`.
o Add validation of length() and names() of Vector objects.
o Add "duplicated" and "table" methods for Vector objects.
o Add some split methods that dispatch to splitAsList() even when only
'f' is a Vector.
o Add set methods (setdiff, intersect, union) for Rle.
o Add anyNA methods for Rle and Vector.
o Add support for subset(), with(), etc on Vector objects,
where the expressions are evaluated in the scope of the
mcols and fixed columns. For symbols that should resolve
in the calling frame, it is supported and encouraged to escape
them with bquote-style ".(x)".
o Add "tile" generic and methods for partitioning a ranges object
into tiles; useful for iterating over subregions.
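
Two of the entries above can be illustrated with base R alone. The sketch
below shows what 'match(x, x)' computes (the result selfmatch() is described
as reproducing faster) and how bquote-style ".(x)" escaping substitutes a
value from the calling frame. It calls no IRanges functions: the data.frame
'tbl' is only a stand-in for the mcols of a Vector, and the way subset() on
Vector objects handles the escaping internally may differ.

```r
# Base R illustration only; nothing here comes from IRanges or S4Vectors.

# 1. match(x, x) gives, for each element, the position of its first
#    occurrence in 'x'.
x <- c("b", "a", "b", "c", "a")
first_pos <- match(x, x)
# An element is a duplicate exactly when its first occurrence comes earlier.
is_dup <- first_pos != seq_along(x)
stopifnot(identical(is_dup, duplicated(x)))

# 2. bquote-style escaping: .(cutoff) is replaced by the value of 'cutoff'
#    taken from the calling frame, and the resulting expression is then
#    evaluated in the scope of the columns.
tbl <- data.frame(score = c(1, 5, 9))   # stand-in for mcols(x)
cutoff <- 4
expr <- bquote(score > .(cutoff))       # 'cutoff' is substituted here
keep <- eval(expr, tbl)                 # 'score' resolves to the column
selected <- tbl[keep, , drop = FALSE]
```

Escaping matters when a column and a variable in the calling frame share a
name: the escaped symbol is resolved before the columns are consulted.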
SIGNIFICANT USER-VISIBLE CHANGES
o All functionalities related to XVector objects have been moved to the
new XVector package.
o Refine how isDisjoint() handles empty ranges.
o Remove 'keepLength' argument from "window<-" methods.
o unlist( , use.names=FALSE) on a CompressedSplitDataFrameList object
now preserves the rownames of the list elements, which is more
consistent with what unlist() does on other CompressedList objects.
o Splitting a list by a Vector just yields a list, not a List.
o The rbind,DataFrame method now handles the case where Rle and vector
columns need to be combined (assuming an equivalence between Rle and
vector). Also the way the result DataFrame is constructed was changed
(avoids undesirable coercions and should be faster).
o as.data.frame.DataFrame now passes 'stringsAsFactors=FALSE' and
'check.names=!optional' to the underlying data.frame() call.
as(x,"DataFrame") sets 'optional=TRUE' when delegating. Most places
where we called as.data.frame(), we now call 'as(x,"data.frame")'.
o The [<-,DataFrame method now coerces column sub-replacement value to
class of column when the column already exists.
o DataFrame() now automatically derives rownames (from the first argument
that has some). This is a fairly significant change in behavior, but it
probably better matches how users expect it to behave.
o Make sure that SimpleList objects are coerced to a DataFrame with a
single column. The automatic coercion methods created by the methods
package were trying to create a DataFrame with one column per element,
because DataFrame extends SimpleList.
o Change default to 'compress=TRUE' for RleList() constructor.
o tapply() now handles the case where only INDEX is a Vector (e.g.
an Rle object).
o Speed up coverage() in the "tiling case" (i.e. when 'x' is a tiling
of the [1, width] interval). This makes it much faster to turn a coverage
loaded from a BigWig, WIG or BED file as a GRanges object into an Rle.
o Allow logical Rle return values from filter rules.
o FilterRules no longer requires its elements to be named.
o The select,Vector method now returns a DataFrame even when a single
column is selected.
o Move is.unsorted() generic to BiocGenerics.