Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content iterator #82

Merged
merged 33 commits into from
May 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
9016389
content iterator WIP
chocolatkey Mar 27, 2023
e16f89f
add more manifest utilities
chocolatkey Mar 28, 2023
e02ade9
add CSS selector implementation
chocolatkey Jun 9, 2023
a7a711a
HTML content iterator MVP
chocolatkey Jun 13, 2023
c64ff2e
fix content
chocolatkey Jun 13, 2023
5b51db4
fix typo
chocolatkey Jun 21, 2023
f6d7b1f
bugfix
chocolatkey Sep 19, 2023
f74288b
improvements to htmlconverter
chocolatkey Sep 19, 2023
314eb33
update pdfcpu api
chocolatkey Sep 26, 2023
15b98dc
minimizeReads for stream zip entry reading
chocolatkey Sep 27, 2023
f277ad9
bump up to go 1.21
chocolatkey Sep 27, 2023
c652c01
fix zipped range reading bug
chocolatkey Sep 27, 2023
29a4d17
add safety fix
chocolatkey Sep 27, 2023
a8f16d9
support range reads with minimizeReads
chocolatkey Sep 27, 2023
54f6297
bump action to go 1.21
chocolatkey Sep 27, 2023
3c1221a
fix bytesresource
chocolatkey Sep 27, 2023
e0e64ef
fix bug in JSON marshalling
chocolatkey Oct 6, 2023
221ff3b
fix font deobfuscating bug
chocolatkey Oct 6, 2023
9a1dbc5
deduplicate contributors in WebPub JSON
chocolatkey Mar 13, 2024
a95526f
use atoms instead of strings for inlineTags
chocolatkey Mar 13, 2024
31bb79a
fix bug in parsing of displayoptions XML
chocolatkey Mar 13, 2024
4c3038f
fix mediatype test failing
chocolatkey Mar 21, 2024
f3c807d
Fix concurrent R/W access to link properties (#87)
chocolatkey May 18, 2024
ba558c9
Merge branch 'main' into content-iterator
chocolatkey May 18, 2024
51f1f66
add mimetype sniffing for audio
chocolatkey May 22, 2024
ef1cdc2
Make contentImplementation private
chocolatkey May 22, 2024
f0963ce
move CSSSelector utility to internal pkg
chocolatkey May 22, 2024
22cc345
Merge branch 'content-iterator' of github.com:readium/go-toolkit into…
chocolatkey May 22, 2024
e928aaa
fix missing refactor instances
chocolatkey May 22, 2024
58327b0
Use AttributesHolder for Attributes function of Element interface
chocolatkey May 26, 2024
83f6dd7
Fix properties access for ArchiveEntryLength
chocolatkey May 27, 2024
d0df31d
fix further bugs in resource properties
chocolatkey May 27, 2024
5139988
Adjust attributes interface
chocolatkey May 27, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v2
with:
go-version: 1.18
go-version: 1.21

- name: Build
run: go build -v ./...
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ jobs:
- run: git fetch --force --tags
- uses: actions/setup-go@v3
with:
go-version: '>=1.20.0'
go-version: '>=1.21.0'
cache: true
- uses: goreleaser/goreleaser-action@v4
with:
Expand Down
17 changes: 9 additions & 8 deletions go.mod
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
module github.com/readium/go-toolkit

go 1.18
go 1.21

require (
github.com/agext/regexp v1.3.0
github.com/andybalholm/cascadia v1.3.2
github.com/deckarep/golang-set v1.7.1
github.com/gorilla/mux v1.7.4
github.com/opds-community/libopds2-go v0.0.0-20170628075933-9c163cf60f6e
github.com/pdfcpu/pdfcpu v0.3.13
github.com/pdfcpu/pdfcpu v0.5.0
github.com/pkg/errors v0.9.1
github.com/readium/xmlquery v0.0.0-20230106230237-8f493145aef4
github.com/relvacode/iso8601 v1.1.0
Expand All @@ -18,8 +19,8 @@ require (
github.com/stretchr/testify v1.7.0
github.com/trimmer-io/go-xmp v1.0.0
github.com/urfave/negroni v1.0.0
golang.org/x/net v0.7.0
golang.org/x/text v0.7.0
golang.org/x/net v0.10.0
golang.org/x/text v0.12.0
)

require (
Expand All @@ -29,8 +30,8 @@ require (
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/gopherjs/gopherjs v0.0.0-20190910122728-9d188e94fb99 // indirect
github.com/hashicorp/hcl v1.0.0 // indirect
github.com/hhrutter/lzw v0.0.0-20190829144645-6f07a24e8650 // indirect
github.com/hhrutter/tiff v0.0.0-20190829141212-736cae8d0bc7 // indirect
github.com/hhrutter/lzw v1.0.0 // indirect
github.com/hhrutter/tiff v1.0.1 // indirect
github.com/inconshreveable/mousetrap v1.0.1 // indirect
github.com/magiconair/properties v1.8.5 // indirect
github.com/mitchellh/mapstructure v1.4.1 // indirect
Expand All @@ -40,8 +41,8 @@ require (
github.com/spf13/cast v1.3.1 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/subosito/gotenv v1.2.0 // indirect
golang.org/x/image v0.5.0 // indirect
golang.org/x/sys v0.5.0 // indirect
golang.org/x/image v0.11.0 // indirect
golang.org/x/sys v0.8.0 // indirect
gopkg.in/ini.v1 v1.62.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
Expand Down
39 changes: 25 additions & 14 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,8 @@ github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/agext/regexp v1.3.0 h1:6+9tp+S41TU48gFNV47bX+pp1q7WahGofw6JccmsCDs=
github.com/agext/regexp v1.3.0/go.mod h1:6phv1gViOJXWcTfpxOi9VMS+MaSAo+SUDf7do3ur1HA=
github.com/andybalholm/cascadia v1.3.2 h1:3Xi6Dw5lHF15JtdcmAHD3i1+T8plmv7BQ/nsViSLyss=
github.com/andybalholm/cascadia v1.3.2/go.mod h1:7gtRlve5FxPPgIgX36uWBX58OdBsSS6lUvCFb+h7KvU=
github.com/antchfx/xpath v1.2.1 h1:qhp4EW6aCOVr5XIkT+l6LJ9ck/JsUH/yyauNgTQkBF8=
github.com/antchfx/xpath v1.2.1/go.mod h1:i54GszH55fYfBmoZXapTHN8T8tkcHfRgLyVwwqzXNcs=
github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY=
Expand Down Expand Up @@ -171,11 +173,10 @@ github.com/hashicorp/logutils v1.0.0/go.mod h1:QIAnNjmIWmVIIkWDTG1z5v++HQmx9WQRO
github.com/hashicorp/mdns v1.0.0/go.mod h1:tL+uN++7HEJ6SQLQ2/p+z2pH24WQKWjBPkE0mNTz8vQ=
github.com/hashicorp/memberlist v0.1.3/go.mod h1:ajVTdAv/9Im8oMAAj5G31PhhMCZJV2pPBoIllUwCN7I=
github.com/hashicorp/serf v0.8.2/go.mod h1:6hOLApaqBFA1NXqRQAsxw9QxuDEvNxSQRwA/JwenrHc=
github.com/hhrutter/lzw v0.0.0-20190827003112-58b82c5a41cc/go.mod h1:yJBvOcu1wLQ9q9XZmfiPfur+3dQJuIhYQsMGLYcItZk=
github.com/hhrutter/lzw v0.0.0-20190829144645-6f07a24e8650 h1:1yY/RQWNSBjJe2GDCIYoLmpWVidrooriUr4QS/zaATQ=
github.com/hhrutter/lzw v0.0.0-20190829144645-6f07a24e8650/go.mod h1:yJBvOcu1wLQ9q9XZmfiPfur+3dQJuIhYQsMGLYcItZk=
github.com/hhrutter/tiff v0.0.0-20190829141212-736cae8d0bc7 h1:o1wMw7uTNyA58IlEdDpxIrtFHTgnvYzA8sCQz8luv94=
github.com/hhrutter/tiff v0.0.0-20190829141212-736cae8d0bc7/go.mod h1:WkUxfS2JUu3qPo6tRld7ISb8HiC0gVSU91kooBMDVok=
github.com/hhrutter/lzw v1.0.0 h1:laL89Llp86W3rRs83LvKbwYRx6INE8gDn0XNb1oXtm0=
github.com/hhrutter/lzw v1.0.0/go.mod h1:2HC6DJSn/n6iAZfgM3Pg+cP1KxeWc3ezG8bBqW5+WEo=
github.com/hhrutter/tiff v1.0.1 h1:MIus8caHU5U6823gx7C6jrfoEvfSTGtEFRiM8/LOzC0=
github.com/hhrutter/tiff v1.0.1/go.mod h1:zU/dNgDm0cMIa8y8YwcYBeuEEveI4B0owqHyiPpJPHc=
github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/inconshreveable/mousetrap v1.0.1 h1:U3uMjPSQEBMNp1lFxmllqCPM6P5u/Xq7Pgzkat/bFNc=
Expand Down Expand Up @@ -213,8 +214,8 @@ github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3Rllmb
github.com/opds-community/libopds2-go v0.0.0-20170628075933-9c163cf60f6e h1:kjurmIVxVypqhb5CUAG9jLhYL1TLsUE47KfoEm7cdlE=
github.com/opds-community/libopds2-go v0.0.0-20170628075933-9c163cf60f6e/go.mod h1:U/OpXIq9O6FgLfzvun31PZt8iIlbG93BieaxjOEIAd0=
github.com/pascaldekloe/goe v0.0.0-20180627143212-57f6aae5913c/go.mod h1:lzWF7FIEvWOWxwDKqyGYQf6ZUaNfKdP144TG7ZOy1lc=
github.com/pdfcpu/pdfcpu v0.3.13 h1:VFon2Yo1PJt+sA57vPAeXWGLSZ7Ux3Jl4h02M0+s3dg=
github.com/pdfcpu/pdfcpu v0.3.13/go.mod h1:UJc5xsXg0fpmjp1zOPdyYcAQArc/Zf3V0nv5URe+9fg=
github.com/pdfcpu/pdfcpu v0.5.0 h1:F3wC4bwPbaJM+RPgm1D0Q4SAUwxElw7BhwNvL3iPgDo=
github.com/pdfcpu/pdfcpu v0.5.0/go.mod h1:UPcHdWcMw1V6Bo5tcWHd3jZfkG8cwUwrJkQOlB6o+7g=
github.com/pelletier/go-toml v1.9.3 h1:zeC5b1GviRUyKYd6OJPvBU/mcVDVoL1OhT17FCt5dSQ=
github.com/pelletier/go-toml v1.9.3/go.mod h1:u1nR/EPcESfeI/szUZKdtJ0xRNbUoANCkoOuaOx1Y+c=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
Expand Down Expand Up @@ -305,9 +306,8 @@ golang.org/x/exp v0.0.0-20200207192155-f17229e696bd/go.mod h1:J/WKrq2StrnmMY6+EH
golang.org/x/exp v0.0.0-20200224162631-6cc2880d07d6/go.mod h1:3jZMyOhIsHpP37uCMkUooju7aAi5cS1Q23tOzKc+0MU=
golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js=
golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
golang.org/x/image v0.0.0-20190823064033-3a9bac650e44/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
golang.org/x/image v0.5.0 h1:5JMiNunQeQw++mMOz48/ISeNu3Iweh/JaZU8ZLqHRrI=
golang.org/x/image v0.5.0/go.mod h1:FVC7BI/5Ym8R25iw5OLsgshdUBbT1h5jZTpA+mvAdZ4=
golang.org/x/image v0.11.0 h1:ds2RoQvBvYTiJkwpSFDwCcDFNX7DqjL2WsUgTNk0Ooo=
golang.org/x/image v0.11.0/go.mod h1:bglhjqbqVuEb9e9+eNR45Jfu7D+T4Qan+NhQk8Ck2P8=
golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU=
golang.org/x/lint v0.0.0-20190301231843-5614ed5bae6f/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
Expand All @@ -332,6 +332,7 @@ golang.org/x/mod v0.4.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.1/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.2/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181023162649-9b4f9f5ad519/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
Expand Down Expand Up @@ -370,8 +371,10 @@ golang.org/x/net v0.0.0-20210316092652-d523dce5a7f4/go.mod h1:RBQZq4jEuRlivfhVLd
golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
golang.org/x/net v0.0.0-20220127200216-cd36cc0744dd/go.mod h1:CfG3xpIq0wQ8r1q4Su4UZFWDARRcnwPjda9FqA0JpMk=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.9.0/go.mod h1:d48xBJpPfHeWQsugry2m+kC02ZBRGRgulfHnEXEuWns=
golang.org/x/net v0.10.0 h1:X2//UzNDwYmtCLn7To6G58Wr6f5ahEAQgKNzv9Y951M=
golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
Expand All @@ -396,6 +399,7 @@ golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180823144017-11551d06cbcc/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181026203630-95b1ffbd15a5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
Expand Down Expand Up @@ -442,10 +446,14 @@ golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20211216021012-1d35b9e2eb4e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0 h1:MUK/U/4lj1t1oPg0HfuXDN/Z1wv31ZJ/YcPiGccS4DU=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY=
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
Expand All @@ -454,8 +462,10 @@ golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.4/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.5/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.12.0 h1:k+n5B8goJNdU7hSvEtMUz3d1Q6D/XW4COJSJR6fN0mc=
golang.org/x/text v0.12.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
Expand Down Expand Up @@ -511,6 +521,7 @@ golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4f
golang.org/x/tools v0.1.0/go.mod h1:xkSsbof2nBLbhDlRMhhhyNLN/zl3eTqcnHD5viDpcZ0=
golang.org/x/tools v0.1.2/go.mod h1:o0xws9oXOQQZyjljx8fwUC0k7L1pTE6eaCbjGeHmOkk=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
Expand Down
68 changes: 45 additions & 23 deletions pkg/archive/archive_zip.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,20 @@ func (e gozipArchiveEntry) CompressedLength() uint64 {
return e.file.CompressedSize64
}

// This is a special mode to minimize the number of reads from the underlying reader.
// It's especially useful when trying to stream the ZIP from a remote file, e.g.
// cloud storage. It's only enabled when trying to read the entire file and compression
// is enabled. Care needs to be taken to cover every edge case.
func (e gozipArchiveEntry) couldMinimizeReads() bool {
return e.minimizeReads && e.CompressedLength() > 0
}

func (e gozipArchiveEntry) Read(start int64, end int64) ([]byte, error) {
if end < start {
return nil, errors.New("range not satisfiable")
}

minimizeReads := false
if e.CompressedLength() > 0 && e.minimizeReads && start == 0 && end == 0 {
// This is a special mode to minimize the number of reads from the underlying reader.
// It's especially useful when trying to stream the ZIP from a remote file, e.g.
// cloud storage. It's only enabled when trying to read the entire file and compression
// is enabled. Maybe at some point it should be enabled for range reads as well, but
// that's something that depends on the usecase, size of the file etc. so it's off for now.
minimizeReads = true
}
minimizeReads := e.couldMinimizeReads()

var f io.Reader
var err error
Expand All @@ -70,13 +70,10 @@ func (e gozipArchiveEntry) Read(start int64, end int64) ([]byte, error) {
}
frdr := flate.NewReader(bytes.NewReader(compressedData))
defer frdr.Close()
data := make([]byte, e.file.UncompressedSize64)
_, err = io.ReadFull(frdr, data)
if err != nil {
return nil, err
}
return data, nil
} else if start == 0 && end == 0 {
f = frdr
}

if start == 0 && end == 0 {
data := make([]byte, e.file.UncompressedSize64)
_, err := io.ReadFull(f, data)
if err != nil {
Expand All @@ -90,23 +87,48 @@ func (e gozipArchiveEntry) Read(start int64, end int64) ([]byte, error) {
return nil, err
}
}
data := make([]byte, end-start+1)
n, err := f.Read(data)
data := make([]byte, min(end-start+1, int64(e.file.UncompressedSize64)))
_, err = io.ReadFull(f, data)
if err != nil {
return nil, err
}
return data[:n], nil
return data, nil
}

func (e gozipArchiveEntry) Stream(w io.Writer, start int64, end int64) (int64, error) {
if end < start {
return -1, errors.New("range not satisfiable")
}
f, err := e.file.Open()
if err != nil {
return -1, err

minimizeReads := e.couldMinimizeReads() && start == 0 && end == 0

var f io.Reader
var err error
if minimizeReads {
f, err = e.file.OpenRaw()
if err != nil {
return -1, err
}
} else {
rc, err := e.file.Open()
if err != nil {
return -1, err
}
defer rc.Close()
f = rc
}

if minimizeReads {
compressedData := make([]byte, e.file.CompressedSize64)
_, err := io.ReadFull(f, compressedData)
if err != nil {
return -1, err
}
frdr := flate.NewReader(bytes.NewReader(compressedData))
defer frdr.Close()
f = frdr
}
defer f.Close()

if start == 0 && end == 0 {
return io.Copy(w, f)
}
Expand Down
53 changes: 53 additions & 0 deletions pkg/content/content.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
package content

import (
"strings"

"github.com/readium/go-toolkit/pkg/content/element"
"github.com/readium/go-toolkit/pkg/content/iterator"
)

type Content interface {
Text(separator *string) (string, error) // Extracts the full raw text, or returns null if no text content can be found.
Iterator() iterator.Iterator // Creates a new iterator for this content.
Elements() ([]element.Element, error) // Returns all the elements as a list.
Comment on lines +11 to +13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Kotlin toolkit, text() and elements() are in the interface with a default implementation, but they should probably be extensions instead. The only thing that should be required to implement is Iterator.

Maybe we can remove them from the interface and keep only ContentText() and ContentElements() on the side?

Alternatively if we want to keep the API easy to use, Content could be a struct and have a IteratorFactory interface instead.

}

// Extracts the full raw text, or returns null if no text content can be found.
func ContentText(content Content, separator *string) (string, error) {
sep := "\n"
if separator != nil {
sep = *separator
}
var sb strings.Builder
els, err := content.Elements()
if err != nil {
return "", err
}
for _, el := range els {
if txel, ok := el.(element.TextualElement); ok {
txt := txel.Text()
if txt != "" {
sb.WriteString(txel.Text())
sb.WriteString(sep)
}
}
}
return strings.TrimSuffix(sb.String(), sep), nil
}

func ContentElements(content Content) ([]element.Element, error) {
var elements []element.Element
it := content.Iterator()
for {
hasNext, err := it.HasNext()
if err != nil {
return nil, err
}
if !hasNext {
break
}
elements = append(elements, it.Next())
}
return elements, nil
}
Loading
Loading