Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cid-fmt Enhancments #61

Merged
merged 8 commits into from
Aug 10, 2018
Merged

cid-fmt Enhancments #61

merged 8 commits into from
Aug 10, 2018

Conversation

kevina
Copy link
Contributor

@kevina kevina commented Jul 31, 2018

Refactor and enhance cid-fmt and extract most of it from the main package so the functionality can be used as a library.

Closes #41.

@ghost ghost assigned kevina Jul 31, 2018
@ghost ghost added the status/in-progress In progress label Jul 31, 2018
@kevina kevina requested a review from Stebalien August 1, 2018 02:47
@kevina
Copy link
Contributor Author

kevina commented Aug 1, 2018

Note: It adds a new Format function that conflicts with the name currently used in #53. However I think Format is a better name for this than for the new interface in #53.

However, I am also thinking that this Format function doesn't really belong in the main cid package and should be moved to a separate package perhaps called cidutil and located in the util/ directory of go-cid (so it will be imported with import cidutil "github.com/ipfs/go-cid/util).

Thoughts?

@kevina
Copy link
Contributor Author

kevina commented Aug 1, 2018

Note #60 should go in first.

Edit: No longer necessary.

@kevina kevina added the status/blocked Unable to be worked further until needs are met label Aug 4, 2018
@kevina
Copy link
Contributor Author

kevina commented Aug 4, 2018

Blocked #64 as it will make part of this code cleaner.

Edit: Not really a requirement.

@kevina kevina removed the status/blocked Unable to be worked further until needs are met label Aug 9, 2018
@kevina
Copy link
Contributor Author

kevina commented Aug 9, 2018

With the new Format interface renames to Builder this can go in as is now. It still might be useful to extract some of this code to a new package but we can do that later.

@kevina
Copy link
Contributor Author

kevina commented Aug 10, 2018

@Stebalien I would like to get this in about the same time we get #53 in.

@@ -234,6 +234,28 @@ func Decode(v string) (*Cid, error) {
return Cast(data)
}

// Extract the encoding from a Cid. If Decode on the same string did
// not return an error neither will this function.
func ExtractEncoding(v string) (mbase.Encoding, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about providing a "decode" function that also returns the encoding (e.g., DecodeWithEncoding)? Decode could then be modified to just call that internally.

(not a strong opinion, I just feel it would be slightly more useful/efficient as one usually wants to do both at the same time)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that at first. The ExtractEncoding is useful when you just want the encoding and it is fairly efferent and should not allocate any heap memory so the cost of the extra function call should be negligible.


// Format formats a cid according to the format specificer as
// documented in the FormatRef constant
func Format(fmtStr string, base mb.Encoding, cid *Cid) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on mirroring (mostly) fmt?

func Fprint(w io.Writer, base mb.Encoding, cid *Cid)
func Fprintln(w io.Writer, base mb.Encoding, cid *Cid) (int, error)
func Fprintf(w io.Writer, format string, base mb.Encoding, cid *Cid) (int, error)
func Print(base mb.Encoding, cid *Cid) (int, error)
func Println(base mb.Encoding, cid *Cid) (int, error)
func Printf(format string, base mb.Encoding, cid *Cid) (int, error)
func Sprint(base mb.Encoding, cid *Cid) (string, error) // should this return an error?
func Sprintln(base mb.Encoding, cid *Cid) (string, error)
func Sprintf(format string, base mb.Encoding, cid *Cid) (string, error)

This may not really be worth it. We can always add that later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is an overkill. :) A better model might be the time package. I could make Format a method, but I also think something like this should not be port of the core Cid functionally and possible, at a later date, moved to an external package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... probably.

%h multihash name
%H multihash code
%L hash digest length
%m multihash encoded in base %b (with multibase prefix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be kind of nice to support %(base)mwhere the (base) part is optional. That would, e.g., allow cid-fmt '%(base16)d' "$MYCID" to get a hex hash digest.

However, we can punt on that and figure it out later (just thought I'd note it down). We'd still want to be able to pass the default base to the format function (even if it's 0x0).

format.go Outdated
// documented in the FormatRef constant
func Format(fmtStr string, base mb.Encoding, cid *Cid) (string, error) {
p := cid.Prefix()
out := new(bytes.Buffer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to allocate here. We can just put this on the stack. It even has a small internal 64byte buffer to avoid allocations when formatting small things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay fixed.

func hashToString(num uint64) string {
name, ok := mh.Codes[num]
if !ok {
return fmt.Sprintf("hash?%d", num)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, for performance, I'd rewrite these helpers to write directly to the buffer (e.g., fmtHash(b bytes.Buffer, num uint64)). However, we can always revisit this if it becomes a performance issue (it won't be for the moment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most cases a string from the look table is returned. Sprintf is only for the rare corner case when an encoding was used that is not in the table.

@Stebalien
Copy link
Member

LGTM. I have a few small suggestions but I'd be happy merging as is.

@kevina kevina merged commit a8ae38c into master Aug 10, 2018
@ghost ghost removed the status/in-progress In progress label Aug 10, 2018
@Stebalien Stebalien deleted the kevina/cid-fmt-enhan branch August 10, 2018 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants