Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reading pascal string #42

Merged
merged 6 commits into from
Apr 2, 2023
Merged

Conversation

gyk
Copy link
Contributor

@gyk gyk commented Mar 19, 2023

I encountered a PSD file that caused psd crate to panic. The problems are:

This PR attempts to resolve above issues.

Update: Also fixes panicking when the Layer and mask information section is empty (Krita can produce such PSDs).

@chinedufn
Copy link
Owner

Thanks!

We'll need a test case that has a minimal PSD file where

  • Before this commit, we could not read back some string
  • As of this commit, we're able to read back the string

The test can live somewhere in here https://github.com/chinedufn/psd/tree/master/tests

After that we can get this merged.

@gyk
Copy link
Contributor Author

gyk commented Mar 25, 2023

I've added a 1x1 PSD that

  • Contains a non-UTF-8 Pascal string of 10 bytes in an image resource block, which is directly copied from a real world PSD file.
  • Has zero-length layer and mask information section.

Copy link
Owner

@chinedufn chinedufn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for adding a tiny test PSD file. Looks great.


The String::from_utf8_lossy changes look good. (thank you!)


My only concern is that some of the parsing changes aren't tested.

I left questions on the changes that don't appear to be tested.

Thanks for working to put this together!

Comment on lines +161 to +163
if cursor.position() + bytes_to_read as u64 > len {
break;
}
Copy link
Owner

@chinedufn chinedufn Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing without this change.
Is this check solving for a real PSD that you came across?
If so can we add a test fixture?
If not, what led you to want to check here to make sure that we aren't reading passed the length? Can we remove this check, or do we need it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a real PSD file (and its owner has explicitly stated that it is confidential). It can be opened in ImageMagick and Krita, but FFmpeg also complains the file is malformed.
From its XMP metadata I can tell it had been edited by
Photoshop CC 2017 and then Photoshop 23.0 on Windows.

When the code hits this IF branch, cursor.position() and len are both 5964830, and bytes_to_read is 1. The 0 is at the end of the blue channel. I really have no idea why there is an extra byte.

Copy link
Owner

@chinedufn chinedufn Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for explaining.

I think we can unit test this then.

Can add something like this to the bottom of this file:

#[cfg(test)]
mod tests {
    use crate::PsdLayer;
    use super::*;

    /// Verify that when when inserting an RLE channel's bytes into an RGBA byte vec we do not
    /// attempt to read beyond the channel's length.
    #[test]
    fn does_not_read_beyond_rle_channels_bytes() {
        let layer = PsdLayer {
            channels: todo!("Set up some failing data"),
            layer_properties: todo!("Set up some failing data"),
        };
        
        let rgba = todo!("");
        let channel_bytes = todo!("");
        
        layer.insert_channel_bytes(&mut rgba, PsdChannelKind::Red, &channel_bytes);

        assert_eq!(rgba, vec![...]);
    }
}

Then just craft a PsdLayer and rgba and channel_bytes that would fail without your fix.

Can make PsdLayer.channels pub(crate) so that it is visible from here.


Can also add a second test for the other fix below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both tests have been added, please take another look.

Comment on lines +172 to +174
if cursor.position() + 1 > len {
break;
}
Copy link
Owner

@chinedufn chinedufn Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing without this change.
Is this check solving for a real PSD that you came across?
If so can we add a test fixture?
If not, what led you to want to check here to make sure that we aren't reading passed the length? Can we remove this check, or do we need it?

let result = String::from_utf8(data.to_vec()).unwrap();
let result = String::from_utf8_lossy(data).into_owned();

if len % 2 == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests still pass without this len % 2 == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I forgot the problem actually occurred in another image resource block, and the original data contains a string of odd length (13) so the size field following it was incorrectly decoded (0x0005AC00 instead of 0x000005AC):

38 42 49 4D 0B B8 0D 4F 72 69 67 69 6E 44 61 74  8BIM.¸.OriginDat
61 49 52 42 00 00 05 AC 00 00 00 10 00 00 00 01  aIRB...¬........
...

How about removing one byte of space (0x20) from the non-UTF-8 string in the test fixture?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to add a second fixture that is dedicated to this odd length pascal string case?

This way we don't stuff too many different cases into a single fixture.

If you can easily do that then that would be preferred.

If not, we can add a test to the bottom of this file that handles this case (feel free to do both if you want).

Thanks!

cursor.read_4();
let len = cursor.read_u32();

if len == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only change where tests start failing if we comment it out.

Copy link
Owner

@chinedufn chinedufn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for explaining the changes.

Everything that you explained makes sense. I left an idea for a way that we can get this tested.
Once we have some coverage we can merge this and I can deploy a new version.

Comment on lines +161 to +163
if cursor.position() + bytes_to_read as u64 > len {
break;
}
Copy link
Owner

@chinedufn chinedufn Mar 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for explaining.

I think we can unit test this then.

Can add something like this to the bottom of this file:

#[cfg(test)]
mod tests {
    use crate::PsdLayer;
    use super::*;

    /// Verify that when when inserting an RLE channel's bytes into an RGBA byte vec we do not
    /// attempt to read beyond the channel's length.
    #[test]
    fn does_not_read_beyond_rle_channels_bytes() {
        let layer = PsdLayer {
            channels: todo!("Set up some failing data"),
            layer_properties: todo!("Set up some failing data"),
        };
        
        let rgba = todo!("");
        let channel_bytes = todo!("");
        
        layer.insert_channel_bytes(&mut rgba, PsdChannelKind::Red, &channel_bytes);

        assert_eq!(rgba, vec![...]);
    }
}

Then just craft a PsdLayer and rgba and channel_bytes that would fail without your fix.

Can make PsdLayer.channels pub(crate) so that it is visible from here.


Can also add a second test for the other fix below.

let result = String::from_utf8(data.to_vec()).unwrap();
let result = String::from_utf8_lossy(data).into_owned();

if len % 2 == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to add a second fixture that is dedicated to this odd length pascal string case?

This way we don't stuff too many different cases into a single fixture.

If you can easily do that then that would be preferred.

If not, we can add a test to the bottom of this file that handles this case (feel free to do both if you want).

Thanks!

@chinedufn
Copy link
Owner

Thanks for getting this all tested!

@chinedufn chinedufn merged commit 5276ab7 into chinedufn:master Apr 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants