Commit

Improve content parsing
This makes content parsing a bit more generic, to set us up for more content types. Still needs some work though, not 100% ready.
LucasPickering committed Feb 17, 2024
1 parent e122869 commit 79aa0eb
Showing 12 changed files with 321 additions and 83 deletions.
2 changes: 2 additions & 0 deletions docs/src/SUMMARY.md
@@ -11,6 +11,7 @@
- [Command Line Interface (CLI)](./user_guide/cli.md)
- [Templates](./user_guide/templates.md)
- [Collection Reuse & Inheritance](./user_guide/inheritance.md)
- [Data Filtering & Querying](./user_guide/filter_query.md)

# API Reference

@@ -22,3 +23,4 @@
- [Chain](./api/chain.md)
- [Chain Source](./api/chain_source.md)
- [Template](./api/template.md)
- [Content Type](./api/content_type.md)
11 changes: 6 additions & 5 deletions docs/src/api/chain.md
@@ -6,11 +6,12 @@ To use a chain in a template, reference it as `{{chains.<id>}}`.

## Fields

| Field | Type | Description | Default |
| ----------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------- | -------- |
| `source` | [`ChainSource`](./chain_source.md) | Source of the chained value | Required |
| `sensitive` | `boolean` | Should the value be hidden in the UI? | `false` |
| `selector` | [`JSONPath`](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html) | Selector to narrow down results in a chained JSON value | `null` |
| Field | Type | Description | Default |
| -------------- | -------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------- |
| `source` | [`ChainSource`](./chain_source.md) | Source of the chained value | Required |
| `sensitive` | `boolean` | Should the value be hidden in the UI? | `false` |
| `selector` | [`JSONPath`](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html) | Selector to transform/narrow down results in a chained value. See [Filtering & Querying](../user_guide/filter_query.md) | `null` |
| `content_type` | [`ContentType`](./content_type.md) | Force content type. Not required for `request` and `file` chains, as long as the `Content-Type` header/file extension matches the data | |

See the [`ChainSource`](./chain_source.md) docs for more detail.

11 changes: 11 additions & 0 deletions docs/src/api/content_type.md
@@ -0,0 +1,11 @@
# Content Type

Content type defines the various data formats that Slumber recognizes and can manipulate. Slumber is capable of displaying any text-based data format, but only specific formats support additional features such as [querying](../user_guide/filter_query.md) and formatting.

For chained requests, Slumber uses the [HTTP `Content-Type` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type) to detect the content type. For chained files, it uses the file extension. For other [chain sources](./chain_source.md), or if the `Content-Type` header/file extension is missing or incorrect, you'll have to manually provide the content type via the [chain](./chain.md) `content_type` field.

## Supported Content Types

| Content Type | HTTP Header | File Extension(s) |
| ------------ | ------------------ | ----------------- |
| JSON | `application/json` | `json` |
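
The detection rules above can be sketched in plain Rust. This is a simplified, standard-library-only illustration and not the code from this commit: the `from_name`, `from_header_value`, `from_path`, and `resolve` names are hypothetical, the whitespace trimming is an added assumption, and so is the rule that an explicit `content_type` takes precedence over detection.

```rust
use std::path::Path;

/// Toy stand-in for the content type enum (JSON only, as in the table above).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ContentType {
    Json,
}

impl ContentType {
    /// Match either the HTTP header value or the bare file extension.
    fn from_name(name: &str) -> Option<Self> {
        match name {
            "application/json" | "json" => Some(Self::Json),
            _ => None,
        }
    }

    /// Detect from a `Content-Type` header value, ignoring parameters such
    /// as `; charset=utf-8`. Trimming whitespace is an assumption made here.
    fn from_header_value(value: &str) -> Option<Self> {
        let mime = value.split_once(';').map(|(mime, _)| mime).unwrap_or(value);
        Self::from_name(mime.trim())
    }

    /// Detect from a file path's extension.
    fn from_path(path: &Path) -> Option<Self> {
        path.extension()
            .and_then(|ext| ext.to_str())
            .and_then(Self::from_name)
    }
}

/// Assumed precedence: a manually provided content type wins over detection.
fn resolve(
    forced: Option<ContentType>,
    detected: Option<ContentType>,
) -> Option<ContentType> {
    forced.or(detected)
}
```

When both detection and the manual override yield nothing, `resolve` returns `None`, which corresponds to the case where Slumber can still display the data but cannot query or format it.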
66 changes: 66 additions & 0 deletions docs/src/user_guide/filter_query.md
@@ -0,0 +1,66 @@
# Data Filtering & Querying

Slumber supports querying data structures to transform or reduce response data.

There are two main use cases for querying:

- In [chained template values](../api/chain.md), to extract data
  - Provided via chain's `selector` argument
- In the TUI response body browser, to limit the response data shown

**Regardless of data format, querying is done via [JSONPath](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html).** For non-JSON formats, the data will be converted to JSON, queried, and converted back. This keeps querying simple and uniform across data types.

## Querying Chained Values

Here are some examples of using queries to extract data from a chained value. Let's say you have two chained value sources. The first is a JSON file called `creds.json`, with the following contents:

```json
{ "user": "fishman", "pw": "hunter2" }
```

We'll use these credentials to log in and get an API token, so the second data source is the login response, which looks like so:

```json
{ "token": "abcdef123" }
```

```yaml
chains:
  username:
    source: !file ./creds.json
    selector: $.user
  password:
    source: !file ./creds.json
    selector: $.pw
  auth_token:
    source: !request login
    selector: $.token

# Use YAML anchors for de-duplication
base: &base
  headers:
    Accept: application/json
    Content-Type: application/json

requests:
  login:
    <<: *base
    method: POST
    url: "https://myfishes.fish/anything/login"
    body: |
      {
        "username": "{{chains.username}}",
        "password": "{{chains.password}}"
      }
  get_user:
    <<: *base
    method: GET
    url: "https://myfishes.fish/anything/current-user"
    query:
      auth: "{{chains.auth_token}}"
```

While this example simply extracts inner fields, JSONPath can be used for much more powerful transformations. See the [JSONPath docs](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html) for more examples.
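
To make the behavior of a selector like `$.user` concrete, here is a minimal sketch of child-field selection using only the standard library. It is a toy and not how Slumber evaluates JSONPath: the `Value` enum and `select` function are hypothetical, and only `$` followed by `.name` segments is handled (no wildcards, filters, or slices).

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value, just enough for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Object(BTreeMap<String, Value>),
}

/// Resolve a path made only of `$` and `.name` child segments, e.g. `$.user`.
/// Returns `None` for a missing field or any unsupported syntax.
fn select<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    let rest = path.strip_prefix('$')?;
    if rest.is_empty() {
        // A bare `$` selects the whole document
        return Some(root);
    }
    let rest = rest.strip_prefix('.')?;

    let mut current = root;
    for key in rest.split('.') {
        if key.is_empty() {
            // Empty segments (e.g. `$..a`) are not supported in this sketch
            return None;
        }
        match current {
            Value::Object(fields) => current = fields.get(key)?,
            _ => return None,
        }
    }
    Some(current)
}
```

Applied to the contents of `creds.json` above, `select(&creds, "$.user")` would yield the string `fishman` and `select(&creds, "$.pw")` would yield `hunter2`, which are the values the `username` and `password` chains resolve to.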
<!-- TODO add screenshot of in-TUI querying -->
12 changes: 10 additions & 2 deletions src/collection/models.rs
@@ -1,6 +1,6 @@
//! The plain data types that make up a request collection
use crate::{collection::cereal, template::Template};
use crate::{collection::cereal, http::ContentType, template::Template};
use derive_more::{Deref, Display, From};
use equivalent::Equivalent;
use indexmap::IndexMap;
@@ -125,8 +125,16 @@ pub struct Chain {
/// Mask chained value in the UI
#[serde(default)]
pub sensitive: bool,
/// JSONpath to extract a value from the response. For JSON data only.
/// Selector to extract a value from the response. This uses JSONPath
/// regardless of the content type. Non-JSON values will be converted to
/// JSON, then converted back. See [ResponseContent::select].
pub selector: Option<JsonPath>,
/// Hard-code the content type of the response. Only needed if a selector
/// is given and the content type can't be dynamically determined
/// correctly. This is needed if the chain source is not an HTTP
/// response (e.g. a file) **or** if the response's `Content-Type` header
/// is incorrect.
pub content_type: Option<ContentType>,
}

/// Unique ID for a chain. Takes a generic param so we can create these during
1 change: 1 addition & 0 deletions src/factory.rs
@@ -63,6 +63,7 @@ factori!(Chain, {
source = ChainSource::Request(RecipeId::default()),
sensitive = false,
selector = None,
content_type = None,
}
});

161 changes: 118 additions & 43 deletions src/http/parse.rs
@@ -1,16 +1,30 @@
//! Utilities for parsing response bodies into a variety of known content types.
//! Each supported content type has its own struct which implements
//! [ContentType]. If you want to parse as a statically known content type, just
//! use that struct. If you're want to parse dynamically based on the response's
//! metadata, use [parse_body].
//! [ResponseContent]. If you want to parse as a statically known content type,
//! just use that struct. If you just need to refer to the content _type_, and
//! not a value, use [ContentType]. If you want to parse dynamically based on
//! the response's metadata, use [ContentType::parse_response].
use crate::http::Response;
use anyhow::{anyhow, Context};
use derive_more::Deref;
use std::fmt::Debug;
use derive_more::{Deref, Display};
use serde::{de::IntoDeserializer, Deserialize, Serialize};
use std::{borrow::Cow, ffi::OsStr, fmt::Debug, path::Path, str::FromStr};

/// All supported content types. Each variant should have a corresponding
/// implementation of [ResponseContent].
#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
#[cfg_attr(test, derive(PartialEq))]
pub enum ContentType {
    // Primary serialization string here should match the HTTP Content-Type
    // header. Others are for file extensions.
    #[serde(rename = "application/json", alias = "json")]
    Json,
}

/// A response content type that we know how to parse.
pub trait ContentType: Debug {
/// A response content type that we know how to parse. This is defined as a
/// trait rather than an enum because it breaks apart the logic more clearly.
pub trait ResponseContent: Debug + Display {
/// Parse the response body as this type
fn parse(body: &str) -> anyhow::Result<Self>
where
@@ -21,19 +35,19 @@ pub trait ContentType: Debug {
/// though!
fn prettify(&self) -> String;

/// Convert the content to JSON. JSON is the common language used for
/// querying internally, so everything needs to be convertible to/from JSON.
fn to_json(&self) -> Cow<'_, serde_json::Value>;

/// Facilitate downcasting generic parsed bodies to concrete types for tests
#[cfg(test)]
fn as_any(&self) -> &dyn std::any::Any;
}

#[derive(Debug, Deref, PartialEq)]
#[derive(Debug, Display, Deref, PartialEq)]
pub struct Json(serde_json::Value);

impl Json {
pub const HEADER: &'static str = "application/json";
}

impl ContentType for Json {
impl ResponseContent for Json {
fn parse(body: &str) -> anyhow::Result<Self> {
Ok(Self(serde_json::from_str(body)?))
}
@@ -43,41 +57,84 @@ impl ContentType for Json {
serde_json::to_string_pretty(&self.0).unwrap()
}

fn to_json(&self) -> Cow<'_, serde_json::Value> {
Cow::Borrowed(&self.0)
}

#[cfg(test)]
fn as_any(&self) -> &dyn std::any::Any {
self as &dyn std::any::Any
}
}

/// Helper for parsing the body of a response. Use [Response::parse_body] for
/// external usage.
pub(super) fn parse_body(
response: &Response,
) -> anyhow::Result<Box<dyn ContentType>> {
let body = &response.body;
match get_content_type(response)? {
Json::HEADER => Ok(Box::new(Json::parse(body.text())?)),
other => Err(anyhow!("Response has unknown content-type `{other}`",)),
impl ContentType {
    /// Parse some content of this type. Return a dynamically dispatched content
    /// object.
    pub fn parse(
        self,
        content: &str,
    ) -> anyhow::Result<Box<dyn ResponseContent>> {
        match self {
            Self::Json => Ok(Box::new(Json::parse(content)?)),
        }
    }

    /// Parse content from JSON into this format. Valid JSON should be valid
    /// in any other format too, so this is infallible.
    pub fn parse_json(
        self,
        content: &serde_json::Value,
    ) -> Box<dyn ResponseContent> {
        match self {
            Self::Json => Box::new(Json(content.clone())),
        }
    }

    /// Helper for parsing the body of a response. Use [Response::parse_body]
    /// for external usage.
    pub(super) fn parse_response(
        response: &Response,
    ) -> anyhow::Result<Box<dyn ResponseContent>> {
        Self::from_header(response)?.parse(response.body.text())
    }

    /// Parse the content type from a file's extension
    pub fn from_extension(path: &Path) -> anyhow::Result<Self> {
        path.extension()
            .and_then(OsStr::to_str)
            .ok_or_else(|| anyhow!("Path {path:?} has no extension"))?
            .parse()
    }

    /// Parse the content type from a response's `Content-Type` header
    pub fn from_header(response: &Response) -> anyhow::Result<Self> {
        // If the header value isn't utf-8, we're hosed
        let header_value =
            std::str::from_utf8(response.content_type().ok_or_else(|| {
                anyhow!("Response has no content-type header")
            })?)
            .context("content-type header is not valid utf-8")?;

        // Remove extra metadata from the header. It feels like there should be
        // a helper for this in hyper or reqwest but I couldn't find it.
        // https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type
        let content_type = header_value
            .split_once(';')
            .map(|t| t.0)
            .unwrap_or(header_value);

        content_type.parse()
    }
}

/// Parse the content type from a response's headers
fn get_content_type(response: &Response) -> anyhow::Result<&str> {
// If the header value isn't utf-8, we're hosed
let header_value = std::str::from_utf8(
response
.content_type()
.ok_or_else(|| anyhow!("Response has no content-type header"))?,
)
.context("content-type header is not valid utf-8")?;

// Remove extra metadata from the header. It feels like there should be a
// helper for this in hyper or reqwest but I couldn't find it.
// https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type
Ok(header_value
.split_once(';')
.map(|t| t.0)
.unwrap_or(header_value))
impl FromStr for ContentType {
    type Err = anyhow::Error;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Lean on serde for parsing
        ContentType::deserialize(s.into_deserializer())
            .map_err(serde::de::value::Error::into)
    }
}

#[cfg(test)]
@@ -92,6 +149,24 @@ mod tests {
use serde_json::json;
use std::ops::Deref;

#[test]
fn test_from_extension() {
assert_eq!(
ContentType::from_extension(Path::new("turbo.json")).unwrap(),
ContentType::Json
);

// Errors
assert_err!(
ContentType::from_extension(Path::new("no_extension")),
"no extension"
);
assert_err!(
ContentType::from_extension(Path::new("turbo.ohno")),
"unknown variant `ohno`"
)
}

/// Test all content types
#[rstest]
#[case(
@@ -105,7 +180,7 @@ mod tests {
"{\"hello\": \"goodbye\"}",
Json(json!({"hello": "goodbye"}))
)]
fn test_parse_body<T: ContentType + PartialEq + 'static>(
fn test_parse_body<T: ResponseContent + PartialEq + 'static>(
#[case] content_type: &str,
#[case] body: String,
#[case] expected: T,
@@ -114,7 +189,7 @@ mod tests {
Response, headers: headers(content_type), body: body.into()
);
assert_eq!(
parse_body(&response)
ContentType::parse_response(&response)
.unwrap()
.deref()
// Downcast the result to desired type
@@ -128,7 +203,7 @@ mod tests {
/// Test various failure cases
#[rstest]
#[case(None::<&str>, "", "no content-type header")]
#[case(Some("bad-header"), "", "unknown content-type")]
#[case(Some("bad-header"), "", "unknown variant `bad-header`")]
#[case(Some(b"\xc3\x28".as_slice()), "", "not valid utf-8")]
#[case(Some("application/json"), "not json!", "expected ident")]
fn test_parse_body_error<
@@ -143,7 +218,7 @@ mod tests {
None => HeaderMap::new(),
};
let response = create!(Response, headers: headers, body: body.into());
assert_err!(parse_body(&response), expected_error);
assert_err!(ContentType::parse_response(&response), expected_error);
}

/// Create header map with the given value for the content-type header
6 changes: 3 additions & 3 deletions src/http/record.rs
@@ -2,7 +2,7 @@
use crate::{
collection::{ProfileId, RecipeId},
http::{parse, ContentType},
http::{ContentType, ResponseContent},
util::ResultExt,
};
use anyhow::Context;
@@ -124,8 +124,8 @@ pub struct Response {

impl Response {
/// Parse the body of this response, based on its `content-type` header
pub fn parse_body(&self) -> anyhow::Result<Box<dyn ContentType>> {
parse::parse_body(self)
pub fn parse_body(&self) -> anyhow::Result<Box<dyn ResponseContent>> {
ContentType::parse_response(self)
.context("Error parsing response body")
.traced()
}