
What's the easiest way to read camera animation from a glTF file? #82

Open

Igrium opened this issue Dec 4, 2022 · 5 comments

Igrium commented Dec 4, 2022

I'm using JglTF to write an importer for camera animation in a piece of software, and I need to read the animation data of a camera from a glTF file. However, all the buffers and buffer views that are part of the API are confusing me. What would be the simplest approach for retrieving this data?

javagl (Owner) commented Dec 5, 2022

The API exposes the data that is contained in the glTF asset. Processing or interpreting that data is largely left to the client. That applies to rendering the data, as well as to highly specific tasks like the one that you described.

You are right: The concepts of buffer <- bufferView <- accessor are inconvenient for certain high-level tasks. When you have an accessor and know that it contains, for example, mesh indices, then you don't want to fiddle around with the bufferView and buffer (and possibly sparse accessors), properly interpret the "component types" and handle the byte strides and whatnot. You just want to access the indices, as they are. The classes in the model package offer some convenience for this. For example, when the goal is to just read some accessor data, then you can use the AccessorData interface with its specializations.
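For example, reading the indices of a mesh primitive could roughly look like the sketch below. (Note that this is only a sketch: the file name is a placeholder, and it assumes that the indices use the common unsigned short component type - for other component types, the data would be an AccessorByteData or AccessorIntData.)

package de.javagl.jgltf.issues;

import java.nio.file.Paths;

import de.javagl.jgltf.model.AccessorData;
import de.javagl.jgltf.model.AccessorModel;
import de.javagl.jgltf.model.AccessorShortData;
import de.javagl.jgltf.model.GltfModel;
import de.javagl.jgltf.model.MeshPrimitiveModel;
import de.javagl.jgltf.model.io.GltfModelReader;

public class ReadIndicesSketch
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder file name
        GltfModel gltfModel =
            new GltfModelReader().read(Paths.get("example.gltf"));

        // The indices of the first primitive of the first mesh
        MeshPrimitiveModel meshPrimitiveModel = gltfModel
            .getMeshModels().get(0).getMeshPrimitiveModels().get(0);
        AccessorModel indicesModel = meshPrimitiveModel.getIndices();

        // The AccessorData hides the buffer/bufferView/byteStride details
        AccessorData accessorData = indicesModel.getAccessorData();
        if (accessorData instanceof AccessorShortData)
        {
            AccessorShortData shortData = (AccessorShortData) accessorData;
            for (int e = 0; e < shortData.getNumElements(); e++)
            {
                // Indices are SCALAR, so each element has one component;
                // getInt takes care of the "unsigned short" interpretation
                System.out.println("index " + e + " is "
                    + shortData.getInt(e, 0));
            }
        }
    }
}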

But there is another layer of complexity when reading this data - namely, when there are animations involved. A "naive" requirement could be: "I want to read the data, as it is, at a certain state of the animation". But this immediately raises a whole lot of questions: What should be the 'state' (i.e. the animation time)? Which animations should be 'running'? Which part of the data is the one that you actually want to access? (Pure node transforms, or also things like morphed or skinned vertex coordinates?)

As such, this question has some similarities to #80 . Two differences are

  • Your case is "simpler", because you don't want to access the primitive data
  • Your case is "more complex", because you don't want to access plain accessor data, but the result of animating a camera that is attached to a node whose (global) transform may be affected by (multiple?) animations that are encoded in accessors.

I am (roughly) aware of the possible requirements here. As mentioned above, they could be described as the requirement to "access an intermediate 'state' of an animated scene". But due to the subtle, technical details, there is not (yet) a full-fledged, convenient, public API for that.

A basic approach could look as follows - but NOTE that this uses some PRELIMINARY API (and even calls a method that is currently still private, via reflection):

package de.javagl.jgltf.issues;

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import de.javagl.jgltf.model.AnimationModel;
import de.javagl.jgltf.model.CameraModel;
import de.javagl.jgltf.model.GltfAnimations;
import de.javagl.jgltf.model.GltfModel;
import de.javagl.jgltf.model.NodeModel;
import de.javagl.jgltf.model.animation.Animation;
import de.javagl.jgltf.model.animation.AnimationManager;
import de.javagl.jgltf.model.animation.AnimationManager.AnimationPolicy;
import de.javagl.jgltf.model.io.GltfModelReader;

public class JgltfIssue82_ReadCameraAnimationData
{

    public static void main(String[] args) throws IOException
    {
        String path = "AnimatedCameras.gltf";
        GltfModelReader r = new GltfModelReader();
        GltfModel gltfModel = r.read(Paths.get(path));

        AnimationManager animationManager = createAnimationManager(gltfModel);
        List<CameraInstance> cameraInstances =
            computeCameraInstances(gltfModel);
        printAnimatedCameraTransforms(animationManager, cameraInstances);
    }

    // An "instance" of a camera - namely, a camera and the node 
    // that the camera is attached to
    static class CameraInstance
    {
        NodeModel nodeModel;
        CameraModel cameraModel;
    }

    // Compute all camera instances in the given glTF model
    private static List<CameraInstance>
        computeCameraInstances(GltfModel gltfModel)
    {
        List<CameraInstance> animatedCameras = new ArrayList<CameraInstance>();
        List<NodeModel> nodeModels = gltfModel.getNodeModels();
        for (int i = 0; i < nodeModels.size(); i++)
        {
            NodeModel nodeModel = nodeModels.get(i);
            CameraModel cameraModel = nodeModel.getCameraModel();
            if (cameraModel != null)
            {
                CameraInstance animatedCamera = new CameraInstance();
                animatedCamera.nodeModel = nodeModel;
                animatedCamera.cameraModel = cameraModel;
                animatedCameras.add(animatedCamera);
            }
        }
        return animatedCameras;
    }

    // Create an AnimationManager for ALL animations in the given model.
    // NOTE: This part of the API is HIGHLY preliminary, and may change
    // arbitrarily in future releases!!!
    private static AnimationManager createAnimationManager(GltfModel gltfModel)
    {
        List<AnimationModel> animationModels = gltfModel.getAnimationModels();
        List<Animation> modelAnimations =
            GltfAnimations.createModelAnimations(animationModels);
        AnimationManager animationManager =
            GltfAnimations.createAnimationManager(AnimationPolicy.ONCE);
        animationManager.addAnimations(modelAnimations);
        return animationManager;
    }

    private static void printAnimatedCameraTransforms(
        AnimationManager animationManager, List<CameraInstance> cameraInstances)
    {
        long totalMs = 0;
        
        // Print the state of the given camera instances at certain
        // times of the animation 
        for (int s = 0; s < 10; s++)
        {
            // Perform an animation step
            long msStep = 10;
            long nsStep = msStep * 1000 * 1000;
            callPerformStep(animationManager, nsStep);
            totalMs += msStep;

            // Print the state of the camera instance at the resulting time:
            for (int i = 0; i < cameraInstances.size(); i++)
            {
                CameraInstance cameraInstance = cameraInstances.get(i);
                NodeModel nodeModel = cameraInstance.nodeModel;

                float[] matrix = new float[16];
                nodeModel.computeGlobalTransform(matrix);

                System.out.println("Global transform of camera " + i + " after "
                    + totalMs + "ms is " + Arrays.toString(matrix));
            }
        }
    }

    // HACK! Call AnimationManager#performStep, which currently is 'private'
    private static void callPerformStep(
        AnimationManager animationManager, long ns)
    {
        try
        {
            Method method = AnimationManager.class
                .getDeclaredMethod("performStep", long.class);
            method.setAccessible(true);
            method.invoke(animationManager, ns);
        }
        catch (NoSuchMethodException | SecurityException
            | IllegalAccessException | IllegalArgumentException
            | InvocationTargetException e)
        {
            e.printStackTrace();
        }
    }
}

The following is a glTF asset that contains two cameras that are attached to nodes that are affected by an animation:

{
  "scene": 0,
  "scenes": [
    {
      "nodes": [0, 1]
    }
  ],
  "nodes": [
    {
      "rotation": [-0.383, 0.0, 0.0, 0.92375],
      "mesh": 0
    },
    {
      "children": [2, 3]
    },
    {
      "translation": [0.5, 0.5, 3.0],
      "camera": 0
    },
    {
      "translation": [0.5, 0.5, 3.0],
      "camera": 1
    }
  ],
  "animations": [
    {
      "channels": [
        {
          "sampler": 0,
          "target": {
            "node": 1,
            "path": "rotation"
          }
        }
      ],
      "samplers": [
        {
          "input": 2,
          "interpolation": "LINEAR",
          "output": 3
        }
      ]
    }
  ],

  "cameras": [
    {
      "type": "perspective",
      "perspective": {
        "aspectRatio": 1.0,
        "yfov": 0.7,
        "zfar": 100,
        "znear": 0.01
      }
    },
    {
      "type": "orthographic",
      "orthographic": {
        "xmag": 1.0,
        "ymag": 1.0,
        "zfar": 100,
        "znear": 0.01
      }
    }
  ],

  "meshes": [
    {
      "primitives": [
        {
          "attributes": {
            "POSITION": 1
          },
          "indices": 0,
          "material": 0
        }
      ]
    }
  ],

  "buffers": [
    {
      "uri": "data:application/octet-stream;base64,AAABAAIAAQADAAIAAAAAAAAAAAAAAAAAAACAPwAAAAAAAAAAAAAAAAAAgD8AAAAAAACAPwAAgD8AAAAA",
      "byteLength": 60
    },
    {
      "uri": "data:application/gltf-buffer;base64,AAAAAAAAgD8AAABAAABAQAAAgEAAAAAAAAAAAAAAAAAAAIA/AAAAAPT9ND8AAAAA9P00PwAAAAAAAIA/AAAAAAAAAAAAAAAA9P00PwAAAAD0/TS/AAAAAAAAAAAAAAAAAACAPw==",
      "byteLength": 100
    }
  ],
  "bufferViews": [
    {
      "buffer": 0,
      "byteOffset": 0,
      "byteLength": 12,
      "target": 34963
    },
    {
      "buffer": 0,
      "byteOffset": 12,
      "byteLength": 48,
      "target": 34962
    },
    {
      "name": "TIME bufferView",
      "buffer": 1,
      "byteOffset": 0,
      "byteLength": 20
    },
    {
      "name": "rotation bufferView",
      "buffer": 1,
      "byteOffset": 20,
      "byteLength": 80
    }
  ],
  "accessors": [
    {
      "bufferView": 0,
      "byteOffset": 0,
      "componentType": 5123,
      "count": 6,
      "type": "SCALAR",
      "max": [3],
      "min": [0]
    },
    {
      "bufferView": 1,
      "byteOffset": 0,
      "componentType": 5126,
      "count": 4,
      "type": "VEC3",
      "max": [1.0, 1.0, 0.0],
      "min": [0.0, 0.0, 0.0]
    },
    {
      "bufferView": 2,
      "byteOffset": 0,
      "componentType": 5126,
      "count": 5,
      "type": "SCALAR",
      "max": [4.0],
      "min": [0.0]
    },
    {
      "bufferView": 3,
      "byteOffset": 0,
      "componentType": 5126,
      "count": 5,
      "type": "VEC4",
      "max": [0.0, 1.0, 0.0, 1.0],
      "min": [0.0, 0.0, 0.0, -0.707]
    }
  ],

  "materials": [
    {
      "doubleSided": true,
      "pbrMetallicRoughness": {
        "baseColorFactor": [0, 0, 1, 1],
        "metallicFactor": 0,
        "roughnessFactor": 1
      }
    }
  ],

  "asset": {
    "version": "2.0"
  }
}

Running the program on this asset will print something like this:

Global transform of camera 0 after 10ms is [0.9998767, 0.0, -0.015705595, 0.0, 0.0, 1.0, 0.0, 0.0, 0.015705595, 0.0, 0.9998767, 0.0, 0.5470551, 0.5, 2.9917772, 1.0]
Global transform of camera 1 after 10ms is [0.9998767, 0.0, -0.015705595, 0.0, 0.0, 1.0, 0.0, 0.0, 0.015705595, 0.0, 0.9998767, 0.0, 0.5470551, 0.5, 2.9917772, 1.0]
Global transform of camera 0 after 20ms is [0.99950665, 0.0, -0.031407323, 0.0, 0.0, 1.0, 0.0, 0.0, 0.031407323, 0.0, 0.99950665, 0.0, 0.5939753, 0.5, 2.9828162, 1.0]
Global transform of camera 1 after 20ms is [0.99950665, 0.0, -0.031407323, 0.0, 0.0, 1.0, 0.0, 0.0, 0.031407323, 0.0, 0.99950665, 0.0, 0.5939753, 0.5, 2.9828162, 1.0]
....

That is, it prints the global transform of the nodes that the cameras are attached to. This global transform describes the position and rotation (i.e. the orientation) of each camera instance.

There are many assumptions hidden in that approach. But it might be sufficient for extracting the "animation data of a camera", as a first shot.

I can try to allocate some time for improving the 'animation model' API part, to allow easier and more generic access to 'intermediate (animated) states of the scene'. But I cannot make any promises about the timeline here. JglTF is a one-man, spare-time project, FWIW...

Igrium (Author) commented Dec 5, 2022

Thanks! This will probably work (I'm coding in a volatile environment, so I'm not too concerned with stability for now). In a previous issue, I had mentioned the animation manager and was trying to figure out how to retrieve an animation without relying on system time. It's good to know that I'm not an idiot and this is an actual limitation with the software.

Is it possible to get the underlying frame rate of the animation, btw? Because the system I'm importing into does its own keyframe-based interpolation, it would be better to maintain precision by importing keyframes at the rate used in the original file.

javagl (Owner) commented Dec 5, 2022

I had mentioned the animation manager and was trying to figure out how to retrieve an animation without relying on system time.

Yes, some questions or aspects here are also related to the issue around #71 (comment) . I mentioned that there is no "simple, official" way to select a certain animation time. In fact, some aspects of animation playback in glTF are far trickier than they look at first glance...

Is it possible to get the underlying frame rate of the animation btw?

That's one of these aspects 😁 : There is no "frame rate". glTF does not really dictate many aspects of the runtime behavior. It is, roughly speaking, a "representation of a truth" - plain data, and clients can do with this whatever they want. Specifically: The animation maps an "input time" to "output values". The input time is given in seconds. When the animation has a duration of 1 second, and you have a fast PC that renders with 60FPS, then you'll see 60 intermediate steps. When you have a slow PC that only renders with 10FPS, then you'll only see 10 intermediate steps.

In the drafted snippet, this can be seen at

        for (int s = 0; s < 10; s++)
        {
            // Perform an animation step
            long msStep = 10;
            long nsStep = msStep * 1000 * 1000;
            callPerformStep(animationManager, nsStep);
...

It just shows the state at simulation times of 10ms, 20ms, 30ms... You could reduce the step size to msStep = 1 to have many more frames, or msStep = 100 to have fewer frames (and ... you'll have to figure out how long the animation takes in the first place...)
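For the latter, a rough sketch using the same preliminary model API: the duration is the maximum key frame time over the 'input' accessors of all animation channels, which the glTF specification defines to be given in seconds. (The file name here is a placeholder, and the cast assumes that the input is float data, which the specification requires.)

package de.javagl.jgltf.issues;

import java.nio.file.Paths;

import de.javagl.jgltf.model.AccessorFloatData;
import de.javagl.jgltf.model.AnimationModel;
import de.javagl.jgltf.model.AnimationModel.Channel;
import de.javagl.jgltf.model.GltfModel;
import de.javagl.jgltf.model.io.GltfModelReader;

public class ComputeDurationSketch
{
    public static void main(String[] args) throws Exception
    {
        GltfModel gltfModel = new GltfModelReader()
            .read(Paths.get("AnimatedCameras.gltf"));
        float duration = 0.0f;
        for (AnimationModel animationModel : gltfModel.getAnimationModels())
        {
            for (Channel channel : animationModel.getChannels())
            {
                // The sampler input contains the key frame times, in seconds
                AccessorFloatData input = (AccessorFloatData)
                    channel.getSampler().getInput().getAccessorData();
                int n = input.getNumElements();
                // The input times are ascending, so the last is the maximum
                duration = Math.max(duration, input.get(n - 1, 0));
            }
        }
        System.out.println("Total duration: " + duration + "s");
    }
}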

This, in turn, raises many questions about the target application that you are reading the data for.

(EDIT: Looking at the other thread again: Is this still Minecraft...?)

It will almost certainly support something like "key frames" as well (and probably different interpolation types, like LINEAR or SPLINE). So depending on your exact goal, there may be ways to extract the required data in a far more compact form. For example, when you have a simple, plain animation in glTF that just moves the camera from (0,0,0) to (0,0,1) in 1 second, then you could export this with 1000 steps (using the program above). This is clumsy, inefficient, generates lots of data, and is hard to handle. If you know your target application, and dive deeper into the data, then it might be possible to export this for your target application in a compact form like interpolate((0,0,0), (0,0,1), LINEAR).
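To illustrate the direction that this could take: the sketch below reads the raw key frames (input times and output values) of one channel, using the same preliminary model API. The file name is a placeholder, and the cast to AccessorFloatData assumes float output, which holds for 'translation' and 'rotation' channels (for CUBICSPLINE interpolation, the output would additionally contain the in/out tangents).

package de.javagl.jgltf.issues;

import java.nio.file.Paths;

import de.javagl.jgltf.model.AccessorFloatData;
import de.javagl.jgltf.model.AnimationModel;
import de.javagl.jgltf.model.AnimationModel.Channel;
import de.javagl.jgltf.model.AnimationModel.Sampler;
import de.javagl.jgltf.model.GltfModel;
import de.javagl.jgltf.model.io.GltfModelReader;

public class ReadKeyFramesSketch
{
    public static void main(String[] args) throws Exception
    {
        GltfModel gltfModel = new GltfModelReader()
            .read(Paths.get("AnimatedCameras.gltf"));

        // The first channel of the first animation
        AnimationModel animationModel = gltfModel.getAnimationModels().get(0);
        Channel channel = animationModel.getChannels().get(0);
        Sampler sampler = channel.getSampler();

        System.out.println("Channel for path '" + channel.getPath()
            + "' with interpolation " + sampler.getInterpolation());

        // The raw key frame times (in seconds) and the output values
        // (e.g. one VEC4 quaternion per key frame, for a "rotation" path)
        AccessorFloatData times = (AccessorFloatData)
            sampler.getInput().getAccessorData();
        AccessorFloatData values = (AccessorFloatData)
            sampler.getOutput().getAccessorData();
        int c = values.getNumComponentsPerElement();
        for (int k = 0; k < times.getNumElements(); k++)
        {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < c; j++)
            {
                sb.append(values.get(k, j));
                if (j < c - 1) sb.append(", ");
            }
            System.out.println("Key frame at " + times.get(k, 0)
                + "s: (" + sb + ")");
        }
    }
}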

(Again: There are many assumptions and guesses involved here...)

Igrium (Author) commented Dec 5, 2022

There is no "frame rate". glTF does not really dictate many aspects of the runtime behavior.

Interesting. I didn't know this!

Yeah, it's still Minecraft, but a slightly different use case. I'm trying to mod the Replay Mod to be able to import camera animations from external programs, rather than restricting the user to its built-in editor.

Igrium (Author) commented Dec 6, 2022

Also, I'm really impressed with your work in this repo. When I first found it, I thought it was maintained by a whole group of people.
