Hi, awesome work! I have a question about the distillation algorithm for TCD (Algorithms 1 and 2 in the paper, particularly Eq. 21). In the original LCM paper (to my understanding), the skipping step $k$ denotes the size of the single-step ODE solve used by the teacher model to solve from $t_{n+k}$ to $t_n$ (e.g. solving from 950 to 930 with a single step of size $k=20$). However, the paragraph before Eq. 21 notes that $\Phi^{k}$ denotes $k$ "discretization steps" of a one-step ODE solver. So my question is: do you use multiple calls of the ODE solver (with the teacher model) to solve to some timestep between $t_{n+k}$ and $t_m$ (e.g. solving the integral with 2 single-step ODE solves, and thus two calls to the teacher model), or do you still use a single ODE solver call across that interval (similar to LCM)? If you use multiple calls, how many? Thank you!
Hi, sorry for the late reply. Here we use a single solver call for $\Delta k$. We tried multiple settings for $k$ and found that $k=20$ or $k=50$ works better.
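To make the "single call" reading concrete, here is a minimal sketch of one skip of size $k$ that costs exactly one teacher evaluation. It assumes a deterministic DDIM update with an epsilon-predicting teacher and a linear beta schedule; the thread does not say which solver or schedule is used, and the names `ddim_skip_step`, `make_alphas_cumprod`, and `CountingTeacher` are hypothetical, not taken from the TCD code.

```python
import numpy as np

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear beta schedule; the real training schedule may differ.
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def ddim_skip_step(teacher, x_t, t, k, alphas_cumprod):
    """Solve from timestep t to t - k with ONE deterministic DDIM update."""
    s = t - k
    a_t, a_s = alphas_cumprod[t], alphas_cumprod[s]
    eps = teacher(x_t, t)  # the only teacher call for the whole interval
    x0_pred = (x_t - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
    x_s = np.sqrt(a_s) * x0_pred + np.sqrt(1.0 - a_s) * eps
    return x_s, s

class CountingTeacher:
    """Hypothetical stand-in for the teacher; predicts zero noise and counts calls."""
    def __init__(self):
        self.calls = 0

    def __call__(self, x, t):
        self.calls += 1
        return np.zeros_like(x)

teacher = CountingTeacher()
alphas = make_alphas_cumprod()
x_t = np.ones((2, 4))
# The example from the question: 950 -> 930 with a single step of size k = 20.
x_s, s = ddim_skip_step(teacher, x_t, t=950, k=20, alphas_cumprod=alphas)
print(s, teacher.calls)
```

Under this reading, $k$ sets how far one solver step jumps along the timestep grid, not how many solver steps are taken, so the teacher cost per training sample stays at one forward pass regardless of $k$.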