About implementing functions from tf and tfKAN interleaved #8

Open
luongKhaiChuong opened this issue Jul 4, 2024 · 2 comments

@luongKhaiChuong

Thank you for the library. I am curious whether both libraries can be used in the same model (say, several FC layers followed by a DenseKAN output layer). If not, would you mind suggesting an alternative?
Thanks

@ZPZhou-lab
Owner

In my opinion, an FC layer with an activation function (tf.keras.layers.Dense()) and a KAN layer (tfkan.layers.DenseKAN()) increase the model's non-linear expressive power in two different ways. The former relies mainly on matrix multiplication, while the latter moves the computation into spline functions, so training only has to adjust the spline coefficients.

Therefore, choosing between them is like choosing different bases to represent the target you want to fit. For example, take a sufficiently smooth function f. If you use the polynomial space as the basis (i.e. 1, x, x^2, x^3, ...), you get the classical Taylor expansion of f (here around 0):

  • f(x) = \sum_{n=0}^{\infty} (n!)^{-1} f^{(n)}(0) * x^n

and if you use the trigonometric function space as the basis (i.e. 1, sin(x), cos(x), sin(2x), cos(2x), ...), you get the Fourier expansion of f:

  • f(x) = a_0 + \sum_{n=1}^{\infty} (a_n * cos(nx) + b_n * sin(nx))
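To make the "different bases, same target" idea concrete, here is a minimal NumPy sketch. It is not a Taylor or Fourier expansion in the strict sense: it fits the coefficients by least squares on a grid, once with 7 polynomial basis functions and once with 7 trigonometric ones. The target exp(sin(x)), the interval and the basis sizes are arbitrary choices made for illustration.

```python
import numpy as np

def rmse_of_least_squares_fit(basis, target):
    """Fit target ~ basis @ coeffs by least squares and return the RMS error."""
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return float(np.sqrt(np.mean((basis @ coeffs - target) ** 2)))

x = np.linspace(-np.pi, np.pi, 400)
target = np.exp(np.sin(x))  # illustrative target, chosen only for this sketch

# Polynomial basis 1, x, ..., x^6 (x is rescaled to [-1, 1] for conditioning;
# the spanned function space is the same).
poly_basis = np.vander(x / np.pi, 7, increasing=True)

# Trigonometric basis 1, cos(nx), sin(nx) for n = 1..3: also 7 functions.
trig_basis = np.column_stack(
    [np.ones_like(x)]
    + [np.cos(n * x) for n in (1, 2, 3)]
    + [np.sin(n * x) for n in (1, 2, 3)]
)

poly_rmse = rmse_of_least_squares_fit(poly_basis, target)
trig_rmse = rmse_of_least_squares_fit(trig_basis, target)
print(f"polynomial basis RMSE:    {poly_rmse:.4f}")
print(f"trigonometric basis RMSE: {trig_rmse:.4f}")
```

Both bases approximate the same function; which one needs fewer terms depends on the target. Since this target is periodic, one would expect the trigonometric basis to do better at equal size, and a non-periodic target could reverse that.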

The classical MLP layer Dense() and the KAN layer DenseKAN() are like two different bases for fitting the functional relationships in the data. KAN shifts the model's complexity into the computation of splines in order to make its bases smoother and more flexible. This is why, in the KAN authors' introduction and experiments, KAN can need fewer model parameters to achieve good performance.
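A rough way to see where the parameters go is to count them. The sketch below assumes the usual accounting: a Dense layer has one weight per edge plus a bias per output, while a KAN layer has about grid_size + spline_order spline coefficients per edge. It ignores any extra scale or bias terms that tfkan may add, and the layer widths are hypothetical, so the totals say nothing about accuracy.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def kan_spline_params(n_in, n_out, grid_size, spline_order):
    """Spline coefficients only: (grid_size + spline_order) per edge."""
    return n_in * n_out * (grid_size + spline_order)

def total(widths, per_layer):
    """Sum a per-layer count over consecutive (n_in, n_out) width pairs."""
    return sum(per_layer(a, b) for a, b in zip(widths, widths[1:]))

# At equal width, a KAN layer is the more expensive one.
print(dense_params(64, 64))             # 4160
print(kan_spline_params(64, 64, 5, 3))  # 32768

# Hypothetical networks: a wider MLP against a narrow, shallow KAN.
mlp_total = total([4, 64, 64, 1], dense_params)
kan_total = total([4, 5, 1], lambda a, b: kan_spline_params(a, b, 5, 3))
print(mlp_total, kan_total)             # 4545 200
```

So any parameter saving has to come from the KAN being much narrower and shallower, not from the individual layer being cheaper.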

Here are some ideas for building a model that mixes tf and tfkan:

  • If your task does not require a large model and you want a network whose connection structure is easy to interpret (which is also the main focus of the KAN authors), then building a small, shallow DenseKAN network is a suitable choice (in the authors' introduction, such networks mostly have at most 3 layers)
  • If your task requires a large model to achieve good approximation ability, then in my observation DenseKAN() has no clear performance advantage over FC Dense(). Deep MLPs with residual connections already have excellent non-linear expressive power. In addition, note that the training and inference efficiency of DenseKAN() has not yet been optimized for large-scale neural networks
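As for the layout from the original question (FC layers with a KAN output layer), here is a NumPy-only sketch of the forward pass. It is not tfkan's implementation. The KAN-style layer uses piecewise-linear hat functions instead of the B-splines described above, and all names (dense, hat_basis, kan_like, hybrid_forward) and sizes are made up for illustration. In a real model the corresponding pieces would presumably be tf.keras.layers.Dense and tfkan.layers.DenseKAN stacked in one Keras model, which is an assumption here and is not tested.

```python
import numpy as np

def dense(x, W, b, act):
    """Classical FC layer: matrix multiplication plus a fixed activation."""
    return act(x @ W + b)

def hat_basis(x, knots):
    """Piecewise-linear hat functions on a uniform grid.

    x has shape (batch, n_in); the result has shape (batch, n_in, n_knots).
    """
    h = knots[1] - knots[0]
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - knots) / h)

def kan_like(x, coef, knots):
    """KAN-style layer: a learnable 1-D function on every (input, output) edge.

    coef has shape (n_in, n_knots, n_out); output j is sum_i phi_ij(x_i).
    """
    return np.einsum("bik,iko->bo", hat_basis(x, knots), coef)

rng = np.random.default_rng(0)
knots = np.linspace(-1.0, 1.0, 9)

# Hypothetical sizes: 4 inputs -> 16 -> 16 -> 3 outputs, random weights.
W1, b1 = rng.normal(size=(4, 16)) * 0.5, np.zeros(16)
W2, b2 = rng.normal(size=(16, 16)) * 0.25, np.zeros(16)
coef = rng.normal(size=(16, knots.size, 3)) * 0.1

def hybrid_forward(x):
    h = dense(x, W1, b1, lambda z: np.maximum(z, 0.0))  # FC + ReLU
    # tanh keeps features inside [-1, 1], where the hat functions are non-zero.
    h = dense(h, W2, b2, np.tanh)
    return kan_like(h, coef, knots)                     # KAN-style output layer

x = rng.normal(size=(8, 4))
y = hybrid_forward(x)
print(y.shape)  # (8, 3)
```

The only contract between the two kinds of layer is the tensor shape (batch, features). The one extra concern in this sketch is keeping the features inside the range covered by the knots, which is why the last FC layer uses tanh.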

Hope this can help u~ 🤗

@luongKhaiChuong closed this as not planned on Jul 5, 2024
@luongKhaiChuong
Author

I am so sorry, I was confused about the use of the buttons. Thanks for answering my question.
