Java SWIG wrapper: how to use LGBM_DatasetCreateFromFile to load data from file in Java #1820

Closed
lankuohsing opened this issue Nov 5, 2018 · 3 comments

Comments

@lankuohsing

lankuohsing commented Nov 5, 2018


Environment info

Operating System:
Windows 10
CPU/GPU model:
CPU
C++/Python/R version:

Error message

D:\Projects\eclipse\lightgbm-2.2.1\rank.train
com.microsoft.ml.lightgbm.SWIGTYPE_p_void
[LightGBM] [Info] Loading query boundaries...

A fatal error has been detected by the Java Runtime Environment:

EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff838689973, pid=21144, tid=0x00000000000024a0

JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode windows-amd64 compressed oops)
Problematic frame:
C [lib_lightgbm.dll+0x9973]

Failed to write core dump. Minidumps are not enabled by default on client versions of Windows

An error report file with more information is saved as:
D:\Projects\eclipse\lightgbm-2.2.1\hs_err_pid21144.log

If you would like to submit a bug report, please visit:
http://bugreport.java.com/bugreport/crash.jsp
The crash happened outside the Java Virtual Machine in native code.
See problematic frame for where to report the bug.

Reproducible examples

I have generated a JAR file containing the LightGBM C API wrapped by SWIG. I want to load training data in Java using LGBM_DatasetCreateFromFile(String jarg1, String jarg2, long jarg3, long jarg4), defined in lightgbmlibJNI.class, but I don't know how to pass the arguments jarg3 and jarg4. I passed two long variables, and it failed as shown above.
My Java code is as follows:

package test1;

import java.util.ArrayList;

import com.microsoft.ml.lightgbm.SWIGTYPE_p_p_void;
import com.microsoft.ml.lightgbm.SWIGTYPE_p_void;
import com.microsoft.ml.lightgbm.SWIGTYPE_p_long;
import com.microsoft.ml.lightgbm.lightgbmlib;
import com.microsoft.ml.lightgbm.lightgbmlibJNI;

public class Test1 {
	
	lightgbmlib lightgbmlib1=new lightgbmlib();
	static{
		System.loadLibrary("lib_lightgbm");
		System.loadLibrary("lib_lightgbm_swig");
	}
	public native int LGBM_DatasetCreateFromFile(String jarg1, String jarg2, long jarg3, long jarg4);
	
	public static void main(String[] args) {
		Test1 test1 = new Test1();
		
//		
		
		String data_path="D:\\Projects\\eclipse\\lightgbm-2.2.1\\rank.train";
		String data_path1="D:\\Projects\\eclipse\\lightgbm-2.2.1\\out.txt";
		String config_path="D:\\Projects\\eclipse\\lightgbm-2.2.1\\train.conf";
		
		String parameter="task=train"+
				" boosting_type=gbdt"+
				" objective=lambdarank"+
				" metric=ndcg"+
				" ndcg_eval_at=1,3,5"+
				" metric_freq=1"+
				" is_training_metric=true"+
				" max_bin=255"+
				" data=rank.train"+
				" valid_data=rank.test"+
				" num_trees=100"+
				" learning_rate=0.1"+
				" num_leaves=31"+
				" tree_learner=serial"+
				" feature_fraction=1.0"+
				" bagging_freq=1"+
				" bagging_fraction=0.9"+
				" min_data_in_leaf=50"+
				" min_sum_hessian_in_leaf=5.0"+
				" is_enable_sparse=true"+
				" use_two_round_loading=false"+
				" is_save_binary_file=false"+
				" output_model=LightGBM_model.txt"+
				" num_machines=1"+
				" local_listen_port=12400"+
				" machine_list_file=mlist.txt";
				 
//		data_path=data_path+"\0";
		System.out.println(data_path);
		ArrayList<String> aa=new ArrayList<String>();
		String[] str2=new String[10];
		long pp=0;
		long p=0;
		String str1="";

		System.out.println(SWIGTYPE_p_void.class.getName());

		int a=lightgbmlibJNI.LGBM_DatasetCreateFromFile(data_path, parameter, p, pp);
		
	}
}

@StrikerRUS
Collaborator

@imatiach-msft Can you please help?

@imatiach-msft
Contributor

@lankuohsing @StrikerRUS the last parameter must be an allocated handle rather than a plain long; that is why you are getting the error.
For example, see: https://github.com/Azure/mmlspark/blob/master/src/lightgbm/src/main/scala/LightGBMUtils.scala#L312
Note that datasetOutPtr must be allocated as:
val datasetOutPtr = lightgbmlib.voidpp_handle()
https://github.com/Azure/mmlspark/blob/master/src/lightgbm/src/main/scala/LightGBMUtils.scala#L305
This is the part you are missing in the code above
BTW, we should probably have more user-friendly wrappers around the SWIG wrappers for Java users, or generate nicer SWIG bindings (e.g., see this comment).
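
For Java, the handle allocation described above might look roughly like the sketch below. It is not taken from the linked Scala code verbatim. It assumes the SWIG-generated `lightgbmlib` class exposes `voidpp_handle()` and `voidpp_value()` as in the Scala example, that the typed wrapper `lightgbmlib.LGBM_DatasetCreateFromFile` is called instead of the raw `lightgbmlibJNI` method, and that passing `null` as the reference dataset is acceptable when there is no reference dataset. The class name `LoadDatasetSketch` and the helper `requireSuccess` are hypothetical, and the library names and data path are copied from the question.

```java
import com.microsoft.ml.lightgbm.SWIGTYPE_p_p_void;
import com.microsoft.ml.lightgbm.SWIGTYPE_p_void;
import com.microsoft.ml.lightgbm.lightgbmlib;

public class LoadDatasetSketch {

    // Hypothetical helper: throws if a C API call returned a non-zero code,
    // attaching the message reported by LGBM_GetLastError().
    static void requireSuccess(int returnCode, String step) {
        if (returnCode != 0) {
            throw new IllegalStateException(
                step + " failed: " + lightgbmlib.LGBM_GetLastError());
        }
    }

    public static void main(String[] args) {
        // Library names as used in the question; adjust them for your build.
        System.loadLibrary("lib_lightgbm");
        System.loadLibrary("lib_lightgbm_swig");

        String dataPath = "D:\\Projects\\eclipse\\lightgbm-2.2.1\\rank.train";
        // Two of the parameters from the question, kept short for the sketch.
        String parameters = "max_bin=255 is_enable_sparse=true";

        // Allocate the out-parameter that will receive the dataset handle.
        // This is the step missing from the code in the question.
        SWIGTYPE_p_p_void datasetOutPtr = lightgbmlib.voidpp_handle();

        // null stands for "no reference dataset" (an assumption of this sketch).
        requireSuccess(
            lightgbmlib.LGBM_DatasetCreateFromFile(
                dataPath, parameters, null, datasetOutPtr),
            "Dataset create");

        // Dereference the out-parameter to obtain the actual dataset handle.
        SWIGTYPE_p_void dataset = lightgbmlib.voidpp_value(datasetOutPtr);
        System.out.println("Dataset handle: " + dataset);

        // Release the native dataset when it is no longer needed.
        requireSuccess(lightgbmlib.LGBM_DatasetFree(dataset), "Dataset free");
    }
}
```

Going through `lightgbmlib` rather than `lightgbmlibJNI` means the pointer arguments are typed (`SWIGTYPE_p_void`, `SWIGTYPE_p_p_void`), so an unallocated `long` such as `0` cannot be passed as the output handle by accident.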

@StrikerRUS
Collaborator

Thanks a lot @imatiach-msft for your answer with examples!

For "user-friendly wrappers around the SWIG wrappers for Java users" we have a separate issue, as you've already mentioned above. So, I'm closing this one.

@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020