Android-java target batching is very slow #1051

Closed
RblSb opened this issue May 21, 2019 · 4 comments

Comments

@RblSb
Contributor

RblSb commented May 21, 2019

Batching is slow because the vertex array is pushed every frame. I tried to optimize the Float32Array implementation in #1041 by replacing FloatBuffer with NativeArray<Single>, but it is still slow, so there are two options:

  • Use sun.misc.Unsafe via reflection to put floats into a FloatBuffer (the main LWJGL approach)
  • Use JNI for the same purpose (the libGDX approach)

Also, maybe the Haxe JVM target supports jvm-pasting, and we could do something on the JVM side for that.
Some articles:
https://github.com/LWJGL/lwjgl3-wiki/wiki/1.3.-Memory-FAQ
https://www.badlogicgames.com/wordpress/?p=904

Not sure whether the Unsafe class is available on all Android versions (via reflection), and it would still require extracting some code from LWJGL's MemoryStack.java and MemoryUtil.java.
JNI should be easier to implement (if it has a way to push floats into a FloatBuffer), but it requires some C/C++ compiler integration. I am also not sure about the overhead of Java native method calls.

@RobDangerous
Member

Sorry, but putting a large array on the stack makes no sense: at best it makes no difference, at worst it causes a stack overflow. Allocating from the stack is faster than allocating from the heap, but there is no speed difference between using memory on the stack and memory on the heap.

@RblSb
Contributor Author

RblSb commented May 21, 2019

Okay, so the problem is putting floats into the FloatBuffer: the FloatBuffer.put methods are just too slow. It still requires Unsafe or JNI to make it fast. I will update the issue.
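
For readers following along, here is a minimal Java sketch of the staging idea discussed above: floats are collected in a plain float[] and copied into a direct FloatBuffer with one bulk put per flush, instead of one put call per float. The class name VertexBatch and its methods are hypothetical and not taken from Kha or #1041. The sketch only illustrates the shape of the approach; it makes no claim about how fast it runs on Android, and the thread reports that a similar NativeArray<Single> change was still slow.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical helper: stages vertex floats on the heap, uploads them in one bulk put. */
final class VertexBatch {
    private final float[] staging;        // plain heap array, written per float
    private final FloatBuffer gpuBuffer;  // direct buffer that would be handed to GL
    private int count;                    // number of floats currently staged

    VertexBatch(int capacityFloats) {
        staging = new float[capacityFloats];
        gpuBuffer = ByteBuffer.allocateDirect(capacityFloats * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Appends one float; returns false when the batch is full and must be flushed first. */
    boolean push(float value) {
        if (count == staging.length) {
            return false;
        }
        staging[count++] = value;
        return true;
    }

    /** Copies all staged floats with a single bulk put and returns the buffer ready for reading. */
    FloatBuffer flush() {
        gpuBuffer.clear();
        gpuBuffer.put(staging, 0, count);
        gpuBuffer.flip();
        count = 0;
        return gpuBuffer;
    }
}
```

A renderer would call push for each vertex component during drawImage and call flush once per draw call, passing the returned buffer to the GL upload function.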

@RblSb
Contributor Author

RblSb commented Sep 12, 2019

I checked the JNI approach with a direct ByteBuffer and memcpy, but it is obviously bad because of the JNI overhead on every get/set call. I also prepared the vertices with baked colors/texcoords so that every drawImage changes only the rect coords, but this gives only a 1.5x speedup, so there are other hotspots in rendering. I'm too lazy to install an emulator for profiling, and it doesn't make sense anyway. I am still interested in how the libGDX Java backend handles bunnymark, if it has the same immediate-mode rendering (which seems almost impossible in Java to me now).
Fun fact: with the NDK set up, the Gradle build slows down from 5 to 30 s with a single C file.
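
The baked-vertex idea mentioned in this comment could look roughly like the Java sketch below. It is an illustration only, not the code that was benchmarked: the class name BakedQuads, the layout of five floats per vertex (x, y, u, v, packed color), and the assumption that every sprite uses the whole texture and the same color (as in a bunnymark) are all hypothetical. Texture coordinates and color are written once in the constructor, and each draw rewrites only the eight position floats of its quad.

```java
/** Hypothetical quad storage: x, y, u, v, color per vertex, 4 vertices per quad. */
final class BakedQuads {
    static final int FLOATS_PER_VERTEX = 5;
    static final int FLOATS_PER_QUAD = 4 * FLOATS_PER_VERTEX;

    final float[] vertices;

    BakedQuads(int maxQuads, float packedColor) {
        vertices = new float[maxQuads * FLOATS_PER_QUAD];
        // Corner order: top-left, top-right, bottom-right, bottom-left.
        float[] u = {0f, 1f, 1f, 0f};
        float[] v = {0f, 0f, 1f, 1f};
        for (int q = 0; q < maxQuads; q++) {
            for (int c = 0; c < 4; c++) {
                int base = q * FLOATS_PER_QUAD + c * FLOATS_PER_VERTEX;
                vertices[base + 2] = u[c];        // baked once
                vertices[base + 3] = v[c];        // baked once
                vertices[base + 4] = packedColor; // baked once
            }
        }
    }

    /** Per-draw work: writes only the position floats of one quad. */
    void setRect(int quad, float x, float y, float w, float h) {
        int base = quad * FLOATS_PER_QUAD;
        vertices[base] = x;
        vertices[base + 1] = y;
        vertices[base + 5] = x + w;
        vertices[base + 6] = y;
        vertices[base + 10] = x + w;
        vertices[base + 11] = y + h;
        vertices[base + 15] = x;
        vertices[base + 16] = y + h;
    }
}
```

This cuts the per-quad writes from 20 floats to 8, which is in line with a modest gain rather than an order-of-magnitude one; the array still has to be copied into a FloatBuffer every frame.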

Spoiler with silly things
#include "test.h"
#include <stdlib.h>
#include <string.h>

// Bulk copy: len floats from src (starting at element offset) to the start of the direct buffer dst.
JNIEXPORT
void JNICALL Java_jni_Test_copy(JNIEnv *env, jclass clazz, jobject dst, jfloatArray src, jint offset, jint len) {
	unsigned char* pDst = (unsigned char*)(*env)->GetDirectBufferAddress(env, dst);
	float* pSrc = (float*)(*env)->GetPrimitiveArrayCritical(env, src, 0);
	memcpy(pDst, pSrc + offset, len * sizeof(float));
	// src is only read, so JNI_ABORT releases it without copying anything back.
	(*env)->ReleasePrimitiveArrayCritical(env, src, pSrc, JNI_ABORT);
}

// Writes one float at element index; one JNI call per float.
JNIEXPORT
void JNICALL Java_jni_Test_set(JNIEnv *env, jclass clazz, jobject dst, jint index, jfloat value) {
	// index counts floats, not bytes, so address the buffer through a float pointer.
	float* pDst = (float*)(*env)->GetDirectBufferAddress(env, dst);
	memcpy(pDst + index, &value, sizeof(float));
}

// Reads one float at element index; one JNI call per float.
JNIEXPORT
jfloat JNICALL Java_jni_Test_get(JNIEnv *env, jclass clazz, jobject dst, jint index) {
	float* pSrc = (float*)(*env)->GetDirectBufferAddress(env, dst);
	float value;
	memcpy(&value, pSrc + index, sizeof(float));
	return value;
}

package jni;

@:classCode('
	static {
		System.loadLibrary("kore");
	}
')
class Test {
	@:native public static function copy(
		to: java.nio.ByteBuffer, from: java.NativeArray<Single>,
		offset: Int, size: Int
	): Void;
	@:native public static function get(to: java.nio.FloatBuffer, i: Int): Single;
	@:native public static function set(to: java.nio.FloatBuffer, i: Int, value: Single): Void;
}

build.gradle requires only two externalNativeBuild blocks and an optional ndk {abiFilters 'armeabi-v7a'}; CMakeLists needs only add_library(kore SHARED "path/files.c" "path/files.h")

@RobDangerous
Member

Looks like this is a won't-fix.
