Java-Gaming.org    
  LWJGL: Random direction thoughts  (Read 4181 times)
Offline Roquen
« Posted 2013-11-25 16:30:17 »

Rolling out a major version release is always an opportunity to rethink everything.  Here's a start of some random thoughts I have WRT LWJGL.

Some term definitions:
java: any JVM language
developer: a LWJGL internals developer
user: programmer using LWJGL & java (and generally not doing any custom native side coding)

Choose a baseline compatibility version and nuke everything older than that.

Goals: reduce long-term developer time commitment and reduce noise presented to users.

Arguments against:
1) It'll break my program.
A) No it won't.  Version 2.x will still be available.  Upgrade or don't.  Choosing a library isn't some e-peen achievement system.
2) I can't scale my program as well to different targets.
A) The statistically zero percent of people where this is a concern and can actually do so will have zero problem supporting both and loading the appropriate version at runtime.
3) What about folks that want to use the old-school API as a learning tool?
A) Version 2.x is possible, but a better solution is to use a mid to high level API instead.  Programming to a graphics API basically older than the average JGO member isn't a fantastic idea.

The thing to remember is that you don't need to think about today..nor even when V3 is considered usable, but when you guess that more users are writing against LWJGL 3.x instead of 2.x.  Personally 3.0 is probably the furthest back in history I'd think about.

Lose functionality writable in java
No reason for these to be in the main repo unless they are considered important to base functionality (aka support for using the low-level...but not things like look-at matrices) or used internally.  One or more companion libraries (hosted by LWJGL or 3rd-party) should cover this functionality.

I have a fair number of other ideas, but spreading them out across time probably is better than one meta-post.
Offline princec

JGO Kernel


Medals: 380
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #1 - Posted 2013-11-25 16:57:37 »

I think that's basically the entire plan of LWJGL in a nutshell, there.

Cas Smiley

Offline jmguillemette
« Reply #2 - Posted 2013-11-25 17:22:09 »

Is there any compatibility issue to be concerned with?

Ie.. if i want to support both Mac and Windows will i be able to use the same LWJGL version ?


(May not be an issue but thought i would ask)


Offline Roquen
« Reply #3 - Posted 2013-11-26 11:04:54 »

Windows, mac and linux all with the same version = yes.

@Cas: Well then:  that's awesome.  I fully endorse the basic plan.

The next batch is along the same lines, but a step away from basic GL/AL/CL support and on to some useful things for a "user" in "java" (same defs as above) + LWJGL.

Expose some hardware information

1) Number of physical processors.  Being able to know both the number of logical and physical processors is useful (aka more or less required).  Example: determining the number of threads to have active per timeslice.

2) A limited number of hardware supported CPU opcode queries.  From a user perspective the only ones worth knowing about are those for which there is both a java equivalent method call AND that method call is a JVM intrinsic.  If both of these aren't true then there's nothing in java that the user can do with the information.  This list is currently very small (This is based on an early version of 7...I need to recheck the intrinsic list at some point. This old list is here):

  leading zero count, trailing zero count, population count and byteswap.
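For reference, a minimal sketch of how those four operations are spelled in the standard library. The class name `BitOpsDemo` and the method `summarize` are made up for illustration; only the `java.lang.Integer` calls are real. Whether each call compiles down to a single opcode depends on the JVM and the CPU, which is exactly what the proposed query would report.

```java
final class BitOpsDemo {
    /**
     * Returns {leading zero count, trailing zero count, population count, byteswap}
     * for x, using only java.lang.Integer methods.
     */
    static int[] summarize(int x) {
        return new int[] {
            Integer.numberOfLeadingZeros(x),   // leading zero count
            Integer.numberOfTrailingZeros(x),  // trailing zero count
            Integer.bitCount(x),               // population count
            Integer.reverseBytes(x)            // byteswap
        };
    }

    public static void main(String[] args) {
        int[] r = summarize(0x000000F0);
        // 0x000000F0 has 24 leading zeros, 4 trailing zeros and 4 set bits;
        // swapping its bytes gives 0xF0000000.
        System.out.printf("lzc=%d tzc=%d popcount=%d bswap=0x%08X%n", r[0], r[1], r[2], r[3]);
    }
}
```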

Trivial example:

static boolean isPowerOfTwo(int x)
{
  // CPUInfo.hasBitCount stands for the proposed (not yet existing) query.
  // Compiler will drop the dead code
  if (CPUInfo.hasBitCount)
    return Integer.bitCount(x) == 1;   // two opcodes if supported...many if not.

  // isolate low bit, if same as input and not zero then true
  return (x & -x) == x && x != 0;
}


Counter examples include:  SIMD operations: not exposed in java so no point.  Atomic increment: there are method calls but they are (currently) not intrinsic.


Some others might be of interest to developers since they could use that information and they could potentially expose a method via JNI.

3) Size of caches and the lengths of cache-lines.  Since java is runtime compiled this would allow building cache-aware data-structures.
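A rough sketch of what "cache-aware" could look like if the cache-line length were known at runtime: per-thread counters spaced one cache line apart, so that two counters never share a line. The class `PaddedCounters` is hypothetical, and `cacheLineBytes` is assumed to come from a query like the one proposed here (64 is only a common value, not something java can report today).

```java
import java.util.concurrent.atomic.AtomicLongArray;

final class PaddedCounters {
    private final AtomicLongArray slots;
    private final int stride; // number of longs that span one cache line

    PaddedCounters(int counters, int cacheLineBytes) {
        if (counters <= 0 || cacheLineBytes < Long.BYTES) {
            throw new IllegalArgumentException("need at least one counter and a line of 8+ bytes");
        }
        this.stride = cacheLineBytes / Long.BYTES;
        // One full line per counter: consecutive counters are a whole line apart,
        // so they cannot land in the same line whatever the array's alignment is.
        this.slots = new AtomicLongArray(counters * stride);
    }

    long increment(int counter) {
        return slots.incrementAndGet(counter * stride);
    }

    long get(int counter) {
        return slots.get(counter * stride);
    }

    int stride() {
        return stride;
    }
}
```

Usage would be along the lines of `new PaddedCounters(threadCount, cacheLineBytes)`, with each worker thread incrementing only its own index. The first counter can still share a line with the array header or a neighbouring object; the sketch only separates the counters from each other.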


(EDIT: my example was checking for LZC where it should have been checking POPCOUNT...doh!)
Offline kappa
« League of Dukes »

JGO Kernel


Medals: 77
Projects: 15


★★★★★


« Reply #4 - Posted 2013-11-26 11:58:31 »

1) is already doable with Java using Runtime.getRuntime().availableProcessors();

Not sure how useful 2) & 3) would actually be; they seem a little out of scope for what LWJGL does and better suited to be part of the Java Runtime. Even so, it's rare that such information will be needed by the users.

If you really do need that sort of fine grained information and want to optimise for certain processors you can use something like the following:

System.getenv("PROCESSOR_IDENTIFIER");
System.getenv("PROCESSOR_ARCHITECTURE");
System.getenv("PROCESSOR_ARCHITEW6432");
System.getenv("NUMBER_OF_PROCESSORS");


To identify the processor, should be easy from there to match up its spec and then optimise accordingly.
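A slightly fuller sketch of the same idea, assuming these variables are only set on Windows, so every lookup may return null elsewhere. The class `ProcessorEnv` and the "unknown" fallback are illustrative choices, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProcessorEnv {
    // Environment variable names taken from the snippet above.
    static final String[] KEYS = {
        "PROCESSOR_IDENTIFIER",
        "PROCESSOR_ARCHITECTURE",
        "PROCESSOR_ARCHITEW6432",
        "NUMBER_OF_PROCESSORS"
    };

    /** Copies the processor-related entries out of env, substituting "unknown" for missing ones. */
    static Map<String, String> read(Map<String, String> env) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String key : KEYS) {
            String value = env.get(key);
            out.put(key, value != null ? value : "unknown");
        }
        return out;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : read(System.getenv()).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```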
Offline theagentd
« Reply #5 - Posted 2013-11-26 12:46:29 »

Quote
1) is already doable with Java using Runtime.getRuntime().availableProcessors();
The problem with availableProcessors() is that it doesn't distinguish between physical and logical cores. Personally I don't think that's a problem, but it could be useful to know how many physical cores the computer has.

Myomyomyo.
Offline Roquen
« Reply #6 - Posted 2013-11-26 13:22:53 »

Yes: availableProcessors is the number of logical processors.  Neither piece of information is very usable on its own.  If you assume logical == physical and that isn't the case then you're likely to spawn too many and pay the very high cost of context switches over and over.  If you assume physical == logical*magic_factor and logical == physical then you're under-utilizing.

Doing 2 & 3 in pure java is possibly doable (depending on environment variables seems very fragile and non-portable) but is a real PITA as the code needs to be kept up to date with processors (I'm not even sure that you can get exact results with this information).  On the CPU side this isn't the case as the CPUID queries are fixed...one time development cost and you're done forever.  If these 'kinds' of queries were deemed reasonable for LWJGL to expose then it could make sense to use an overkill library (again one-time cost assuming the library is maintained) and to expose more features as deemed appropriate.  Say thread affinities.

This really isn't very much work.  In my opinion it is in the spirit of LWJGL's goal of providing users working in java the low-level functionality needed to write multimedia apps without resorting to native code.
Offline delt0r

JGO Knight


Medals: 27
Exp: 18 years


Computers can do that?


« Reply #7 - Posted 2013-11-26 13:23:49 »

Can you even get that info? I mean with virtualization, hyper threading etc does the number of "physical cores" even have a simple answer?

I have no special talents. I am only passionately curious.--Albert Einstein
Offline theagentd
« Reply #8 - Posted 2013-11-26 13:28:18 »

Is there any case when you would not want to use all available logical cores? I know that hyperthreading can hurt performance in some games, but Java can take a lot of benefit from hyperthreading since it helps massively when you have lots of cache misses which is very common with Java's memory model.

Myomyomyo.
Offline delt0r

« Reply #9 - Posted 2013-11-26 13:35:14 »

Well personally i like to be able to restrict how many cpu's a game is running when i have sims going. Just because i have cores does not mean i want the game/app to use them

This is however an edge case. I just wish people would always let such settings be manually overridden.
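A minimal sketch of such a manual override, assuming a made-up system property name (`game.maxThreads`) and a made-up class `ThreadBudget`. Capping the request at the logical processor count is one possible design choice, not a requirement.

```java
final class ThreadBudget {
    // Hypothetical property name; e.g. start the game with -Dgame.maxThreads=2
    static final String PROPERTY = "game.maxThreads";

    /** Returns the user's requested thread count if valid, otherwise the logical processor count. */
    static int resolve(String override, int logicalProcessors) {
        if (override != null) {
            try {
                int requested = Integer.parseInt(override.trim());
                if (requested >= 1) {
                    return Math.min(requested, logicalProcessors);
                }
            } catch (NumberFormatException ignored) {
                // Unparseable override: fall through to the default.
            }
        }
        return logicalProcessors;
    }

    /** Convenience overload reading the real property and the real processor count. */
    static int resolve() {
        return resolve(System.getProperty(PROPERTY), Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        System.out.println("worker threads: " + resolve());
    }
}
```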

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Roquen
« Reply #10 - Posted 2013-11-26 13:50:17 »

@delt0r: yes the information is available (I'm not sure what you're thinking about in terms of the virtualization feature)

@theagentd: yes you want all logical cores to have an actively running thread for all timeslices to maximize throughput.  Knowing the number of physical cores allows you to make a more accurate estimation.  So if one piece of the puzzle is more important, then java is currently providing the most important one.  Intel claims you can get up to a 30% boost per core with HT...I've only ever measured in the 10-15% range (if memory serves).  Having a rough idea how something scales on my N core HT box gives a pretty reasonable guess about how many I should spawn on a M core box with or without HT.
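The kind of back-of-the-envelope estimate described above could be sketched as follows. This assumes perfectly linear scaling across physical cores and a single measured HT gain per core, both of which are simplifications; the class `HtScaling` and its parameters are hypothetical, and the physical core count would have to come from a query java does not currently offer.

```java
final class HtScaling {
    /**
     * Estimated speedup over one thread on one core when every logical core runs a thread.
     * htGainPerCore is the measured extra throughput from the second hardware thread,
     * e.g. 0.15 for a 15% boost.
     */
    static double expectedSpeedup(int physicalCores, int logicalCores, double htGainPerCore) {
        if (physicalCores < 1 || logicalCores < physicalCores || htGainPerCore < 0.0) {
            throw new IllegalArgumentException("invalid core counts or gain");
        }
        boolean hyperThreaded = logicalCores > physicalCores;
        return hyperThreaded ? physicalCores * (1.0 + htGainPerCore) : physicalCores;
    }

    public static void main(String[] args) {
        System.out.println(expectedSpeedup(4, 8, 0.15)); // 4 cores with HT
        System.out.println(expectedSpeedup(6, 6, 0.15)); // 6 cores without HT
    }
}
```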
Offline Spasi
« Reply #11 - Posted 2013-11-26 14:15:30 »

I've been looking at hwloc. It has a simple C API (easily added to LWJGL), works on basically every OS and CPU out there and has great features. Among many others, you can query:

- Total physical memory.
- Number of cache levels, how big each level is, which cores share which level.
- Number of processing units per core (hyper-threading).
- Cache-line sizes and cache associativity.

It also provides a cross-platform API to pin threads/processes to CPU cores (affinity).

This might seem like too compute-oriented or for server workloads (e.g. hwloc exposes per-socket information but we'll never encounter multiple sockets on a gaming machine). I honestly have stopped treating LWJGL as a pure gaming library since we added support for OpenCL, so I really think this will be useful functionality to have. Even for games, core information can be quite helpful if secondary threads are used (for physics, sound processing, asset loading, etc). You generally don't want to mix two "compute-heavy" threads in the same hyper-threaded core, performance will suffer. Also, using a dedicated core for latency-sensitive stuff can be very beneficial.
Offline Spasi
« Reply #12 - Posted 2013-11-26 14:20:59 »

Btw, I have not been able to find a library for CPU feature detection that is acceptably cross-platform/maintained/reliable. Only libcpuid comes close, but I'd like something for ARM CPUs as well. Does anyone know of a decent alternative?
Offline delt0r

« Reply #13 - Posted 2013-11-26 14:23:55 »

Quote
You generally don't want to mix two "compute-heavy" threads in the same hyper-threaded core, performance will suffer.
This has been our experience to the point where hyper threading is disabled on our clusters.

As for including this or any other performance related feature: I can't see any harm if it's easy, doesn't create unreasonable dependencies and fails somewhat gracefully.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Roquen
« Reply #14 - Posted 2013-11-26 15:46:27 »

Interesting.  That seems to point toward supporting thread-affinities as being pretty desirable. 
Offline Roquen
« Reply #15 - Posted 2013-11-27 10:08:01 »

WRT: hwloc.  My initial thought was overkill, but it seems to provide reasonable features and to be maintained, so if the API is nice and easy and if it is simple to integrate...why not?  The risk appears to be low.

WRT: libcpuid like library which supports Intel & ARM.  Seems like an unlikely animal...too few projects would need such a thing.  If one exists it might be harder to find vs. roll-your-own.  My brain isn't pulling up any likely places to look.

WRT: compute + HT: This is just personal curiosity...what is the functional unit(s) conflict that causes the issue?

Some other random thoughts that are of lesser interest from my perspective (and greater risk & dev cost):
1) Performance counter queries.  (pretty arch specific...but that's fine for the target audience)
2) CPU local variables (this is like thread local but per CPU instead of thread).

I should note that some of these thoughts are based on:
1) make LWJGL usable to a wider audience, which widens the pool of potential contributors.
2) perceived quality.  fancy features tend to make things more attractive to people for some strange reason.
Offline kappa
« Reply #16 - Posted 2013-11-27 10:33:22 »

While fancy new features are nice they tend to only be useful for and used by a small niche. Therefore IMO they are probably better suited as extra extensions/utilities or as an addon library rather than part of the core.

IMO the more code and features there are the more there is that can break and needs maintaining plus it makes the library less flexible when it comes to stuff like porting to new platforms, compiling to native code, etc. Do less but do it well.

From what I can tell the vast majority of applications just use/need a robust windowing system + OpenGL (including ARB extensions) + OpenAL. The nice thing is LWJGL3 is designed to be pretty modular (unlike LWJGL2) so should be pretty easy to just build custom versions using the build files.
Offline delt0r

« Reply #17 - Posted 2013-11-27 10:50:20 »

The hyperthreading thing was in the context of simulations. So fairly optimized code that is vectorized with little branching. In this case HT was overall slower than none. Since just about all code that is run on the high performance clusters is like that, most have it disabled. We should note it wasn't a huge difference, 10-20% or thereabouts.

This is not the case for the "general purpose" clusters. ie where a lot of python scripts and database intensive jobs are run.

For games and desktop apps i have no idea what would work out faster. I would suspect HT to win slightly or to be a tie since there should be more waiting around for most processes.

I have no special talents. I am only passionately curious.--Albert Einstein
Offline Roquen
« Reply #18 - Posted 2013-11-27 11:28:58 »

I'm not seeing what causes the problem.  You have 2 threads:  M (for main) and H (for hyperthreaded).  M can only issue and retire a fixed number of ops per clock and (as I understand...unless there's some balancing scheme of which I'm unaware) should always run at full speed.  H jumps in and executes some ops when M isn't using the execution-unit/resource in question...so I can see how H might be starved but not how total throughput is reduced.

Don't anyone spend a moment on this...just wondering if anyone knows off the top of their head.
Offline theagentd
« Reply #19 - Posted 2013-11-27 11:32:39 »

Quote
Intel claims you can get up to a 30% boost per core with HT...I've only ever measured in the 10-15% range (if memory serves).  Having a rough idea how something scales on my N core HT box gives a pretty reasonable guess about how many I should spawn on a M core box with or without HT.
I've seen performance boosts of 20-30% in real code, although that code wasn't very optimized in my case. On my i7, I got 4.9x scaling using 8 threads. Possible, but unlikely with well written code. Just my 2 cents.

Myomyomyo.
Offline Roquen
« Reply #20 - Posted 2013-11-27 15:11:52 »

@kappa:  I missed your post earlier.  It may not sound like it but we're pretty much on the same page..."worse-is-better".  Focus on initial core functionality, any required support and avoid feature-creep-itis.

I'm just tossing out ideas and seeing what people think.  Is feature-X potentially core, utility, 3rd party or WTF?  If core does it make sense to do it now (low risk & dev-time...and not too many of these to keep from getting sick)? 

Example:  I think that "CPU local" is WTF...but it's a feature I can't imagine ever using.  On the flip side knowing if those 4 intrinsic methods are backed by an opcode is a potential for now (as popcount IS used in a support function).  Of course LWJGL direct usage of that call will never amount to anything but if someone were to use it for variable length coding, in a codec, etc...then the picture changes quite a bit.  Low risk, dev-time and they can't do themselves in java.

Even though most of these features are "niche"...it doesn't really matter.  The target is really library writers (including higher level APIs) and advanced users.

On building custom versions:  I'm ignoring people willing to go there and/or write direct native code.  All of this is pretty much moot for people both able and willing to do so.
Offline Roquen
« Reply #21 - Posted 2013-12-18 15:38:50 »

Nothing to do with version 3....what's going on with code like this (which is all over the place):

/**
 * @return Number of buttons on this mouse
 */
public static int getButtonCount() {
  synchronized (OpenGLPackageAccess.global_lock) {
    return buttonCount;
  }
}


Offline Spasi
« Reply #22 - Posted 2013-12-18 16:33:14 »

It's 7 year old code related to AWT interop. Most of the horrible stuff in the LWJGL codebase is related to it.
Offline Roquen
« Reply #23 - Posted 2013-12-18 16:47:46 »

Is the plan for 3.x to yank this kind of stuff?  (I'd vote for yes)
Offline Spasi
« Reply #24 - Posted 2013-12-18 17:02:11 »

Absolutely. Thread safety responsibility is being moved from the library to the user, where appropriate (which is virtually everywhere). There's no plan to have AWT interop in LWJGL 3 and any other kind of interop (JavaFX, SWT, etc) will be strictly "non-core"/optional functionality.
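As an illustration only of what "responsibility moved to the user" might look like in practice: the library-side getter does no locking, and user code that really shares the state across threads adds its own. Both classes below (`MouseState`, `InputOwner`) are invented for this sketch and are not LWJGL classes.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Stand-in for a library class: plain state, no internal synchronization. */
final class MouseState {
    private int buttonCount;

    void setButtonCount(int count) {
        buttonCount = count;
    }

    int getButtonCount() {
        return buttonCount; // no global lock taken here
    }
}

/** User-side wrapper that adds locking only because this application shares the state. */
final class InputOwner {
    private final ReentrantLock lock = new ReentrantLock();
    private final MouseState mouse = new MouseState();

    void update(int buttonCount) {
        lock.lock();
        try {
            mouse.setButtonCount(buttonCount);
        } finally {
            lock.unlock();
        }
    }

    int buttons() {
        lock.lock();
        try {
            return mouse.getButtonCount();
        } finally {
            lock.unlock();
        }
    }
}
```

A single-threaded game loop would call `MouseState` directly and pay nothing for locking; only code that polls input from another thread would need something like `InputOwner`.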