NIO: yet another lethal stupid bug
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Posted 2005-06-05 10:25:54 »

FileChannel.transferTo( Channel ) - this method can corrupt your data in 1.4.2, probably in 1.5.x too (can't find any fixed bugs on this subject).

I just found a condition that's causing Opera to get very upset which, when traced deep down, involves a file that is being sent using FileChannel.transferTo and dropping 2 bytes, consistently, in the same place. Suspiciously, this is precisely after sending the first 16 bytes.

Eventually, I got it narrowed down to the fact that NIO is reporting "96 bytes transferred" which happens to be the length of the source file (correct) but it is in fact only sending 94 of those bytes to the channel. This is an atomic operation (just one method in NIO) and it's failing: I'm 99% sure this is nothing to do with my code - I don't even think it's possible to write code that takes stuff out of a channel that has already been put in. It's very difficult to produce a "small" testcase because it only happens in NIO file-to-network, and you need 500 lines of code or so to get a full NIO network system running to demo with.

NB: the file in question is precisely 96 bytes in length; i.e. 2^6 + 2^5, which leads me to speculate this is a bug dependent upon a precise file length.
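For anyone who wants to check this on their own machine, a reproduction attempt does not strictly need a full selector-based system. The sketch below is not the original code from this thread: it assumes a modern JDK (java.nio.file, try-with-resources, lambdas), blocking channels and a loopback connection, and the class and method names are made up for illustration. It pushes a file through FileChannel.transferTo into a SocketChannel and compares what arrives with what is on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class TransferToCheck {

    /** Sends the file over a loopback socket via transferTo; returns the bytes that arrived. */
    public static byte[] sendOverLoopback(Path file) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            ByteArrayOutputStream received = new ByteArrayOutputStream();

            // Receiver: accept one connection and read until the sender closes it.
            Thread reader = new Thread(() -> {
                try (SocketChannel in = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(4096);
                    while (in.read(buf) != -1) {
                        buf.flip();
                        received.write(buf.array(), 0, buf.limit());
                        buf.clear();
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            reader.start();

            // Sender: transferTo may move fewer bytes than requested, so loop on its return value.
            try (SocketChannel out = SocketChannel.open(server.getLocalAddress());
                 FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
                long pos = 0;
                long size = src.size();
                while (pos < size) {
                    pos += src.transferTo(pos, size - pos, out);
                }
            }
            reader.join();
            return received.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]);
        byte[] onDisk = Files.readAllBytes(file);
        byte[] arrived = sendOverLoopback(file);
        System.out.println("on disk: " + onDisk.length + " bytes, arrived: " + arrived.length
                + " bytes, identical: " + Arrays.equals(onDisk, arrived));
    }
}
```

Run it with the path of the suspect file as the only argument. Because it uses blocking channels on one machine, a clean result here would not rule out a problem in a non-blocking, selector-driven setup like the one described above.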

malloc will be first against the wall when the revolution comes...
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #1 - Posted 2005-06-05 12:19:27 »

Update: it's not just transferTo that's broken, it's apparently the SocketChannel itself: even doing a FileChannel.read (96 bytes) into a buffer followed by a SocketChannel.write (96 bytes) actually only writes 94 bytes.
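One thing worth ruling out in the manual read-then-write path: a single write call on a channel is allowed to accept fewer bytes than the buffer holds, so the caller has to loop until the buffer is drained. This is not claimed to be the cause of the problem described here (it would not explain transferTo reporting 96), and the helper name below is hypothetical. It is only a minimal sketch of the drain loop.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class ChannelWrites {

    /**
     * Keeps calling write until the buffer has no bytes remaining.
     * Returns the total number of bytes the channel reported as written.
     */
    public static int writeFully(WritableByteChannel out, ByteBuffer buf) throws IOException {
        int total = 0;
        while (buf.hasRemaining()) {
            // write() returns how many bytes it consumed from the buffer on this call.
            total += out.write(buf);
        }
        return total;
    }
}
```

On a non-blocking SocketChannel this loop would spin while the socket's send buffer is full; a selector-based server would normally register interest in SelectionKey.OP_WRITE and resume later instead.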

So...my best guess right now is that some of the bytes trigger some kind of bug in socketchannel or in linux's I/O system such that it removes them from the stream.

I've noticed that the actual bytes that go missing are all 255 == FF - commonly used for various "special" meanings within a stream, e.g. EOF.

So, in the stream we have:
 FF FF FF
and in the output we have:
 FF

Personally, my guess is that the java nio library is translating "FF FF" as "" :(
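To pin down where two streams diverge, a byte-level comparison of what was sent against what arrived can help. The following is a small sketch with made-up names and made-up sample bytes, not data captured from the actual GIF.

```java
public class ByteDiff {

    /** Index of the first position where the arrays differ, or -1 if they are identical. */
    public static int firstMismatch(byte[] sent, byte[] received) {
        int common = Math.min(sent.length, received.length);
        for (int i = 0; i < common; i++) {
            if (sent[i] != received[i]) {
                return i;
            }
        }
        // Same prefix: either identical, or one array is simply shorter.
        return sent.length == received.length ? -1 : common;
    }

    /** Formats bytes as space-separated upper-case hex, e.g. "FF FF FF". */
    public static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative data only: three FF's sent, one FF received.
        byte[] sent = {0x01, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, 0x02};
        byte[] received = {0x01, (byte) 0xFF, 0x02};
        System.out.println("sent:     " + hex(sent));
        System.out.println("received: " + hex(received));
        System.out.println("first mismatch at offset " + firstMismatch(sent, received));
    }
}
```

One caveat: when bytes vanish from a run of identical values, the first mismatch shows up where the run ends in the shorter stream, not necessarily where the bytes were actually dropped.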

Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #2 - Posted 2005-06-05 12:20:32 »

FYI, I have logged a bug with Sun, but I can't comment on it and add the FF FF info until they reply to the initial bug report (which they blatantly aren't going to for at least 3 months going on the last few bugs I filed).

Sob.

Offline Matzon

JGO Knight


Medals: 19
Projects: 1


I'm gonna wring your pants!


« Reply #3 - Posted 2005-06-05 12:47:50 »

so what happens if you send a binary file of 100 0xff's ?

Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #4 - Posted 2005-06-05 14:18:33 »

Quote
so what happens if you send a binary file of 100 0xff's ?

...let me get back to you on that; working through a battery of tests at the moment, in the vain hope it's something somewhere in my code, or the grexengine - or something I can work around using NIO. At the moment, I'm stuck with no workaround :( short of "edit the file and remove the FF's".

Offline swpalmer

JGO Coder


Exp: 12 years


Where's the Kaboom?


« Reply #5 - Posted 2005-06-05 14:54:50 »

I see a potential link between a data stream that ends exactly with FF FF FF FF and the EOF condition, which of course is also reported as -1.  The reading of those last FF's as the true EOF is interesting.  But perhaps it is simply a red herring.

You mentioned you were testing on Linux; have you reproduced this on any other platforms?

You have only found this to be a problem when writing to SocketChannels, is that right?

Have you confirmed that the FF data is necessary to make this happen?

Curiously enough I was using transferTo for a file copy and found its performance to be a bit slower than expected. I was going to just leave it as is, but this report has me a bit concerned that perhaps I should stick to the 'old way'?

Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #6 - Posted 2005-06-05 15:19:34 »

Quote
so what happens if you send a binary file of 100 0xff's ?

I ff'd the original file (96 bytes of FF), and I get:

49 bytes of FF

IMHO this is definitely a bug.

Could be NIO (most likely IMHO), or could be linux (you never know...)

Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #7 - Posted 2005-06-05 15:32:23 »

Quote
You mentioned you were testing on Linux, I wonder have you reproduced this on any other platforms?

I don't have any other ones *convenient* to test upon right now, but will give it a try when I can. Would be good to know...

Quote
You have only found this to be a problem with writing to SocketChannels is that right?

Only:
   FROM filechannel
   TO socketchannel

Quote
Have you confirmed that the FF data is necessary to make this happen?

Yes. It only happens in that one file, which is a GIF which happens to have 3 consecutive FF's.

Zeroing the first FF makes the bug go away entirely (i.e. 2 FF's is fine, it's having three that causes two bytes to be lost)
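The two-versus-three observation lends itself to a small set of synthetic inputs that differ only in the length of the FF run. The helper below is a sketch with a hypothetical name; the 96-byte length comes from this thread, while the offset of 16 used in main is only a guess based on the earlier remark about the first 16 bytes.

```java
import java.util.Arrays;

public class FfVariants {

    /** Returns a zero-filled array of the given length with `run` bytes of 0xFF starting at `offset`. */
    public static byte[] withFfRun(int length, int offset, int run) {
        byte[] data = new byte[length];
        Arrays.fill(data, offset, offset + run, (byte) 0xFF);
        return data;
    }

    public static void main(String[] args) {
        // Runs of 1 to 4 FF's; offset 16 is an assumption, not a confirmed detail of the GIF.
        for (int run = 1; run <= 4; run++) {
            byte[] data = withFfRun(96, 16, run);
            System.out.println("run of " + run + " FF's, total length " + data.length);
        }
    }
}
```

Writing each variant to disk and sending it through the same file-to-socket path would show whether the run length alone decides if bytes go missing.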

Quote
Curiously enough I was using transferTo for a file copy and found it's performance to be a bit slower than expected.. I was going to just leave it as is, but this report has me a bit concerned that perhaps I should stick to the 'old way'?

Note that I have *no workaround* for this bug!

Also note that *there is no way of detecting the bug in your own code*! (because in their infinite wisdom the sun staff implementing NIO didn't provide any way to examine the state of a channel). The first time you learn of it will be when some important file has been corrupted. Not good.
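The channel itself offers no such check, but the receiving application can at least notice a damaged transfer if the sender announces the length and a checksum up front. This was not proposed in the thread and does not fix anything; it is only a sketch of an application-level check, with a made-up frame layout (4-byte length, 8-byte CRC32 value, then the payload) and hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class FramedPayload {

    private static final int HEADER_BYTES = 12; // 4-byte length + 8-byte checksum

    /** Prepends the payload length and its CRC32 to the payload. */
    public static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(HEADER_BYTES + payload.length)
                .putInt(payload.length)
                .putLong(crc.getValue())
                .put(payload)
                .array();
    }

    /** Returns true only if the length and checksum both match what the header promised. */
    public static boolean verify(byte[] framed) {
        if (framed.length < HEADER_BYTES) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int announcedLength = buf.getInt();
        long announcedCrc = buf.getLong();
        if (announcedLength != buf.remaining()) {
            return false; // bytes were lost or added in transit
        }
        CRC32 crc = new CRC32();
        crc.update(framed, HEADER_BYTES, announcedLength);
        return crc.getValue() == announcedCrc;
    }
}
```

With transferTo, the header would have to be written to the socket separately before the file contents are transferred, and the receiver needs a way to ask for a resend when verify fails.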

I am beginning to wonder to what extent sun's NIO staff deliberately broke NIO on windows and linux. NB: what follows is tongue-in-cheek...

Bear in mind that for the first 2 years after gold release NIO didn't work on Windows. Ultra basic stuff like being unable to have more than 64 clients (WTF!), and selectors that didn't select (!). Likewise, Sun's Linux implementation of NIO did ultra-stupid stuff like using the only form of nb/asynch I/O on Linux that no-one ever uses because it's so crap. Plus all the show-stopping bugs that hung around for > 1.5 years after *gold* release.

Oh, yeah, and the fact that they still haven't written any documentation explaining how to use NIO.

Browsing NIO bug reports today, I noticed an alarming number with "closed, wontfix, couldnt reproduce" with multiple "I get this problem too" comments from different users over the course of a year or more. Having had a bug rejected once before by a sun engineer too lazy to investigate my bug, I know what it feels like (NB: just to be clear, I've had circa 20 bugs logged with sun engineers, most of whom were very helpful, some going way beyond what they needed to do in order to try and help fix the problem - I've only had the one particularly bad experience)

Makes you wonder, no?

Offline vrm

Junior Devvie




where I should sign ?


« Reply #8 - Posted 2005-06-06 08:36:21 »

well you know Sun is slowly but surely dying .. frankly I don't think NIO is less bugged on Solaris, it looks more like a laziness / not-a-priority problem. Perhaps we should take a look at the Classpath NIO implementation?
Offline Mark Thornton

Senior Devvie





« Reply #9 - Posted 2005-06-06 09:28:37 »

Quote
I am beginning to wonder to what extent sun's NIO staff deliberately broke NIO on windows and linux.
I don't think the Windows problems were deliberate, rather just a desire to have more common code than was in fact practical. In particular the need to use different code on Windows 9x/ME and NT derived systems would have been resisted.
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #10 - Posted 2005-06-06 14:17:15 »

Quote
I am beginning to wonder to what extent sun's NIO staff deliberately broke NIO on windows and linux.
I don't think the Windows problems were deliberate, rather just a desire to have more common code than was in fact practical. In particular the need to use different code on Windows 9x/ME and NT derived systems would have been resisted.

That would be fair enough, and I did say "tongue in cheek", but what possessed them not to even check that you could have more than 64 connections? (i.e. I believe they were using IOCP without working around the hardcoded limit on the max number of channels.)

Offline Mark Thornton

Senior Devvie





« Reply #11 - Posted 2005-06-06 14:36:43 »

Quote
That would be fair enough, and I did say "tongue in cheek", but what possessed them not to even check that you could have more than 64 connections? (i.e. I believe they were using IOCP without working around the hardcoded limit of max number of channels.)
The usual struggle to get something out the door. Plus there are numerous places in the API which look like familiar standard (unix) methods, but behave in subtly (or not) different ways. It must have seemed attractive to use a familiar approach instead of grappling with the very different completion ports mechanism.
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #12 - Posted 2005-06-12 15:23:38 »

Quote
Note that I have *no workaround* for this bug!

Also note that *there is no way of detecting the bug in your own code*! (because in their infinite wisdom the sun staff implementing NIO didn't provide any way to examine the state of a channel). The first time you learn of it will be when some important file has been corrupted. Not good.

Still no workaround, still no response from Sun (yay!).

Although I can't find any specific record of having this bug in old JVM's (searched the grex bugs db), I *did* find some customer reports of specific files failing to transfer properly which then suddenly started working again and no-one could find any problems with. Possibly this bug has always been in NIO, and it's only now, with a very very small sample file, that anyone's been able to spot the problem? (the customer bugs probably were on files that got altered, losing their triple-FF, and so became "safe")

Offline vrm

Junior Devvie




where I should sign ?


« Reply #13 - Posted 2005-06-13 07:05:56 »

OMG .. it's probably the same bug I got in my file transfer protocol >:( I gave up on trying to fix it and now use an HTTP server for simple file transfer.
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #14 - Posted 2005-06-13 07:40:41 »

Quote
OMG .. it's probably the same bug I got in my file transfer protocol >:( I gave up on trying to fix it and now use an HTTP server for simple file transfer.

Please, do the world a favour ;) - generate as small a test case as you can and log it with Sun. If they get enough dupes of the same problem, they might realise how serious it is and fix it :)

Offline vrm

Junior Devvie




where I should sign ?


« Reply #15 - Posted 2005-06-13 07:53:49 »

Too much hassle for now :) Embedded Jetty works well for now, but perhaps I will after I've released.
Offline vrm

Junior Devvie




where I should sign ?


« Reply #16 - Posted 2005-06-13 08:53:21 »

I made a simple stress tester for my file server, and it ends up failing even with text files that contain no 0xFFFF, so it's not the same bug (probably mine).