babellog

Sunday, August 22, 2004

Java & Zero base

I was working on this mediation stuff, where I get a binary file in EBCDIC format (yes, it’s still out there) which contains the call data records, and I was supposed to do some mapping. It became interesting when the Call End Date needed to be calculated from the Call Start Date and the Call Duration.

So I resorted to the GregorianCalendar class in Java, where I could add seconds to an existing date. I just passed all the information I had to the constructor to create the date object.

GregorianCalendar(int year, int month, int date, int hour, int minute, int second)

When I looked at the output file, the date was in May even though I was handling a file generated in April. Then of course when I looked at the API documentation, it said:

month - the value used to set the MONTH time field in the calendar. Month value is 0-based. e.g., 0 for January.
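
To make the pitfall concrete, here is a minimal sketch of the calculation described above. The class name, the `callEnd` helper and the sample values are made up for illustration and are not the original mediation code; it assumes the record stores the month the way people write it (1 = January).

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

public class CallEndDate {

    // Hypothetical helper: "month" is the human-readable month from the record (1 = January).
    static Calendar callEnd(int year, int month, int day,
                            int hour, int minute, int second, int durationSeconds) {
        // Subtract 1 because GregorianCalendar expects a 0-based month.
        Calendar end = new GregorianCalendar(year, month - 1, day, hour, minute, second);
        // Let the calendar roll minutes, hours, days and months over as needed.
        end.add(Calendar.SECOND, durationSeconds);
        return end;
    }

    public static void main(String[] args) {
        // Passing 4 straight through means May, not April.
        Calendar wrong = new GregorianCalendar(2004, 4, 30, 23, 59, 30);
        System.out.println(wrong.get(Calendar.MONTH) == Calendar.MAY); // prints "true"

        // A call starting 30 April 2004 at 23:59:30 and lasting 90 seconds.
        Calendar end = callEnd(2004, 4, 30, 23, 59, 30, 90);
        System.out.printf("%04d-%02d-%02d %02d:%02d:%02d%n",
                end.get(Calendar.YEAR),
                end.get(Calendar.MONTH) + 1, // back to a 1-based month for display
                end.get(Calendar.DAY_OF_MONTH),
                end.get(Calendar.HOUR_OF_DAY),
                end.get(Calendar.MINUTE),
                end.get(Calendar.SECOND)); // prints "2004-05-01 00:01:00"
    }
}
```

Using the named constants (`Calendar.APRIL` and friends) instead of raw numbers sidesteps the off-by-one when the month is known at compile time.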

I don’t think it would have been very difficult, programmatically, for whoever designed the API to make the months 1-based instead of 0-based, but it looks like zero-based months date back to the C language's time.h library, where tm_mon runs from 0 to 11.

Here is an interesting read on Java & zero base: A foolish consistency

27 Comments:

  • At 8:54 PM, Blogger longdeparted said…

    I completely agree with you on this one. But Java's Calendar and Date handling functions have always been inconsistent and slightly harder to use than normal.

    For instance: WEEK_OF_MONTH is a value between 1 and 6. WEEK_OF_YEAR, on the other hand, is a value between 1 and 53. MONTH indexing starts from 0, as you point out. In fact, it seems like the whole set of classes were written by a Dr. Jekyll, Mr. Hyde personality ;)

    Just use JODD for the heavy lifting.

     
  • At 3:20 PM, Blogger gumz said…

    so ditto!

     
  • At 3:52 PM, Blogger gumz said…

    Going off on a tangent, what was the real need for 0 base? I'm guessing the origins would have been code generation using indexed addressing, relative addressing etc...(straightforwardish code gen that is) It would have done away with the math jugglery, when initialising index registers and calculating relative addresses. what do you guys think? It seems like a very poor reason.

    Or is it an elegance issue?
    zero base.
    int foo[MAX];
    ...
    for(int i=0; i<MAX; i++)
    ...
    vs.

    one base
    ...
    for(int i=1; i<=MAX; i++)
    ...
    again a very trivial issue. thoughts?

     
  • At 8:58 PM, Blogger edg said…

    Well I don't know java and at the risk of making a total ass of myself here goes, is it possible that the structures play a role.

    For the calendar, maybe there is an underlying structure, an array perhaps, which stores the month names.

    But the other instances Thimal had mentioned might not necessitate such a structure. Therefore they start from 1.

     
  • At 9:14 PM, Blogger longdeparted said…

    Hmm, Ed, I'd like to think it's that way, but shouldn't the interface be universally consistent ? Wouldn't you, as a language (and API) designer strive for consistency at the expense of mirroring the underlying implementation details ?

    Gumz, fascinating question. I realized that I didn't know the origin of zero-based indexing myself, so I googled around a bit and found this. Particularly interesting is the PDF link at the bottom, written by Dijkstra, which seems to give a pretty rational explanation.

     
  • At 12:21 PM, Blogger gumz said…

    that's one brilliant discourse, Dijkstra always was one for putting things down elegantly - brilliant.

    But, I dunno - i feel i have a bone to pick here. I believe the 'cosmetics' argument i gave grazes Dijkstra's point.

    for(i=1; i<Max+1;i++)

    I'm all for mathematical beauty and all that jazz. But what is important is that it has to be as natural as possible for the end user, in this case the programmer. Even at the cost of non-elegance evident in the LOC above.

    But there is the Xerox PARC eg. to contend with. But that was in a very mathematical background, wasn't it. So that brings me to code generation: code generation involves mapping subscripts to memory addresses, and this job is a bit more straightforward with zero base, but the question is do we have to pass on this implementation detail to the code-churners. We have to concede that most code churners are not really mathematical.

    It makes sense to have zero-base in C, 'cause it's used to write compilers. But is it really necessary in Java, which is pretty much an app prog language rather than a systems prog language?

    Personally, I'm more comfy with zero-base, but i think that can be attributed to C. The question is what is good for the app programmer? 0-base or 1-base? Any particular situation, say in java programming, where 0-base makes more sense?

     
  • At 12:36 PM, Blogger gumz said…

    oh good god....

    I do NOT say that C is only for compiler writers - C is for everyone!!!!! And I do NOT think app programmers are inferior beings! And I do NOT think java is second to C - it's second to none. it's kwl. I hope that covers the lot. phew.

    I'm merely questioning the need for zero-base...

    Why? well thimal was kind enough to nail an origin of sorts so I have to think of something else to argue about haven't i?

    Thank god the resident grammar nazi is still under the influence of liquorice

     
  • At 4:08 PM, Blogger longdeparted said…

    Hmm, well, I personally think it doesn't matter, so long as there is consistency. (0-base, 1-base, 10-base, whatever).

    Also: it's probably unwise to consign a particular language to a niche function (C => compilers) because I have worked with a Java based grammar generator this year that is scarily good, just when I had thought that Java would never handle strings as well as others (Perl or even the lex/yacc combo).

    I sort of like the PHP method, which does not distinguish between arrays and hashes. Therefore, you can use either numeric indices or string based indices interchangeably.

     
  • At 4:43 PM, Blogger 88Pro said…

    I remember even Pascal supporting indices based on letters of the alphabet. But then again it's still not the same as String based indices I guess.

    type intArray = array['A'..'Z'] of integer; (not 100% sure about the syntax)

    Whats the point I was trying to make? (Don't ask me :-)

     
  • At 12:31 PM, Blogger gumz said…

    So in conclusion, we are all for associative arrays. And consistency of key values is as important as consistency in sexual orientation! And Dennis Ritchie (as well as Dijkstra) liked his zeros, and hence we are stuck with zero-based arrays.

     
  • At 1:35 PM, Blogger edg said…

    I think the "we" here constitute a very small minority.

    But don't you think that the argument for 0 based is far simpler than the sexual orientation of individuals?

     
  • At 2:02 PM, Blogger longdeparted said…

    Hmm, well.. having said that, the most immediate problem is that arrays and hashes do not perform the same for some operations, most notably traversals (hashes are not guaranteed to emit keys in the same order), so for iterations, a potentially expensive sort is also required.

    Other than that, yes, hashes are cool :)

     
  • At 3:16 PM, Blogger edg said…

    If we consider binary numbers, 8 bits would represent 0 - 255, not 1 - 256. To store 256 directly we'd need 9 bits. So then isn't it logical for the index to go from 0 to 255 rather than 1 to 256?

     
  • At 3:29 PM, Blogger gumz said…

    yah well, can't expect everything from hashes. I don't get it ed... you can still map 1 to 256 to 8 bits can't you? albeit with a bit of jugglery...

     
  • At 3:33 PM, Blogger gumz said…

    Java based grammar generator?

     
  • At 3:35 PM, Blogger edg said…

    Why do all that jugglery? I just read part of this, but the little I read gives me the idea that the amount of resources these guys had to work with was quite limited, no monster machines with 512MB RAM.

    Based on that, don't you think the jugglery would NOT be justified?

     
  • At 3:41 PM, Blogger 88Pro said…

    >>Java based grammar generator?

    Maybe he was talking about javacc

    https://javacc.dev.java.net/

    Senthoor

     
  • At 4:01 PM, Blogger gumz said…

    yeah, no to jugglery it is. BSD license, can we get the source for this?

     
  • At 4:26 PM, Blogger gumz said…

    Thx Senthoor, I was so totally unaware of this. Just got the source, ignore previous dumb question. Thimal, any particular reason you went for a java based one, rather than good 'ol lex/yacc?

     
  • At 9:41 PM, Blogger longdeparted said…

    because the rest of this was written in Java :)

    By the way, with some experience in the matter now, I am rather torn between JavaCC and Antlr.

     
  • At 10:36 PM, Blogger edg said…

    FreeBSD , although I know many Linux junkies, I have never met a BSD junky.

     
  • At 10:46 PM, Blogger longdeparted said…

    reluctantly raises his hand. Not recently though, I ran a couple of servers with 4.2

    In some ways, it's more consistent than Linux, but hardware support is --: remainder of sentence censored : -- At least Linux gets some decent driver support (some of the time)

     
  • At 2:41 AM, Blogger edg said…

    hmmm, I can remember the time I tried installing Red Hat Linux 4, way back in 1996~7 (i just wanted to see what it was).

    The manual said that if a driver is not available I should write it and if I cannot write it, I should not own a PC.

    Linux has come a long way from that I guess. While Linux has gone commercial has BSD remained strong in academia?

     
  • At 10:49 AM, Blogger gumz said…

    Colombo Uni ran a BSD server, dunno if its still around.
    Since we are on the subject of parser generators, any ideas on conversions of this type,

    java bytecode -> x86 assembly
    php bytecode -> x86 assembly
    java bytecode -> parrot

    basically, translation from one target machine to another.

    The only ways i can think of is,
    1. Hand code
    2. Use Parser for pattern matching

    Does anyone know of tools/techniques/resources targeted at this type of problem? Apart from Parsers that is.

     
  • At 2:36 PM, Blogger longdeparted said…

    Ed, not too much in academia either. I ran BSD out of curiosity, initially (because I heard it was rock solid) and I must admit I picked FreeBSD because I heard Yahoo was running it (true, dat). It's just a piece of anecdotal evidence, but I thought my cluster of BSD-Squids performed quite a bit better under heavy load than my competing cluster of Linux-Squids.

    This was though I didn't know anything much about BSD and did a default install, while I had twiddled and tweaked my RedHat for ages. (and I think some of those tweaks were good ones, even *grin*)

     
  • At 2:42 PM, Blogger longdeparted said…

    Gumz, just a suggestion that I can't back up yet (will try and post a followup later, but I have a meeting in 20 minutes [on the other side of campus])

    Perl2Exe works by wrapping an interpreter instance around code and running the script as an executable. Perhaps you should take a look at the following for ideas:

    Psyco (very very nice specializing compiler for Python)
    Perl2Exe and PerlApp (for Perl)
    Also: some encoder mechanisms for PHP like Turck or even the venerable Zend (but I am not sure if Zend is open source) to see how they pick apart the bytecode

    I operated under the assumption that this is for your Frankenstein..err, no Stella Artois err no.. "you know what I mean" project :)

     
  • At 5:53 PM, Blogger gumz said…

    Stella Artois? :) um ahem yah, something like that, I can't do much on it, see; it's not my project thingy. 'you know who' is dragging her feet over it see.

    Firstly, the execution model I'm interested in is,
    some intermediate code generator -> some bytecode -> Translator -> machine specific assembly

    Once you have the machine specific assembly you can use a standard Assembler and a linker of choice on it.

    I know, I know, this looks very micro softish.... but hey i believe it's worth looking at.

    What I'm really interested in is a generic Translator that will include Source Language schema + Target Language schema + translation scheme + register allocation/assignment strategy. umm if that makes any sense (i tend to be rather incoherent :( ).

    So, I'm not interested in execution of scripts in any 'ol way. There is a hack for the execution of PHP scripts in the same manner as Perl2Exe. It's neat and all, but it's a sneaky way of doing things.

    And Turck just takes the PHP bytecode generated by zend and caches it so that the script -> php bytecode step is skipped when a script is used.

    Zend does not generate any form of machine code, it just produces bytecode and shuttles it off to zend_execute which performs the VM function. And yes it's open source.

    Yes, Psyco is more like what i want, but it's very very hacky. This Perl script to translate java bytecode to parrot bytecode gives some good insight.

     
