reading exponentials

Martin Baker wrote:

> My question is: what is the most efficient way to read in large arrays of
> numbers? (such as a very big IndexedFaceSet)

> I get the impression that the most 'architecturally sound' way to do
> formatted I/O is to use the java.text package.

Not really. I only ever use this when writing data out, not reading.

> So I tend to read the file one line at a time, use StringTokenizer to split
> it up, then convert to a double floating point number using something like:
>
> value = (new Double(s)).doubleValue();

Yuck. You are better off using StreamTokenizer (in java.io) and just
splitting it based on the separator char - assuming linefeeds are not
important. Here is one method of doing it:

import java.io.*;

InputStream is = // however you open it.

double[] data = new double[some_size];

// Wrap the stream in a Reader; the StreamTokenizer(InputStream)
// constructor is deprecated.
StreamTokenizer strtok =
    new StreamTokenizer(new BufferedReader(new InputStreamReader(is)));
strtok.eolIsSignificant(false);
strtok.parseNumbers();
strtok.whitespaceChars(',', ','); // set up comma delimited if needed
strtok.commentChar('#');          // set up # as a comment char if needed

int i = 0;
int type;

while((type = strtok.nextToken()) != StreamTokenizer.TT_EOF)
{
    switch(type)
    {
        case StreamTokenizer.TT_NUMBER:
            data[i++] = strtok.nval;
            break;

        case StreamTokenizer.TT_WORD:
            // do something or toss it away
            break;

        // case .....: further token types as needed
    }
}

Alternatively, there is also a static Double.parseDouble() method that
takes a String. This saves on allocation, since no Double object is created.

try
{
    double dbl = Double.parseDouble(s);
}
catch(NumberFormatException nfe)
{
    // s is not a valid number; skip it or report it
}
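
One thing worth checking before relying on the StreamTokenizer loop for this kind of data: as far as I know, parseNumbers() only understands plain decimals, so a value in exponent notation comes back as a number followed by a word, whereas Double.parseDouble() accepts it whole. The sketch below just makes that difference visible. The class and method names (ExponentDemo, describe) are made up for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ExponentDemo {
    /** Lists the tokens a default-configured StreamTokenizer produces for text. */
    static String describe(String text) {
        // The default syntax table already has number parsing switched on.
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        StringBuilder out = new StringBuilder();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.append("NUMBER(").append(st.nval).append(") ");
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.append("WORD(").append(st.sval).append(") ");
                } else {
                    out.append("CHAR(").append((char) st.ttype).append(") ");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(describe("1.08e43"));            // tokenizer view
        System.out.println(Double.parseDouble("1.08e43"));  // parsed as one value
    }
}
```

For the input 1.08e43 the tokenizer reports a number token 1.08 followed by the word e43, while Double.parseDouble() returns the full value.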

--
Justin Couch Author, Java Hacker
Snr Software Engineer couch@ccis.adisys.com.au
ADI Ltd, Systems Group http://www.vlc.com.au/~justin/
Java3D FAQ: http://tintoy.ncsa.uiuc.edu/~srp/java3d/faq.html

----------------------------------------------------------------------

The other way around this problem using the existing StreamTokenizer class
is to return the raw token as text and perform your own conversion of the
token rather than letting its internal routines do that for you, i.e. don't
call the parseNumbers() method. Your other choice is to call parseNumbers()
and, during tokenization, if you get a token with TT_WORD type, try
to parse it using the Double routines and see if you get a valid number. If
you do, use it. It is inelegant but it should work.
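
A minimal sketch of the first option (raw tokens, own conversion), under a few assumptions. StreamTokenizer's constructor turns number parsing on by default, so simply not calling parseNumbers() is not enough; the sketch calls resetSyntax() and rebuilds the syntax table so that digits, sign, '.' and letters all count as word characters. RawTokenReader and readDoubles are hypothetical names, and the comma and '#' handling is carried over from the earlier example in this thread. Note that each token still arrives as a new String in sval, so this does not meet the "no allocation inside the loop" goal.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class RawTokenReader {
    /** Stores the numbers found in the reader into data; returns how many were stored. */
    static int readDoubles(Reader in, double[] data) {
        StreamTokenizer st = new StreamTokenizer(in);
        st.resetSyntax();              // clear the defaults, including number parsing
        st.wordChars('0', '9');        // digits are now ordinary word characters
        st.wordChars('a', 'z');        // letters cover the 'e' of the exponent
        st.wordChars('A', 'Z');
        st.wordChars('+', '+');
        st.wordChars('-', '-');
        st.wordChars('.', '.');
        st.whitespaceChars(0, ' ');
        st.whitespaceChars(',', ',');  // comma-delimited input
        st.commentChar('#');           // '#' starts a comment to end of line

        int count = 0;
        try {
            while (count < data.length
                    && st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_WORD) {
                    continue;          // stray punctuation; ignore
                }
                try {
                    data[count] = Double.parseDouble(st.sval);
                    count++;
                } catch (NumberFormatException nfe) {
                    // a word that is not a number (e.g. a keyword); skip it
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        double[] data = new double[8];
        int n = readDoubles(
            new StringReader("1.08e43, -35. 2.5E-3 # comment\n7 foo"), data);
        System.out.println(n + " values read");
    }
}
```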

Roberto Speranza
President, Dot Internet Solutions Inc.
mailto:robert@dotinc.net
http://www.dotinc.net/
----- Original Message -----
From: Steve Pietrowicz <srp@NCSA.UIUC.EDU>
To: <JAVA3D-INTEREST@JAVA.SUN.COM>
Sent: November 15, 1999 5:21 PM
Subject: Re: [JAVA3D] reading arrays of exponential floating point numbers

> I ran into this problem a long time ago. StreamTokenizer can't be subclassed,
> because of the way the class is set up. A bug has been submitted to Sun about
> this, but I haven't heard anything. It's been over a year.
>
> Anyway, my ReaderTokenizer class reads in numbers in exponential floating
> point, and works just like StreamTokenizer (although I've added some
> extensions, like hexadecimal). It doesn't allocate memory inside the loop
> (unless it goes over the internal buffer size, which is pretty large, and
> something I've never seen happen) or use deprecated classes. We use it for
> most of our model loaders. See the .signature file for the location of NCSA
> Portfolio. Hope this helps, and would be interested in hearing if this works
> for you (or if it doesn't).
>
> Steve

---------------------------------------------------------------------------

I ran into this problem a long time ago. StreamTokenizer can't be subclassed,
because of the way the class is set up. A bug has been submitted to Sun about
this, but I haven't heard anything. It's been over a year.

Anyway, my ReaderTokenizer class reads in numbers in exponential floating
point, and works just like StreamTokenizer (although I've added some
extensions, like hexadecimal). It doesn't allocate memory inside the loop
(unless it goes over the internal buffer size, which is pretty large, and
something I've never seen happen) or use deprecated classes. We use it for
most of our model loaders. See the .signature file for the location of NCSA
Portfolio. Hope this helps, and would be interested in hearing if this works
for you (or if it doesn't).

Steve

Martin Baker wrote:

> I have a supplementary question to a previous question about reading data.
>
> My question is: what is the most efficient way to read in large arrays of
> numbers? (such as a very big IndexedFaceSet)
>
> I think it's relevant to this list because the loading time is proving a
> limitation on the size of mesh that I can use. I'd appreciate any help in
> speeding up the loading of my program at www.euclideanspace.com
>
> I get the impression that the most 'architecturally sound' way to do
> formatted I/O is to use the java.text package. But that's no good because
> it's hopelessly inefficient and does not support exponential floating point
> numbers. (Are the java.text programmers aware of the needs of Java3D? Is it
> on their wish list?)
>
> So I tend to read the file one line at a time, use StringTokenizer to split
> it up, then convert to a double floating point number using something like:
>
> value = (new Double(s)).doubleValue();
>
> But this is very slow because it does a memory allocation for every number
> read, so when I read in a big IFS the program slows down and starts garbage
> collecting and I have to wait ages for it to load.
>
> Would anyone like to suggest the fastest code to load an array of floating
> point numbers, which does not allocate memory inside the loop, or use
> deprecated classes?
>
> Martin

--
Steve Pietrowicz - srp@ncsa.uiuc.edu

NCSA Portfolio 1.3 beta 3: http://www.ncsa.uiuc.edu/~srp/Java3D/portfolio/
New Loaders, turn your Canvas3D into a JPEG, new InputDevices and more!
Freely available for non-commercial use!
You Build It VR: http://www.ncsa.uiuc.edu/~srp/Java3D/YouBuildItVR/
Build your own multi-user virtual worlds with no programming experience!
The Java3D FAQ: http://tintoy.ncsa.uiuc.edu/~srp/java3d/faq.html
Astro3D: Java 3D Astronomy - http://www.ncsa.uiuc.edu/~srp/Java3D/Astro3D/

-------------------------------------------------------------------------

At 10:26 PM 11/15/99 -0000, you wrote:
>please could you explain the following line of your code to me -
>n = // brute-force parse perl regex /-?\d*\.\d*[eE]\d*/
>
>I guess it's Perl, what is the Java equivalent?

"an optional '-' sign, followed by zero or more digits, followed by
a '.', followed by zero or more digits, followed by either 'e' or 'E',
followed by zero or more digits."

e.g. 1.08e43
     2323234234234.23
     -35.

Actually, now that I think about it, it needs to be a bit more
subtle; you can leave out the '.' and everything following it.

The idea is to read that and compute the floating point number.
It's not that hard. -Tim
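
For what it's worth, here is a rough sketch of what such a brute-force parse could look like in Java. It is not Tim's code (which isn't shown in this thread); SimpleDoubleParser and parse are invented names. It follows the pattern described above with the '.' part and the exponent both optional, and additionally allows a sign on the exponent, which is an assumption beyond the regex. It works directly on a char buffer, so nothing is allocated per number. The result may differ from Double.parseDouble() in the last bits for long mantissas or large exponents.

```java
class SimpleDoubleParser {
    /**
     * Parses buf[start..end) as  -?digits[.digits][(e|E)[+|-]digits].
     * Returns Double.NaN if the span does not match that shape.
     */
    static double parse(char[] buf, int start, int end) {
        int i = start;
        boolean negative = false;
        if (i < end && buf[i] == '-') {
            negative = true;
            i++;
        }

        double mantissa = 0.0;
        int digits = 0;   // total mantissa digits seen
        int scale = 0;    // how many of them came after the '.'
        while (i < end && buf[i] >= '0' && buf[i] <= '9') {
            mantissa = mantissa * 10.0 + (buf[i] - '0');
            digits++;
            i++;
        }
        if (i < end && buf[i] == '.') {
            i++;
            while (i < end && buf[i] >= '0' && buf[i] <= '9') {
                mantissa = mantissa * 10.0 + (buf[i] - '0');
                digits++;
                scale++;
                i++;
            }
        }
        if (digits == 0) {
            return Double.NaN;           // no digits at all, e.g. "-" or "."
        }

        int exponent = 0;
        if (i < end && (buf[i] == 'e' || buf[i] == 'E')) {
            i++;
            boolean expNegative = false;
            if (i < end && (buf[i] == '+' || buf[i] == '-')) {
                expNegative = (buf[i] == '-');
                i++;
            }
            int expDigits = 0;
            while (i < end && buf[i] >= '0' && buf[i] <= '9') {
                if (exponent < 10000) {  // saturate to avoid int overflow
                    exponent = exponent * 10 + (buf[i] - '0');
                }
                expDigits++;
                i++;
            }
            if (expDigits == 0) {
                return Double.NaN;       // "1e" with nothing after it
            }
            if (expNegative) {
                exponent = -exponent;
            }
        }
        if (i != end) {
            return Double.NaN;           // trailing junk
        }

        // Divide for negative powers: small powers of ten are exact doubles,
        // so dividing by them rounds better than multiplying by 0.1, 0.01, ...
        int power = exponent - scale;
        double value = (power >= 0)
            ? mantissa * Math.pow(10.0, power)
            : mantissa / Math.pow(10.0, -power);
        return negative ? -value : value;
    }
}
```

A caller would fill a char[] from a Reader, find the start and end of each token, and pass those offsets to parse() without ever building a String.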