It would be interesting to repeat this experiment with different

character encodings and see if using UTF-16 versus UTF-8 makes a

difference here.

With UTF-16 there are twice as many bytes to read, but the 50% magic

ratio still prevails.
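
The byte counts reported in the logs below (419,430,402 bytes for UTF-16 versus 209,715,200 bytes for UTF-8, both for 209,715,200 chars) are consistent with two bytes per char plus a two-byte byte order mark for UTF-16, and one byte per char for UTF-8. The following is only a small sketch of that arithmetic. It assumes the sample data is ASCII-range characters and that the file was written with Java's UTF-16 charset, neither of which the logs state outright. The class name EncodedSize and the sample string are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedSize {
    /** Number of bytes needed to encode s in the given charset. */
    static long encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Hypothetical ASCII-only sample; the real test file is random data.
        String sample = "abcdefgh";

        // UTF-8 uses one byte for each ASCII-range char.
        System.out.println("UTF-8  bytes: "
                + encodedLength(sample, StandardCharsets.UTF_8));

        // Java's UTF-16 encoder writes a 2-byte byte order mark,
        // then 2 bytes for each char in the Basic Multilingual Plane.
        System.out.println("UTF-16 bytes: "
                + encodedLength(sample, StandardCharsets.UTF_16));
    }
}
```

For the eight-char sample this gives 8 bytes in UTF-8 and 18 bytes in UTF-16, the same "double plus two" pattern as the file sizes in the logs.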

Best of five trials shown.

Using a random sample data file of 209,715,200 chars 419,430,402

bytes.

Using aggregate buffersize of 65,536 bytes.

Using charset UTF-16

BufferedReader backed with BufferedInputStream ratio 0.10 buffsize 65536 bytes 2.64 seconds

BufferedReader backed with BufferedInputStream ratio 0.20 buffsize 65536 bytes 2.63 seconds

BufferedReader backed with BufferedInputStream ratio 0.30 buffsize 65536 bytes 2.64 seconds

BufferedReader backed with BufferedInputStream ratio 0.40 buffsize 65536 bytes 2.64 seconds

BufferedReader backed with BufferedInputStream ratio 0.50 buffsize 65536 bytes 2.64 seconds <--

BufferedReader backed with BufferedInputStream ratio 0.60 buffsize 65536 bytes 2.68 seconds

BufferedReader backed with BufferedInputStream ratio 0.70 buffsize 65536 bytes 2.71 seconds

BufferedReader backed with BufferedInputStream ratio 0.80 buffsize 65536 bytes 2.79 seconds

BufferedReader backed with BufferedInputStream ratio 0.90 buffsize 65536 bytes 2.70 seconds

HunkIO 2.70 seconds

Using a random sample data file of 209,715,200 chars 209,715,200

bytes.

Using aggregate buffersize of 65,536 bytes.

Using charset UTF-8

HunkIO 0.73 seconds

BufferedReader backed with BufferedInputStream ratio 0.10 buffsize 65536 bytes 0.89 seconds

BufferedReader backed with BufferedInputStream ratio 0.20 buffsize 65536 bytes 0.88 seconds

BufferedReader backed with BufferedInputStream ratio 0.30 buffsize 65536 bytes 0.88 seconds

BufferedReader backed with BufferedInputStream ratio 0.40 buffsize 65536 bytes 0.89 seconds

BufferedReader backed with BufferedInputStream ratio 0.50 buffsize 65536 bytes 0.83 seconds <--

BufferedReader backed with BufferedInputStream ratio 0.60 buffsize 65536 bytes 0.91 seconds

BufferedReader backed with BufferedInputStream ratio 0.70 buffsize 65536 bytes 0.92 seconds

BufferedReader backed with BufferedInputStream ratio 0.80 buffsize 65536 bytes 0.96 seconds

BufferedReader backed with BufferedInputStream ratio 0.90 buffsize 65536 bytes 0.92 seconds

There is something strange here. UTF-16 should be faster to convert to

Strings and, at worst, should take twice as long for physical I/O. Yet

the UTF-16 runs took roughly three times as long as the UTF-8 runs.
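
For readers who want to repeat the comparison, here is a minimal sketch of how one aggregate buffer budget might be split between a BufferedInputStream and a BufferedReader. It is not the benchmark code that produced the logs above. It assumes that the ratio is the fraction of the aggregate given to the BufferedInputStream (the logs do not say which side the ratio refers to) and that each char in the BufferedReader's buffer costs two bytes. The class name SplitBufferRead, its methods, and the tiny temporary sample file are all hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SplitBufferRead {
    /** Bytes given to the BufferedInputStream (assumed meaning of ratio). */
    static int streamBytes(int aggregateBytes, double ratio) {
        return Math.max(1, (int) (aggregateBytes * ratio));
    }

    /** Chars given to the BufferedReader, assuming 2 bytes per buffered char. */
    static int readerChars(int aggregateBytes, double ratio) {
        return Math.max(1, (aggregateBytes - streamBytes(aggregateBytes, ratio)) / 2);
    }

    /** Reads the whole file through both buffers and returns the char count. */
    static long countChars(Path file, Charset cs, int aggregateBytes, double ratio)
            throws IOException {
        try (InputStream in = new BufferedInputStream(
                     Files.newInputStream(file), streamBytes(aggregateBytes, ratio));
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, cs), readerChars(aggregateBytes, ratio))) {
            char[] chunk = new char[8192];
            long total = 0;
            int n;
            while ((n = reader.read(chunk, 0, chunk.length)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        // Small stand-in for the 200 MB random sample file used in the logs.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("sample line ").append(i).append('\n');
        }
        Path file = Files.createTempFile("sample", ".txt");
        try {
            Files.write(file, sb.toString().getBytes(StandardCharsets.UTF_16));
            for (int tenths = 1; tenths <= 9; tenths++) {
                double ratio = tenths / 10.0;
                long start = System.nanoTime();
                long chars = countChars(file, StandardCharsets.UTF_16, 65_536, ratio);
                double seconds = (System.nanoTime() - start) / 1e9;
                System.out.printf("ratio %.2f: %d chars in %.3f seconds%n",
                        ratio, chars, seconds);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A single pass over a small file like this is dominated by JIT warm-up and file caching, so any timings it prints are only indicative. A fair comparison would repeat each ratio several times on a large file and keep the best run, as the logs above do.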