Plotting 32 byte hex numbers
How big can the numbers be that I plot?
I have a bunch of 32 byte HEX numbers I need to plot.
Can I plot them or will they be too big?
I know I will need to convert them to decimal.
example:
3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A
2 Comments
James Tursa
on 23 Feb 2018
Edited: James Tursa
on 23 Feb 2018
Plot in what way? What type of plot would you expect for your example above? What do these hex values represent?
Jan
on 24 Feb 2018
[MOVED from section for answers - please post comments as comments]
George Davey wrote: They are 32 byte hashes and I want to plot them to look for value densities on a 1D plot.
Answers (5)
Jan
on 24 Feb 2018
Edited: Jan
on 24 Feb 2018
As far as I understand, you want to compute 10^8 hash values with 256 bits each, convert them to decimals, and draw them as pixels on a line (or rectangle?) to check whether they are equally distributed. To make the points distinguishable, neighboring hash values need a distance of about 0.1 mm. To draw all possible hash values on paper, you need a dimension of 0.0001 * 2^(8*32) m. This is 11.6 * 10^72 m. For comparison: the distance from earth to sun is only 149.6 * 10^9 m, our galaxy has a diameter of 100'000 light years of 9.46 * 10^15 m each, and the observable universe has a diameter of 4.4 * 10^26 m. This means that even a resolution on the subatomic scale and an intergalactic display are not sufficient to detect any clustering in the least significant 10 bytes.
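The arithmetic can be reproduced in a few lines. This is only a sketch of the estimate: the 0.1 mm spacing and the universe diameter are the figures quoted in this post, and the variable names are made up for illustration.

```matlab
% Length of a line that gives every possible 32 byte value its own 0.1 mm slot.
spacing_m  = 1e-4;          % 0.1 mm between neighbouring values (figure from the post)
num_values = 2^(8*32);      % 2^256, an exact power of two in a double
line_len_m = spacing_m * num_values;

universe_m = 4.4e26;        % diameter of the observable universe (figure from the post)
ratio      = line_len_m / universe_m;

fprintf('required line length: %.3g m\n', line_len_m);
fprintf('observable universes needed end to end: %.3g\n', ratio);
```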
Plotting the hash values on a 42'' or 600'' display means massive rounding and a very rough view of the actual data. There is no need to use more than 15 significant decimal digits for this display, because even that is far finer than anything you can plot on any existing paper or screen.
Am I really missing the point here?
I do not understand how you want to draw the 2nd dimension. You wrote "the y coordinate will be 100000000000000000000 for all to scale". If all y values are the same, why not use 1.0? But even with a 2D plot the universe will be too small for a meaningful resolution.
The mentioned Diehard, Dieharder and TestU01 tests were created to assess the quality of pseudo-random number generators. They try to detect dependencies between subsequent values, bit patterns, unequal distributions and clustering. They are designed to check a stream of 32 bit values only, and I do not know how to apply them to 256 bit hashes. Hash functions can be compared with pseudo-random number generators, so you can e.g. use AES as an RNG.
Testing only 10^8 hash values out of about 10^77 possibilities gives an extremely tiny view of the full set. If you find a cluster of 1000 hash values, this has only very weak explanatory power concerning the equal distribution of the complete set of hashes.
0 Comments
Jan
on 24 Feb 2018
Edited: Jan
on 24 Feb 2018
It is still not clear what you want to plot. Do you want to convert the hex string to an integer value in the range [0, 2^(8*32)-1] and then plot it? The upper limit is about 1.16e77. Of course you can plot this. Simply try it:
y = hex2dec('3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A');
plot(y, '*');
The question is what you can see in this plot. If you want to check whether the hash values are equally distributed, a plot lets you judge this for the first 2 or 3 bytes only. In any case, such huge numbers cannot be converted accurately to doubles, which hold only about 15 significant decimal digits (a 53 bit mantissa).
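One way to stay inside the exact range of a double is to keep only the leading hex digits and scale them to [0, 1). The following is a rough sketch, not a tested recipe: the cell array hashes holds just the example from the question, and the choice of 13 digits (52 bits) is an assumption that keeps hex2dec exact.

```matlab
% Map each hash to [0, 1) using only its leading hex digits.
hashes  = {'3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A'};
nDigits = 13;                                  % 13 hex digits = 52 bits, exact in a double

lead = cellfun(@(h) h(1:nDigits), hashes, 'UniformOutput', false);
x    = hex2dec(lead) / 16^nDigits;             % column vector of values in [0, 1)

% 1D plot: all points on one horizontal line.
plot(x, zeros(size(x)), '.');
xlim([0 1]);
```

The remaining 51 hex digits are dropped deliberately, because they could not influence the position of a dot on any real display.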
Please explain which problem you want to solve. I assume there is a better solution than plotting.
0 Comments
George Davey
on 24 Feb 2018
Edited: George Davey
on 24 Feb 2018
1 Comment
Jan
on 24 Feb 2018
A line with "no clumping", but 1e8 points? Think twice. Nobody will be able to examine this line in detail. Either the visualization is too rough to reveal the details, or it contains far too many details to be understood by a human. I assume you can observe "clumping" for about 1000 or 10'000 dots, but not for 100 million.
If you convert a 32 byte value to a double, you must observe a non-uniform distribution, because only about 15 significant digits are stored in a double. Massive rounding effects therefore remove all information from the less significant bits. A conversion to decimal will not be useful.
It will be more useful to examine the density distribution of the hash values statistically. The Diehard test might be useful, or Dieharder.
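As a much simpler stand-in for such test suites (this is not Diehard), a chi-square statistic over equal-width bins already gives a number instead of a picture. The sketch below is an assumption-laden illustration: x is placeholder data from rand, standing in for hash values scaled to [0, 1), and the bin count of 256 corresponds to looking at the leading byte.

```matlab
% Placeholder data: replace x with the hash values scaled to [0, 1).
rng(0);                                   % reproducible stand-in data
x     = rand(1e5, 1);
nBins = 256;                              % one bin per value of the leading byte

counts   = histcounts(x, linspace(0, 1, nBins + 1));
expected = numel(x) / nBins;              % count per bin under a uniform distribution
chi2     = sum((counts - expected).^2 / expected);
dof      = nBins - 1;

% For uniform data chi2 should be near dof, within a few multiples of sqrt(2*dof).
fprintf('chi2 = %.1f, dof = %d, sqrt(2*dof) = %.1f\n', chi2, dof, sqrt(2*dof));
```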
Walter Roberson
on 24 Feb 2018
"What my question is will I be able to convert a 32 byte integer to decimal with matlab? "
Yes, it is possible using the Symbolic Toolbox.
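A minimal sketch of one way to do this, assuming the Symbolic Math Toolbox is installed. It builds the value digit by digit with exact symbolic arithmetic; the variable names are only for illustration, and there may be shorter routes.

```matlab
% Exact hex-to-decimal conversion of a 32 byte value using symbolic integers.
hexStr  = '3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A';
nibbles = hex2dec(hexStr(:));     % one value 0..15 per hex digit (64x1 double)

v = sym(0);
for k = 1:numel(nibbles)
    v = v*16 + nibbles(k);        % Horner scheme, exact because v is symbolic
end

decStr = char(v);                 % full decimal representation as text
disp(decStr);
```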
However, that is a different question than whether you can plot those in any useful sense.
For 10^8 pixels you might want to consider creating a 10000 x 10000 binary image and printing that.
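A sketch of how such an image could be filled, under the assumption that each hash is first reduced to its leading 13 hex digits and scaled to [0, 1). The mapping from value to pixel is a choice made here for illustration, and hashes contains only the example from the question.

```matlab
% Mark one pixel per hash in a side x side binary image.
hashes = {'3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A'};
side   = 10000;
N      = side^2;                              % 1e8 pixels, about 100 MB as logical

lead = cellfun(@(h) h(1:13), hashes, 'UniformOutput', false);
x    = hex2dec(lead) / 16^13;                 % values in [0, 1)

img      = false(side);
idx      = min(floor(x * N) + 1, N);          % linear pixel index in 1..N
img(idx) = true;
```

The result can then be written to disk with imwrite(img, 'hashes.png'), where the file name is just an example. Hashes that fall into the same pixel are merged, so with 10^8 hashes and 10^8 pixels many collisions are expected even for perfectly uniform data.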
You might also want to do statistical analyses, such as bar charts of the number of entries that are divisible by each of a set of primes (you might want to normalize by the number of entries expected if the values were random).
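Divisibility can be checked without any big-number support by reducing the hex digits modulo the prime as you go. The following sketch assumes the hashes are stored as a cell array of hex strings; it uses only the example from the question, and the prime list is arbitrary.

```matlab
% Count how many hashes are divisible by each prime, working digit by digit.
hashes       = {'3BA3EDFD7A7B12B27AC72C3E67768F617FC81BC3888A51323A9FB8AA4B1E5E4A'};
primesToTest = primes(30);                    % 2, 3, 5, ..., 29
counts       = zeros(size(primesToTest));

for i = 1:numel(hashes)
    nibbles = hex2dec(hashes{i}(:));          % one value 0..15 per hex digit
    for j = 1:numel(primesToTest)
        p = primesToTest(j);
        r = 0;
        for k = 1:numel(nibbles)
            r = mod(r*16 + nibbles(k), p);    % remainder stays small, so it is exact
        end
        counts(j) = counts(j) + (r == 0);
    end
end

expected = numel(hashes) ./ primesToTest;     % about 1/p of random values are divisible by p
bar(primesToTest, counts ./ expected);        % ratios near 1 are what random data would give
```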
6 Comments
Walter Roberson
on 24 Feb 2018
No, Excel uses IEEE 754 double precision, so you cannot use unlimited values in Excel.
Walter Roberson
on 25 Feb 2018
It is not possible to plot() a value beyond realmax('double'), which is roughly 1.79769313486232e+308. If you try to plot() a sym number, MATLAB tries to convert it to double for plotting.
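For scale, a 32 byte value stays far below that limit, so the magnitude itself is not the obstacle. What is lost in the conversion is resolution, as this small sketch shows (variable names are illustrative):

```matlab
% Magnitude versus resolution of a 32 byte value stored as a double.
hashMax      = 2^256;               % upper bound of a 32 byte value, about 1.16e77
fitsInDouble = hashMax < realmax;   % true: plot() can handle the magnitude

gap = eps(2^255);                   % spacing of doubles in [2^255, 2^256), equal to 2^203
fprintf('fits below realmax: %d\n', fitsInDouble);
fprintf('neighbouring doubles near 2^255 differ by %.3g\n', gap);
```

All hash values that fall between two neighbouring doubles are rounded to the same number.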
George Davey
on 24 Feb 2018
5 Comments
Jan
on 26 Feb 2018
Edited: Jan
on 26 Feb 2018
The HEX string is an established representation of long numbers already. Sorry for repeating myself another time: If you plot standard IEEE754 doubles with about 15 significant digits on a 600'' (15.24 m) paper, you get 6.56 * 10^10 different values per mm. This is much more than any printer can address or a human eye could register. Therefore a higher precision than the conversion to doubles is not needed. hex2dec is sufficient already, even if you print it on something of the size of an average imperial class-II star destroyer.
For a visual test it is enough to work with a precision that can be visualized and visually perceived. 10 dots per mm seems to be a limit, so on a 15.24 m display you can plot 1.524e5 dots. About the highest 17 bits can be visualized, while the rest of the 256 bits of this hash is rounded away. You will not see a significantly larger precision than on an A4 sheet. Therefore I would save the money for the huge plotter.
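The number of visible bits follows directly from the number of distinguishable dots. Here is a sketch of the estimate, assuming the 10 dots per mm quoted above and the 297 mm long side of an A4 sheet:

```matlab
% How many leading bits of a hash can a plot of a given width resolve?
dotsPerMm = 10;                        % assumed limit of what can be printed and seen

widthBigMm = 15.24e3;                  % 600 inch display
nDotsBig   = dotsPerMm * widthBigMm;   % 1.524e5 dots
bitsBig    = log2(nDotsBig);           % roughly 17 bits

widthA4Mm = 297;                       % long side of an A4 sheet
bitsA4    = log2(dotsPerMm * widthA4Mm);

fprintf('600 inch display: %.1f bits, A4 sheet: %.1f bits\n', bitsBig, bitsA4);
```

By this estimate the 600'' display gains fewer than 6 bits over the A4 sheet.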
Walter Roberson
on 26 Feb 2018
Ah, but if you are going that big, you get a certain economy of scale ;-)