public final class MurmurHash3 extends Object
MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup. The name comes from two basic operations, multiply (MU) and rotate (R), used in its inner loop. Unlike cryptographic hash functions, it is not specifically designed to be difficult to reverse by an adversary, making it unsuitable for cryptographic purposes.
This class contains a Java port of the 32-bit hash function MurmurHash3_x86_32 and the 128-bit hash function MurmurHash3_x64_128 from Austin Appleby's original C++ code in SMHasher.
This is public domain code with no copyrights. From the home page of SMHasher:
"All MurmurHash versions are public domain software, and the author disclaims all copyright to their code."
The original adaptation is from Apache Hive. That adaptation contains hash64 methods that are not part of the original MurmurHash3 code. Using these methods is not recommended; they will be removed in a future release. To obtain a 64-bit hash, use half of the bits from the hash128x64 methods with the input data converted to bytes.
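The recommendation above (convert the input to bytes, compute a 128-bit hash, keep one 64-bit half) can be sketched as follows. This is only an illustration: the class name `Hash64FromHash128` and the `hash128` parameter are hypothetical, and the parameter stands in for a method reference such as `MurmurHash3::hash128x64` so that the sketch runs without the library on the classpath.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

final class Hash64FromHash128 {

    /**
     * Hashes a long to 64 bits by keeping the first half of a 128-bit hash.
     * The hash128 argument is a stand-in (assumption): in real code it would
     * be a reference to hash128x64(byte[]).
     */
    static long hash64(long value, Function<byte[], long[]> hash128) {
        // Convert the input to bytes; ByteBuffer writes big-endian by default.
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
        // Keep one of the two 64-bit halves; index 0 is used here.
        return hash128.apply(bytes)[0];
    }

    public static void main(String[] args) {
        // Placeholder 128-bit function so the sketch runs on its own.
        // It is NOT a hash; it only echoes the input bytes back.
        Function<byte[], long[]> placeholder =
            b -> new long[] {ByteBuffer.wrap(b).getLong(), 0L};
        System.out.println(hash64(42L, placeholder));
    }
}
```

Either half of the 128-bit result could be kept; what matters is that callers pick one half consistently.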
Modifier and Type | Class | Description |
---|---|---|
static class | MurmurHash3.IncrementalHash32 | Deprecated. Use IncrementalHash32x86. This corrects the processing of trailing bytes. |
static class | MurmurHash3.IncrementalHash32x86 | Generates 32-bit hash from input bytes. |
Modifier and Type | Field | Description |
---|---|---|
static int | DEFAULT_SEED | A default seed to use for the murmur hash algorithm. |
static long | NULL_HASHCODE | Deprecated. This is not used internally and will be removed in a future release. |
Modifier and Type | Method | Description |
---|---|---|
static long[] | hash128(byte[] data) | Generates 128-bit hash from the byte array with a default seed. |
static long[] | hash128(byte[] data, int offset, int length, int seed) | Deprecated. Use hash128x64(byte[], int, int, int). This corrects the seed initialization. |
static long[] | hash128(String data) | Deprecated. Use hash128x64(byte[]) using the bytes returned from String.getBytes(java.nio.charset.Charset). |
static long[] | hash128x64(byte[] data) | Generates 128-bit hash from the byte array with a seed of zero. |
static long[] | hash128x64(byte[] data, int offset, int length, int seed) | Generates 128-bit hash from the byte array with the given offset, length and seed. |
static int | hash32(byte[] data) | Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes. |
static int | hash32(byte[] data, int length) | Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes. |
static int | hash32(byte[] data, int length, int seed) | Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes. |
static int | hash32(byte[] data, int offset, int length, int seed) | Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes. |
static int | hash32(long data) | Generates 32-bit hash from a long with a default seed value. |
static int | hash32(long data, int seed) | Generates 32-bit hash from a long with the given seed. |
static int | hash32(long data1, long data2) | Generates 32-bit hash from two longs with a default seed value. |
static int | hash32(long data1, long data2, int seed) | Generates 32-bit hash from two longs with the given seed. |
static int | hash32(String data) | Deprecated. Use hash32x86(byte[], int, int, int) with the bytes returned from String.getBytes(java.nio.charset.Charset). This corrects the processing of trailing bytes. |
static int | hash32x86(byte[] data) | Generates 32-bit hash from the byte array with a seed of zero. |
static int | hash32x86(byte[] data, int offset, int length, int seed) | Generates 32-bit hash from the byte array with the given offset, length and seed. |
static long | hash64(byte[] data) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]). |
static long | hash64(byte[] data, int offset, int length) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[], int, int, int). |
static long | hash64(byte[] data, int offset, int length, int seed) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[], int, int, int). |
static long | hash64(int data) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the int. |
static long | hash64(long data) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the long. |
static long | hash64(short data) | Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the short. |
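@Deprecated public static final long NULL_HASHCODE
Deprecated. This is not used internally and will be removed in a future release.
public static final int DEFAULT_SEED
A default seed to use for the murmur hash algorithm. Has the value 104729.
public static int hash32(long data1, long data2)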
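Generates 32-bit hash from two longs with a default seed value. This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    int hash = MurmurHash3.hash32x86(ByteBuffer.allocate(16)
                                               .putLong(data1)
                                               .putLong(data2)
                                               .array(), offset, 16, seed);
data1
- The first long to hash
data2
- The second long to hash
See also: hash32x86(byte[], int, int, int)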
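The equivalence above depends on how `ByteBuffer` lays out the two longs. A minimal sketch, using only the JDK, of the byte array that would be passed to the hash; the class and method names are illustrative, not part of this API:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class LongsToBytes {

    /** Packs two longs into 16 bytes, most significant byte of each long first. */
    static byte[] toBytes(long data1, long data2) {
        // A freshly allocated ByteBuffer uses big-endian byte order.
        return ByteBuffer.allocate(16).putLong(data1).putLong(data2).array();
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(0x0102030405060708L, 0x090A0B0C0D0E0F10L);
        // Prints the 16 bytes in the order they would be hashed.
        System.out.println(Arrays.toString(bytes));
    }
}
```

Changing the buffer's byte order would change the bytes and therefore the hash, so the default order has to be kept to reproduce the helper's result.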
public static int hash32(long data1, long data2, int seed)
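Generates 32-bit hash from two longs with the given seed. This is a helper method that will produce the same result as:
    int offset = 0;
    int hash = MurmurHash3.hash32x86(ByteBuffer.allocate(16)
                                               .putLong(data1)
                                               .putLong(data2)
                                               .array(), offset, 16, seed);
data1
- The first long to hash
data2
- The second long to hash
seed
- The initial seed value
See also: hash32x86(byte[], int, int, int)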
public static int hash32(long data)
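Generates 32-bit hash from a long with a default seed value. This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    int hash = MurmurHash3.hash32x86(ByteBuffer.allocate(8)
                                               .putLong(data)
                                               .array(), offset, 8, seed);
data
- The long to hash
See also: hash32x86(byte[], int, int, int)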
public static int hash32(long data, int seed)
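Generates 32-bit hash from a long with the given seed. This is a helper method that will produce the same result as:
    int offset = 0;
    int hash = MurmurHash3.hash32x86(ByteBuffer.allocate(8)
                                               .putLong(data)
                                               .array(), offset, 8, seed);
data
- The long to hash
seed
- The initial seed value
See also: hash32x86(byte[], int, int, int)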
@Deprecated public static int hash32(byte[] data)
Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    int hash = MurmurHash3.hash32(data, offset, data.length, seed);
This implementation contains a sign-extension bug in the finalization step of any bytes left over from dividing the length by 4. This manifests if any of these bytes are negative.
data
- The input byte array
See also: hash32(byte[], int, int, int)
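The sign-extension problem described above comes from Java's `byte` being signed, whereas the C++ code reads the trailing bytes as unsigned values. The following sketch only illustrates that effect; it is not the library's source, and the class and method names are hypothetical.

```java
final class TailSignExtension {

    /** Combines two leftover bytes without masking; a negative byte sign-extends. */
    static int combineSigned(byte b0, byte b1) {
        return (b1 << 8) ^ b0;
    }

    /** Masks each byte to its unsigned value before combining. */
    static int combineUnsigned(byte b0, byte b1) {
        return ((b1 & 0xff) << 8) ^ (b0 & 0xff);
    }

    public static void main(String[] args) {
        byte b0 = (byte) 0xFF; // negative when read as a Java byte
        byte b1 = 0x01;
        System.out.println(Integer.toHexString(combineSigned(b0, b1)));   // fffffeff
        System.out.println(Integer.toHexString(combineUnsigned(b0, b1))); // 1ff
    }
}
```

When every leftover byte is non-negative the two variants agree, which is why the difference only shows up for inputs with negative trailing bytes.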
@Deprecated public static int hash32(String data)
Deprecated. Use hash32x86(byte[], int, int, int) with the bytes returned from String.getBytes(java.nio.charset.Charset). This corrects the processing of trailing bytes.
Before 1.14 the string was converted using the default encoding. Since 1.14 the string is converted to bytes using UTF-8 encoding.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
    int hash = MurmurHash3.hash32(bytes, offset, bytes.length, seed);
This implementation contains a sign-extension bug in the finalization step of any bytes left over from dividing the length by 4. This manifests if any of these bytes are negative.
data
- The input string
See also: hash32(byte[], int, int, int)
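The encoding note above matters because the hash is computed over bytes, not characters. A small JDK-only sketch of how the same string yields different byte sequences under different charsets; the class and method names are illustrative only:

```java
import java.nio.charset.StandardCharsets;

final class CharsetMatters {

    /** Number of bytes the string occupies in UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes the string occupies in ISO-8859-1. */
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(utf8Length(s));   // 5: the accented letter takes two bytes
        System.out.println(latin1Length(s)); // 4: one byte per character
    }
}
```

Passing an explicit charset to `String.getBytes` keeps the hashed bytes, and so the hash, independent of the platform default.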
@Deprecated public static int hash32(byte[] data, int length)
Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    int hash = MurmurHash3.hash32(data, offset, length, seed);
This implementation contains a sign-extension bug in the finalization step of any bytes left over from dividing the length by 4. This manifests if any of these bytes are negative.
data
- The input byte array
length
- The length of array
See also: hash32(byte[], int, int, int)
@Deprecated public static int hash32(byte[] data, int length, int seed)
Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes.
This is a helper method that will produce the same result as:
    int offset = 0;
    int hash = MurmurHash3.hash32(data, offset, length, seed);
This implementation contains a sign-extension bug in the finalization step of any bytes left over from dividing the length by 4. This manifests if any of these bytes are negative.
data
- The input byte array
length
- The length of array
seed
- The initial seed value
See also: hash32(byte[], int, int, int)
@Deprecated public static int hash32(byte[] data, int offset, int length, int seed)
Deprecated. Use hash32x86(byte[], int, int, int). This corrects the processing of trailing bytes.
This is an implementation of the 32-bit hash function MurmurHash3_x86_32 from Austin Appleby's original MurmurHash3 C++ code in SMHasher.
This implementation contains a sign-extension bug in the finalization step of any bytes left over from dividing the length by 4. This manifests if any of these bytes are negative.
data
- The input byte array
offset
- The offset of data
length
- The length of array
seed
- The initial seed value
public static int hash32x86(byte[] data)
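Generates 32-bit hash from the byte array with a seed of zero. This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 0;
    int hash = MurmurHash3.hash32x86(data, offset, data.length, seed);
data
- The input byte array
See also: hash32x86(byte[], int, int, int)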
public static int hash32x86(byte[] data, int offset, int length, int seed)
Generates 32-bit hash from the byte array with the given offset, length and seed.
This is an implementation of the 32-bit hash function MurmurHash3_x86_32 from Austin Appleby's original MurmurHash3 C++ code in SMHasher.
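As a rough illustration of the algorithm named above, the following self-contained sketch follows the published MurmurHash3_x86_32 steps: mix each 4-byte little-endian block, mix up to three trailing bytes, then finalize. It is not the library's source; the class `Murmur3x86Sketch` and its method names are hypothetical, and it does no bounds checking.

```java
import java.nio.charset.StandardCharsets;

final class Murmur3x86Sketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    static int hash32(byte[] data, int offset, int length, int seed) {
        int h = seed;
        int nblocks = length >> 2;
        // Body: mix each 4-byte little-endian block into the hash.
        for (int i = 0; i < nblocks; i++) {
            int p = offset + (i << 2);
            int block = (data[p] & 0xff)
                      | ((data[p + 1] & 0xff) << 8)
                      | ((data[p + 2] & 0xff) << 16)
                      | ((data[p + 3] & 0xff) << 24);
            h ^= mix(block);
            h = Integer.rotateLeft(h, 13) * 5 + 0xe6546b64;
        }
        // Tail: up to 3 leftover bytes, each masked to avoid sign extension.
        int tail = offset + (nblocks << 2);
        int k = 0;
        switch (length & 3) {
            case 3: k ^= (data[tail + 2] & 0xff) << 16; // fall through
            case 2: k ^= (data[tail + 1] & 0xff) << 8;  // fall through
            case 1: k ^= data[tail] & 0xff;
                    h ^= mix(k);
                    break;
            default: break;
        }
        // Finalization: fold in the length, then avalanche the bits.
        h ^= length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Multiply, rotate, multiply: the per-block scrambling step. */
    private static int mix(int k) {
        k *= C1;
        k = Integer.rotateLeft(k, 15);
        k *= C2;
        return k;
    }

    public static void main(String[] args) {
        byte[] bytes = "foo".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hash32(bytes, 0, bytes.length, 0));
    }
}
```

The `& 0xff` masks in the tail are the part that the deprecated hash32 methods are described as getting wrong.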
data
- The input byte array
offset
- The offset of data
length
- The length of array
seed
- The initial seed value
@Deprecated public static long hash64(long data)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the long.
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. The method does not produce the same result as either half of the hash bytes from hash128x64(byte[]) with the same byte data from the long. This method will be removed in a future release.
Note: The sign extension bug in hash64(byte[], int, int, int) does not affect this result as the default seed is positive.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    long hash = MurmurHash3.hash64(ByteBuffer.allocate(8)
                                             .putLong(data)
                                             .array(), offset, 8, seed);
data
- The long to hash
See also: hash64(byte[], int, int, int)
@Deprecated public static long hash64(int data)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the int.
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. The method does not produce the same result as either half of the hash bytes from hash128x64(byte[]) with the same byte data from the int. This method will be removed in a future release.
Note: The sign extension bug in hash64(byte[], int, int, int) does not affect this result as the default seed is positive.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    long hash = MurmurHash3.hash64(ByteBuffer.allocate(4)
                                             .putInt(data)
                                             .array(), offset, 4, seed);
data
- The int to hash
See also: hash64(byte[], int, int, int)
@Deprecated public static long hash64(short data)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]) with the bytes from the short.
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. The method does not produce the same result as either half of the hash bytes from hash128x64(byte[]) with the same byte data from the short. This method will be removed in a future release.
Note: The sign extension bug in hash64(byte[], int, int, int) does not affect this result as the default seed is positive.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    long hash = MurmurHash3.hash64(ByteBuffer.allocate(2)
                                             .putShort(data)
                                             .array(), offset, 2, seed);
data
- The short to hash
See also: hash64(byte[], int, int, int)
@Deprecated public static long hash64(byte[] data)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[]).
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. The method does not produce the same result as either half of the hash bytes from hash128x64(byte[]) with the same byte data. This method will be removed in a future release.
Note: The sign extension bug in hash64(byte[], int, int, int) does not affect this result as the default seed is positive.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    long hash = MurmurHash3.hash64(data, offset, data.length, seed);
data
- The input byte array
See also: hash64(byte[], int, int, int)
@Deprecated public static long hash64(byte[] data, int offset, int length)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[], int, int, int).
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. The method does not produce the same result as either half of the hash bytes from hash128x64(byte[]) with the same byte data. This method will be removed in a future release.
Note: The sign extension bug in hash64(byte[], int, int, int) does not affect this result as the default seed is positive.
This is a helper method that will produce the same result as:
    int seed = 104729;
    long hash = MurmurHash3.hash64(data, offset, length, seed);
data
- The input byte array
offset
- The offset of data
length
- The length of array
See also: hash64(byte[], int, int, int)
@Deprecated public static long hash64(byte[] data, int offset, int length, int seed)
Deprecated. Not part of the MurmurHash3 implementation. Use half of the hash bytes from hash128x64(byte[], int, int, int).
This is not part of the original MurmurHash3 C++ implementation.
This is a Murmur3-like 64-bit variant. This method will be removed in a future release.
This implementation contains a sign-extension bug in the seed initialization. This manifests if the seed is negative.
This algorithm processes 8-byte chunks of data in a manner similar to the 16-byte chunks of data processed in the MurmurHash3 MurmurHash3_x64_128 method. However, the hash is not mixed with a hash chunk from the next 8 bytes of data. The method will not return the same value as the first or second 64 bits of the function hash128(byte[], int, int, int).
Use of this method is not advised. Use the first long returned from hash128x64(byte[], int, int, int).
data
- The input byte array
offset
- The offset of data
length
- The length of array
seed
- The initial seed value
public static long[] hash128(byte[] data)
Generates 128-bit hash from the byte array with a default seed. This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    long[] hash = MurmurHash3.hash128(data, offset, data.length, seed);
Note: The sign extension bug in hash128(byte[], int, int, int) does not affect this result as the default seed is positive.
data
- The input byte array
See also: hash128(byte[], int, int, int)
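The note about the seed can be made concrete. Widening an `int` seed to `long` copies the sign bit into the upper 32 bits, while treating it as an unsigned 32-bit value does not; for a positive seed the two agree. A minimal sketch of that difference, with hypothetical names and no claim to match the library's code:

```java
final class SeedSignExtension {

    /** Widens the seed directly; a negative int fills the upper 32 bits with ones. */
    static long signExtended(int seed) {
        return seed;
    }

    /** Treats the seed as an unsigned 32-bit value. */
    static long unsigned(int seed) {
        return seed & 0xffffffffL;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(signExtended(-1)));     // ffffffffffffffff
        System.out.println(Long.toHexString(unsigned(-1)));         // ffffffff
        System.out.println(signExtended(104729) == unsigned(104729)); // true
    }
}
```

This is why the default seed of 104729 gives the same result in both the deprecated and the corrected methods, while a negative seed does not.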
public static long[] hash128x64(byte[] data)
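Generates 128-bit hash from the byte array with a seed of zero. This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 0;
    long[] hash = MurmurHash3.hash128x64(data, offset, data.length, seed);
data
- The input byte array
See also: hash128x64(byte[], int, int, int)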
@Deprecated public static long[] hash128(String data)
Deprecated. Use hash128x64(byte[]) using the bytes returned from String.getBytes(java.nio.charset.Charset).
Before 1.14 the string was converted using the default encoding. Since 1.14 the string is converted to bytes using UTF-8 encoding.
This is a helper method that will produce the same result as:
    int offset = 0;
    int seed = 104729;
    byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
    long[] hash = MurmurHash3.hash128(bytes, offset, bytes.length, seed);
Note: The sign extension bug in hash128(byte[], int, int, int) does not affect this result as the default seed is positive.
data
- The input String
See also: hash128(byte[], int, int, int)
@Deprecated public static long[] hash128(byte[] data, int offset, int length, int seed)
Deprecated. Use hash128x64(byte[], int, int, int). This corrects the seed initialization.
This is an implementation of the 128-bit hash function MurmurHash3_x64_128 from Austin Appleby's original MurmurHash3 C++ code in SMHasher.
This implementation contains a sign-extension bug in the seed initialization. This manifests if the seed is negative.
data
- The input byte array
offset
- The first element of array
length
- The length of array
seed
- The initial seed value
public static long[] hash128x64(byte[] data, int offset, int length, int seed)
Generates 128-bit hash from the byte array with the given offset, length and seed.
This is an implementation of the 128-bit hash function MurmurHash3_x64_128 from Austin Appleby's original MurmurHash3 C++ code in SMHasher.
data
- The input byte array
offset
- The first element of array
length
- The length of array
seed
- The initial seed value
Copyright © 2002–2020 The Apache Software Foundation. All rights reserved.