Hashing assigns each element of a database a short code, known as a hash value, that is later used to retrieve it. Because these codes are shorter than the original data, the items are easier to locate.
Hashing is also the source of data collisions, which can hurt query performance. MIT researchers recently investigated whether AI can deliver better hash functions. They found a way to accelerate data searches in huge databases.
The new hashing technique could be applied in many areas. It would be particularly useful for AI, computer graphics, bioinformatics, and compilers.
The model helps reduce data collisions by half
Standard hash functions assign codes essentially at random, so two pieces of data can end up with the same hash value. Such collisions surface when a user searches for an item: the lookup can lead to several items that share the same hash value. Searches are therefore slower and less efficient.
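To make the collision problem concrete, here is a minimal Python sketch. It is only an illustration, not the researchers' code: SHA-256 stands in for a conventional, random-looking hash function, and the names bucket_of and count_collisions are hypothetical.

```python
import hashlib


def bucket_of(key: int, num_buckets: int) -> int:
    """Map a key to a bucket with a conventional, random-looking hash.

    SHA-256 is an assumption chosen for illustration only.
    """
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def count_collisions(keys, num_buckets: int) -> int:
    """Count keys that land in a bucket already taken by an earlier key."""
    occupied = set()
    collisions = 0
    for key in keys:
        bucket = bucket_of(key, num_buckets)
        if bucket in occupied:
            collisions += 1
        else:
            occupied.add(bucket)
    return collisions


# 1,000 distinct keys hashed into 1,000 buckets.
keys = list(range(1000, 2000))
print(count_collisions(keys, num_buckets=1000))
```

Even with as many buckets as keys, a random-looking hash leaves a sizeable share of keys colliding, and every collision means extra work at lookup time.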
The new AI-based hash model can distribute data in a way that avoids many collisions. Collisions are cut in half, and the calculations are much more accurate. The model can reduce the collision rate between the keys of a dataset from 30% to 15%. The running time is also shortened by 30%.
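The idea of a learned hash can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the MIT model: it assumes the keys are spread roughly evenly over their range, fits a two-parameter linear estimate of their distribution, and uses that estimate to pick a bucket. The function names and the synthetic keys are hypothetical, and SHA-256 again stands in for a conventional hash.

```python
import hashlib
import random


def conventional_bucket(key: int, num_buckets: int) -> int:
    """Random-looking baseline hash (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def fit_linear_cdf(keys):
    """Learn a two-parameter model: CDF(key) ~ (key - lo) / span."""
    lo, hi = min(keys), max(keys)
    return lo, hi - lo + 1


def learned_bucket(key: int, model, num_buckets: int) -> int:
    """Place the key at its estimated relative position among all keys."""
    lo, span = model
    cdf = min(max((key - lo) / span, 0.0), 1.0)
    return min(int(cdf * num_buckets), num_buckets - 1)


def count_collisions(buckets) -> int:
    """Number of keys that share a bucket with an earlier key."""
    return len(buckets) - len(set(buckets))


# Synthetic, roughly evenly spaced keys (an assumption of this sketch).
rng = random.Random(0)
keys = [3 * i + rng.randrange(3) for i in range(1000)]
n = len(keys)

model = fit_linear_cdf(keys)
conventional = count_collisions([conventional_bucket(k, n) for k in keys])
learned = count_collisions([learned_bucket(k, model, n) for k in keys])
print("conventional:", conventional, "learned:", learned)
```

The learned variant only helps when the keys follow a pattern the model can capture; on data that does not match the assumed distribution it can do worse than the conventional hash.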
For example, this method could improve the storage and analysis of DNA, amino acid sequences and other biological data.
“What we discovered in this work is that in some situations we can find a better compromise between computing the hash function and the collisions we face. We can increase the computation time of the hash function a bit, but at the same time we can reduce collisions very significantly in certain situations.”
Ibrahim Sabek, postdoctoral fellow at MIT Data Systems
AI could further improve data processing performance
For this research, the scientists wanted to design hash functions for different data types. In addition, they plan to explore learned hashing for databases in which data can be inserted or deleted. When the data is updated, the model must be updated as well while maintaining its accuracy.
By studying the learned models, they found that the partial models (sub-models) have the greatest impact on reducing collisions. They now want to use the models they have learned to create hash functions for different data formats.
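The role of partial models can be illustrated with a small Python sketch. It is an assumption-laden toy, not the researchers' architecture: the sorted keys are split into equal-sized segments, each segment gets its own linear sub-model, and a key's bucket comes from the sub-model covering it. The names fit_submodels, learned_bucket and collisions are hypothetical, and the keys are assumed to be distinct.

```python
import bisect


def fit_submodels(keys, num_submodels: int):
    """Split the sorted keys into equal-count segments, one linear model each."""
    keys = sorted(keys)
    n = len(keys)
    lows, segments = [], []
    for s in range(num_submodels):
        start = s * n // num_submodels
        end = (s + 1) * n // num_submodels
        if start == end:
            continue  # more sub-models than keys: skip empty segments
        lo = keys[start]
        hi = keys[end] if end < n else keys[-1] + 1  # exclusive upper key
        lows.append(lo)
        segments.append((lo, hi, start, end))
    return lows, segments, n


def learned_bucket(key: int, model, num_buckets: int) -> int:
    """Route the key to its sub-model and interpolate its estimated rank."""
    lows, segments, n = model
    idx = max(bisect.bisect_right(lows, key) - 1, 0)
    lo, hi, start, end = segments[idx]
    frac = min(max((key - lo) / max(hi - lo, 1), 0.0), 1.0)
    rank = start + frac * (end - start)
    return min(int(rank * num_buckets / n), num_buckets - 1)


def collisions(keys, num_submodels: int) -> int:
    """Keys sharing a bucket with an earlier key, for a given model size."""
    model = fit_submodels(keys, num_submodels)
    buckets = [learned_bucket(k, model, len(keys)) for k in keys]
    return len(buckets) - len(set(buckets))


# Skewed synthetic keys: squares are dense at the low end, sparse at the top.
keys = [i * i for i in range(1, 1001)]
for count in (1, 8, 64):
    print(count, "sub-models ->", collisions(keys, count), "collisions")
```

On this skewed data a single linear model fits poorly, while several sub-models each cover a stretch that is close to linear. That is one plausible reading of why partial models matter, not a reproduction of the study's results.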
“We want to encourage the community to use machine learning in more fundamental data structures and operations. Any fundamental data structure gives us the opportunity to use machine learning to capture data properties and achieve better performance. There are still many things to discover.”
researcher