This is the age of big data. The term covers very large data sets, drawn from diverse sources and in diverse formats, that are too large for traditional relational database analysis. Advanced data analysis uses techniques like text analytics, predictive analytics, data mining, and natural language processing to analyze big data. Businesses, government agencies and researchers can use big data analysis to make better decisions. Its uses include financial compliance and identity resolution applications.
Big data defies traditional database analysis
Big data is here and it’s getting bigger by the minute. Its sources are mostly digital, such as social media, networks, sensors, devices, video/audio, logs, transactional applications, and the web. It is high volume, high velocity and high variety. Big data can be structured, semi-structured or unstructured, and files exist in different sizes.
About 1.7 MB of new digital information is created for every person every day. IDC reports that by 2020, the size of the digital universe will be more than 40 ZB. At present, only about 1% of the total data is analyzed, whether by companies, government agencies or researchers. New techniques of big data analysis can help governments, businesses and researchers make better choices, based on the vast amounts of digital data being created in real time.
Types of big data analysis
Given the high volume of big data and the velocity at which it is created in real time, traditional relational databases are unable to analyze it. Further, big data is derived from different sources and exists in different formats. New techniques for data analysis include:
- Text analytics
- Data mining
- Entity extraction
- Sentiment analysis
- Predictive analysis
- Natural language processing
These techniques can handle both structured and unstructured data, including free text. Big data analysis helps decision makers in business, government and research to make better choices. Big data has applications in fields like financial compliance, where banks and other financial institutions need to screen employees and customers.
Identity resolution applications of big data analysis
Identity resolution applications search data sets and databases to locate and analyze information regarding an identity in order to resolve or match that identity. The software applies a series of algorithms, using probability and scoring, to resolve and match identities. The three primary tasks in entity resolution are deduplication, record linkage, and canonicalization.
These applications can be used for a variety of purposes, including detecting identity theft and fraud. They can also be used to resolve and rationalize databases for customer data integration (CDI) and master data management (MDM).
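The three tasks named above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not how any particular identity resolution product works: the records, the field names, the 0.6/0.4 weights and the 0.8 threshold are all invented for illustration, and the string similarity comes from the standard library's difflib rather than a purpose-built matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical records; field names and values are invented for illustration.
RECORDS = [
    {"id": 1, "name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"id": 2, "name": "jonathan  smith", "email": "JON.SMITH@EXAMPLE.COM"},
    {"id": 3, "name": "Jon Smith", "email": "jon.smith@example.com"},
    {"id": 4, "name": "Maria Garcia", "email": "m.garcia@example.com"},
]


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(value.lower().split())


def match_score(a, b):
    """Score in [0, 1]: weighted mix of name similarity and exact email match.

    The 0.6 / 0.4 weights are assumptions chosen for this example.
    """
    name_sim = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    email_same = 1.0 if normalize(a["email"]) == normalize(b["email"]) else 0.0
    return 0.6 * name_sim + 0.4 * email_same


def resolve(records, threshold=0.8):
    """Record linkage and deduplication: add each record to the first cluster
    whose first member it matches at or above the threshold, else start a
    new cluster."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if match_score(rec, cluster[0]) >= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


def canonicalize(cluster):
    """Canonicalization: keep one representative per cluster, here the record
    with the longest normalized name."""
    best = max(cluster, key=lambda r: len(normalize(r["name"])))
    return {
        "name": normalize(best["name"]).title(),
        "email": normalize(best["email"]),
        "source_ids": [r["id"] for r in cluster],
    }


if __name__ == "__main__":
    for cluster in resolve(RECORDS):
        print(canonicalize(cluster))
```

With the sample data, the three variants of the same person fall into one cluster and the unrelated record stays on its own. Real systems typically compare many more attributes and avoid comparing every record against every cluster, but the overall shape of normalizing, scoring, linking and then choosing a canonical record is the same.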
Big data is high-volume, high-velocity digital information created in real time. It exists in different formats and is derived from different sources. This makes it impossible for traditional relational databases to analyze big data, so new techniques are needed. Big data analysis has numerous uses, including financial compliance and identity resolution applications.