An improved indexing method for querying big XML files

The exponential growth of bioinformatics in the healthcare domain has revolutionized our understanding of DNA, proteins, and other biomolecular entities. This remarkable progress has generated an overwhelming volume of data, necessitating big data technologies for efficient storage and indexing. While big data technologies like Hadoop offer substantial support for big XML file storage, the challenges of indexing data sizes and XPath query performance persist.