
Data Mining

Data mining has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. It should not be confused with bioinformatics, which focuses more on sequence-based extraction of specific patterns or motifs and on specific pattern matching. Data mining complements and expands bioinformatics. For the time being the two are distinct, although they may eventually merge and become indistinguishable.

Although data mining is practised in biotechnology, across different branches of the life sciences, health care and agriculture, its major use is in other sectors such as marketing, manufacturing, database providers, government, the travel industry, banking and financial institutions, telecommunications and engineering. In all these areas massive amounts of information are available, and a variety of software tools are used to maximize its usefulness, some of which were discussed earlier in this chapter.

The biopharmaceutical industry, among others, employs a variety of data mining methods because its databases are flooded with information such as the following: (i) annotated databases of disease profiles; (ii) molecular pathways involved in disease; (iii) structure-activity relationships (SAR); (iv) chemical structures of combinatorial libraries of compounds; (v) results of clinical trials. Data mining helps the industry to use this information fruitfully.

Because a large amount of data is generated in the biopharmaceutical industry, deciding which targets and lead compounds to develop further is a major challenge. Data mining helps to make sense of these complex data sets in an intuitive and efficient manner. Many companies provide data mining services for a variety of purposes. There are six major approaches to data mining applications:
(i) Influence-based mining: searches for cause-and-effect relationships between data sets, as in pharmacogenomics.
(ii) Affinity-based mining: the data mining system identifies data points or sets that tend to be grouped together; when using this approach it is important to distinguish "accidental/incidental" motifs from those of biological significance.
(iii) Time-delay data mining: looks for patterns that are confirmed or rejected as the data set grows over time.
(iv) Trends-based data mining: studies the changes that occur in specific data sets over time (trends are examined).
(v) Comparative data mining: data collected at different sites over different time periods are compared to detect dissimilarities.
(vi) Predictive data mining: complements and expands traditional bioinformatics.
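As a rough illustration of the affinity-based approach, the following minimal Python sketch counts how often pairs of items occur together across records and keeps the pairs that reach a chosen support threshold. The function names (`co_occurrence_counts`, `frequent_pairs`), the `min_support` parameter and the toy motif records are hypothetical and chosen only for illustration; they do not come from any particular data mining package.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(records):
    """Count how often each unordered pair of items appears in the same record."""
    counts = Counter()
    for record in records:
        # Deduplicate and sort so each pair is counted once per record
        # and always appears in the same order.
        for pair in combinations(sorted(set(record)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(records, min_support):
    """Return pairs whose support (fraction of records containing both) >= min_support."""
    n = len(records)
    if n == 0:
        return {}
    counts = co_occurrence_counts(records)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: each record lists the motifs observed in one sequence.
records = [
    ["motifA", "motifB", "motifC"],
    ["motifA", "motifB"],
    ["motifB", "motifC"],
    ["motifA", "motifB", "motifD"],
]

result = frequent_pairs(records, min_support=0.5)
for pair, support in sorted(result.items()):
    print(pair, support)
```

A support threshold alone only shows that items tend to be grouped together. Deciding whether a grouping is accidental or biologically significant would need a further step, such as a statistical test against a background model, which this sketch does not attempt.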

 
